We benchmarked several open-weight Chinese models on FrontierMath. Their top scores on Tiers 1-3 lag the overall frontier by about seven months.
22.12.2025 18:57
#IPAM (the Institute for Pure and Applied Mathematics) is facing a critical shortfall in operating funds due to an unexpected suspension of NSF funding: www.ipam.ucla.edu/news/nsf-fun... Donations for emergency continuity-of-operations funding can be made at
giving.ucla.edu/Campaign/Donat
Big time! We can finally do simple LLM RL fine-tuning with rewards and leverage offline/off-policy data!
You want rewards, but GRPO only works online?
You want offline, but DPO is limited to preferences?
QRPO can do both!
🧵 Here's how we do it:
We present an efficient, performant method that offers the best of both worlds in a new architecture. We show that our approach scales better than SOTA transformers with self-attention.
Incredible execution and attention to detail by @xiuyingwei.bsky.social!
Thrilled to announce that our work “Fleet of Agents” has been accepted at @icmlconf.bsky.social. On average, FoA boosts quality by ~5% while reducing costs to ~40% of SOTA baselines. Blog post after the NeurIPS deadline ;)
Until then:
Paper: arxiv.org/abs/2405.066...
Code: github.com/au-clan/FoA
Many thanks to all amazing collaborators that contributed to this project - Amin Mansouri, @lars-quaedvlieg.bsky.social , Amal Seddas, Maryna Viazovska, Emmanuel Abbe, @caglarai.bsky.social
12/12
Excited to share our latest work on EvoTune, a novel method integrating LLM-guided evolutionary search and reinforcement learning to accelerate the discovery of algorithms! 1/12 🧵
26.04.2025 16:56
This fantastic work is mainly due to incredibly hard-working students like @anjasurina.bsky.social, @lars-quaedvlieg.bsky.social, and Amin Mansouri. Also, this was my first paper with a Fields medalist, Maryna Viazovska, and the one and only Emmanuel Abbe.
26.04.2025 17:02
Woohoo 🥳 Thrilled to announce this paper 📢. We have shown that it is possible to significantly improve the FunSearch method with RL and achieve impressive algorithmic discoveries on challenging NP-complete combinatorial optimization tasks like TSP and bin-packing.
26.04.2025 17:02
🚨🚨 24 more hours to register your abstracts for the @grades-nda.bsky.social workshop @sigmod2025.bsky.social
Papers due March 30th 23:59 AoE
@sdumbrava.bsky.social @olafhartig.bsky.social @csaudk.bsky.social
I am recruiting 2 PhD students for Fall'25 @csaudk.bsky.social to work on bleeding-edge topics in #NLProc #LLMs #AIAgents (e.g. LLM reasoning, knowledge-seeking agents, and more).
Details: www.cs.au.dk/~clan/openings
Deadline: May 1, 2025
Please boost!
cc: @aicentre.dk @wikiresearch.bsky.social
Ahahaha it is still very clear in my mind, as if it were yesterday, and it was definitely soju, because after you talked with the owner, several plastic bottles of soju showed up on our table. Though I don't remember the rest of the night.
09.03.2025 21:24
If it turns out LLMs are only capable of recombinatory innovation (finding novel connections among existing knowledge), that would still be very useful. Most innovation is recombination and one of the big issues in science is that fields are too vast for scientists to bridge them to find connections.
09.03.2025 18:25
Amazing, could become the next hit. I discovered @kyunghyuncho.bsky.social's amazing singing skills when I first went to karaoke with him in 2012.
08.03.2025 18:09
www.youtube.com/watch?v=9_Pe... An interview with Rich. The humility of Rich is truly inspiring: "There are no authorities in science". I wish people would listen and live by this.
06.03.2025 20:50
stay tuned for more proper, detailed and exciting coverage of this preprint, but whoa i'm so proud of the team @prescientdesign.bsky.social and our achievements on <Lab-in-the-loop therapeutic antibody design with deep learning>!
25.02.2025 18:09
And I am an ally. If you are too, let the world know.
22.02.2025 22:14
I have been using the Glove80 kb for the past week because of my RSI, which has improved significantly since then. But I am still baffled by how hard it is to get used to a new kb layout. Oddly, although I type perfectly fine on it now, I can't enter my passwords with it because they are stored in my muscle memory.
18.02.2025 20:47
Do large language models develop "emergent" models of the world? My latest Substack posts explore this claim and more generally the nature of "world models":
LLMs and World Models, Part 1: aiguide.substack.com/p/llms-and-w...
LLMs and World Models, Part 2: aiguide.substack.com/p/llms-and-w...
Trajan's (@starlord37.bsky.social) story & countless others like it in the face of these cuts and wild shifts in government fellowships intended for our brightest & most promising students will have long-term, deeply damaging effects on the U.S.'s competitiveness in science, math, CS and more.
09.02.2025 04:47
A great talk on the history and design decisions in Google's TPUs by my longtime colleague Norm Jouppi, winner of the 2024 Seymour Cray Computer Engineering award.
Talk: www.youtube.com/watch?v=a-1x...
Award announcement: www.computer.org/publications...
We've been thrilled by the positive reception to Gemini 2.0 Flash Thinking we discussed in December.
Today we're sharing an experimental update w/improved performance on math, science, and multimodal reasoning benchmarks:
• AIME: 73.3%
• GPQA: 74.2%
• MMMU: 75.4%
Google's Titans: a new architecture with attention and a meta in-context memory that learns how to memorize at test time, as presented by one of the authors, @alibehrouz.bsky.social
13.01.2025 19:53
Also, check out our ML project template: it's a game-changer!
@caglarai.bsky.social
🧑‍💻 github.com/CLAIRE-Labo/...
Ever been puzzled by your PPO agent collapsing out of nowhere? 🤯 Come check out our poster tomorrow!
Wed 11 Dec 11 am - 2 pm PST
West Ballroom A-D #6403
@caglarai.bsky.social @andreamiele.bsky.social @razvan-pascanu.bsky.social
Looking forward to seeing many of you during the conference!
09.12.2024 23:03
2. No Representation, No Trust: Connecting Representation, Collapse, and Trust Issues in PPO, West Ballroom A-D #6403
Both are on 11 Dec, 2pm-5pm EST. I am co-organizing the Pluralistic Alignment workshop on the 14th Dec with a fantastic lineup of speakers: pluralistic-alignment.github.io (2/3)
I am in Vancouver for NeurIPS 2024 until December 16th if you want to meet, DM or email me.
We have two accepted papers from my lab:
1. Building on Efficient Foundations: Effective Training of LLMs with Structured Feedforward Layers, on Wednesday, East Exhibit Hall A-C #2010 (1/3)
phys.org/news/2024-11... #AI #artificialintelligence
16.11.2024 08:17