
Nathan Lambert

@natolambert.bsky.social

An LLN - large language Nathan - (RL, RLHF, society, robotics), athlete, yogi, chef. Writes http://interconnects.ai. At Ai2 via HuggingFace, Berkeley, and normal places.

13,538 Followers  |  276 Following  |  1,878 Posts  |  Joined: 30.04.2023

Latest posts by natolambert.bsky.social on Bluesky

Post image

First time at CMU

13.02.2026 15:35 — 👍 7    🔁 0    💬 2    📌 0

Fun to set up real analytics and learn that my RLHF Book PDF is downloaded 50-100 times a day from my site (doesn't include arXiv downloads/views).

Thanks for reading!

12.02.2026 14:51 — 👍 26    🔁 2    💬 0    📌 0

Codex app is nice.
I'm just a few minutes in and think it'll make some of the crazy things I was doing way easier to monitor.

11.02.2026 23:37 — 👍 4    🔁 0    💬 2    📌 0
Post image

Poll. Do you see the famous METR plot holding true on Jan. 1st of 2027 (~20 hours), or 2028 (~50 hours)?

What would be the right way to measure tasks of that scope?

11.02.2026 17:12 — 👍 12    🔁 1    💬 4    📌 0
Post image

Beautiful RL scaling plot from Cursor.
cursor.com/blog/compose...

10.02.2026 00:26 — 👍 16    🔁 0    💬 0    📌 0

TL;DR: Codex is a very useful coding tool, Claude is the first agent.

09.02.2026 15:40 — 👍 26    🔁 1    💬 0    📌 0
Opus 4.6, Codex 5.3, and the post-benchmark era: On comparing models in 2026.

I spent a long time testing the new Opus 4.6 and Codex 5.3 models, but the most striking thing was how many people are reacting to model releases in ways that don't match how we now use models. In my post-benchmark era.

Claude is still king, but Codex is closer than ever.
www.interconnects.ai/p/opus-46-vs...

09.02.2026 15:21 — 👍 40    🔁 4    💬 1    📌 2

People don't want to accept that the most-used open model families in 2026 are:

Overall:
1. Qwen
2. Llama
3. GPT-OSS

Big models:
1. DeepSeek
2. GPT-OSS/Qwen/everyone else

Llama's inertia says a lot about how the ecosystem works.

08.02.2026 17:45 — 👍 37    🔁 5    💬 3    📌 0

I want there to be a nanoGPT-style speedrunning setup for RL.

06.02.2026 19:29 — 👍 29    🔁 0    💬 2    📌 1

The best compliment I can give OpenAI's Codex 5.3 is that it feels way more like Claude Code.

06.02.2026 18:07 — 👍 54    🔁 2    💬 1    📌 1

GPT Codex 5.3 sounds like a much bigger change than Claude Opus 4.6; I'll be curious whether this holds up in real testing.

05.02.2026 18:31 — 👍 35    🔁 0    💬 1    📌 1

“Due to GPT‑5.3-Codex being so different from its predecessors, the data from alpha testing exhibited numerous unusual and counter-intuitive results”

Sounds worth giving a go. Big changes are good.

05.02.2026 18:16 — 👍 16    🔁 1    💬 1    📌 0
Post image

Reward models (RMs) are supposed to represent human values. But RMs are NOT blank slates – they inherit measurable biases from their base models that stubbornly persist through preference training. #ICLR2026 🧵

04.02.2026 16:30 — 👍 18    🔁 7    💬 1    📌 1

Ending your day at >99% Claude rate limit usage but not maxing out feels like a masterpiece.

05.02.2026 03:32 — 👍 47    🔁 0    💬 1    📌 0
Why Nvidia builds open models with Bryan Catanzaro: Interconnects interview #17 on the past, present, and future of the Nemotron project.

Transcript etc: www.interconnects.ai/p/why-nvidia...

04.02.2026 18:05 — 👍 3    🔁 0    💬 0    📌 0
Why NVIDIA builds their own open models | Nemotron w/ Bryan Catanzaro
NVIDIA releasing their best models as open weights isn't charity — it's a business decision. And honestly, it's one of the clearest explanations I've heard for why a company would invest heavily in…

Nvidia’s Nemotron is the closest thing the U.S. has to a Qwen approach to open models, but most people don’t know it yet.
I’m very bullish on Nvidia’s open model efforts in 2026.
Interconnects interview #17 on the past, present, and future of the Nemotron project.
www.youtube.com/watch?v=Y3Vb...

04.02.2026 18:05 — 👍 35    🔁 3    💬 1    📌 2
Post image

Qwen already dropping models for CNY

03.02.2026 17:48 — 👍 77    🔁 4    💬 5    📌 0

Gemini not being in the conversation at all with Claude Code and Codex is the real “code red” emergency.

03.02.2026 15:23 — 👍 46    🔁 0    💬 12    📌 0

It's documented! I did a full memory sweep. The training becomes FLOP-limited before memory is saturated.

02.02.2026 20:36 — 👍 0    🔁 0    💬 0    📌 0

Latest open artifacts (#18): Arcee's 400B MoE, LiquidAI's underrated 1B model, new Kimi, and anticipation of a busy month
Tons of useful "niche" models and anticipation of big releases coming soon.
www.interconnects.ai/p/latest-ope...

02.02.2026 15:23 — 👍 16    🔁 2    💬 0    📌 1
Post image

Despite HuggingFace being banned in China, Chinese users (likely via VPNs) are its top user group. They definitely have the most people *building* open models.

01.02.2026 17:07 — 👍 28    🔁 5    💬 2    📌 0
Add direct alignment algorithms (DPO, IPO, SimPO, ORPO, KTO) by natolambert · Pull Request #226 · natolambert/rlhf-book
Summary: Implements educational direct alignment algorithms for Chapter 12. 6 algorithms: DPO, cDPO, IPO, SimPO, ORPO, KTO. Default model: allenai/OLMo-2-0425-1B-SFT. Default dataset: argilla/ultrafee...

github.com/natolambert/...

01.02.2026 15:41 — 👍 5    🔁 1    💬 0    📌 0
Post image

Claude Code for writing, Codex for code review, and GPT Pro for planning made a working DPO (and related algorithms) repository from scratch for my RLHF book, and the curves are looking right.

On the DGX Spark, finetuning OLMo 2 1B SFT. Built by referencing the original repositories + TRL.
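(For concreteness, here is a minimal sketch of the standard DPO loss from Rafailov et al., not the code from this repository or from TRL; the tensor names and the beta default are illustrative assumptions.)

```python
# Minimal sketch of the DPO objective, assuming summed per-response log-probs
# are already computed for the policy and a frozen reference (SFT) model.
import torch
import torch.nn.functional as F

def dpo_loss(
    policy_chosen_logps: torch.Tensor,    # log pi_theta(y_chosen | x)
    policy_rejected_logps: torch.Tensor,  # log pi_theta(y_rejected | x)
    ref_chosen_logps: torch.Tensor,       # log pi_ref(y_chosen | x)
    ref_rejected_logps: torch.Tensor,     # log pi_ref(y_rejected | x)
    beta: float = 0.1,                    # strength of the implicit KL constraint (illustrative default)
) -> torch.Tensor:
    # Implicit rewards are the log-ratio of policy to reference probabilities.
    chosen_rewards = policy_chosen_logps - ref_chosen_logps
    rejected_rewards = policy_rejected_logps - ref_rejected_logps
    # Standard DPO loss: -log sigmoid(beta * (reward_chosen - reward_rejected)).
    return -F.logsigmoid(beta * (chosen_rewards - rejected_rewards)).mean()
```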

01.02.2026 15:41 — 👍 28    🔁 0    💬 2    📌 0
State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI | Lex Fridman Podcast #490
YouTube video by Lex Fridman

Recorded a podcast, think it’s pretty good and comprehensive, hope you like it ;) youtu.be/EV7WhVT270Q?...

31.01.2026 23:06 — 👍 38    🔁 4    💬 1    📌 1

I'm visiting CMU for a talk at the Language Technologies Institute on Feb. 12/13th. Looking forward to chatting with folks about frontiers in RL and building agentic language models.

Email me with "CMU Visit" in the subject if you're interested in chatting & why!

31.01.2026 20:03 — 👍 15    🔁 2    💬 1    📌 0

More people should think about future AIs as part of the audience for their writing (or work).

31.01.2026 16:40 — 👍 40    🔁 2    💬 4    📌 4
Thoughts on the hiring market in the age of LLMs: On standing out and finding gems.

My raw thoughts on the job market -- both for those hiring and those searching -- at the cutting edge of AI.
On standing out and finding gems.
www.interconnects.ai/p/thoughts-o...

30.01.2026 15:52 — 👍 40    🔁 1    💬 3    📌 3

0-1%

28.01.2026 16:13 — 👍 2    🔁 0    💬 1    📌 0

More at atomproject.ai

28.01.2026 15:30 — 👍 3    🔁 0    💬 0    📌 0

The method for this is taking the top 10 open models in terms of total tokens processed on the OpenRouter platform and then normalizing to a share of 100%.

This assumes the top 10 models account for most of the usage, which is often true, and adds some noise by dropping the long tail.
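A minimal sketch of that calculation, assuming per-model token totals have already been read off OpenRouter's public rankings (the model names and counts below are placeholders, not real numbers):

```python
# Normalize the top 10 open models' OpenRouter token totals to shares summing to 100%.
# The entries below are placeholder values for illustration only.
tokens_by_model = {
    "qwen/qwen3-235b-a22b": 9.1e9,
    "deepseek/deepseek-v3.1": 8.4e9,
    "meta-llama/llama-3.3-70b-instruct": 5.2e9,
    "openai/gpt-oss-120b": 4.7e9,
    # ... the rest of the open models on the rankings page would go here
}

# Keep only the 10 highest-volume models, then renormalize so their shares
# sum to 100% (the long tail is dropped, which is where the noise comes from).
top10 = sorted(tokens_by_model.items(), key=lambda kv: kv[1], reverse=True)[:10]
total_tokens = sum(count for _, count in top10)
shares = {model: 100.0 * count / total_tokens for model, count in top10}

for model, share in shares.items():
    print(f"{model}: {share:.1f}%")
```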

28.01.2026 15:30 — 👍 2    🔁 0    💬 1    📌 0
