Cong Lu

@cong-ml.bsky.social

Research Scientist @ Google DeepMind, in open-ended learning, and AI for Scientific Discovery.

933 Followers  |  463 Following  |  11 Posts  |  Joined: 19.11.2024  |  1.7739

Latest posts by cong-ml.bsky.social on Bluesky


Are you interested in Open-Endedness and AI for Science? 🧪

I'm hiring a Student Researcher at Google DeepMind for a 6-month role. Join us to work on building agents capable of novel scientific discoveries! 🔬

Reach out if this sounds like you, and apply here 👇

docs.google.com/forms/d/e/1F...

11.11.2025 11:47 - 👍 4    🔁 0    💬 1    📌 0

StochasTok: Improving Fine-Grained Subword Understanding in LLMs
Subword-level understanding is integral to numerous tasks, including understanding multi-digit numbers, spelling mistakes, abbreviations, rhyming, and wordplay. Despite this, current large language mo...

📄 Paper: arxiv.org/abs/2506.01687
💻 Code: github.com/anyasims/sto...
A massive 🙏 to my incredible co-authors: Anya Sims, Thom Foster, @klarakaleb.bsky.social, Tuan-Duy H. Nguyen, Joseph Lee, @jfoerst.bsky.social, @yeewhye.bsky.social!

[8/8]

11.06.2025 12:09 - 👍 1    🔁 0    💬 0    📌 0

The significant gains from this minimal change are super exciting, and we see huge potential for larger models and more complex tasks like coding, scientific reasoning, and beyond! We invite you to explore the paper and code!

[7/]

11.06.2025 12:09 - 👍 0    🔁 0    💬 1    📌 0

More major advantages! 🌟

COST-EFFECTIVE: StochasTok allows enhanced subword skills to be seamlessly 'retrofitted' into existing pretrained models - thus avoiding costly pretraining!
ENHANCED ROBUSTNESS: Improves resilience to alternative tokenizations! (see examples)

[6/]

11.06.2025 12:09 - 👍 0    🔁 0    💬 1    📌 0

Empirically, we find:
LANGUAGE: As hoped, StochasTok unlocks language manipulation ability! (see task examples below)
MATH: Furthermore, StochasTok dramatically changes multi-digit addition, enabling grokking and even generalization to UNSEEN TOKENIZERS! 🤯

[5/]

11.06.2025 12:09 - 👍 0    🔁 0    💬 1    📌 0

Practically, StochasTok is:
✅ Computationally lightweight 🪶
✅ A simple dataset preprocessing step - No training loop or inference time changes required! 🛠️
✅ Compatible with ANY base tokenizer - Allows us to retrofit pretrained models! 💰
✅ Robust to hyperparameter choice! 🔥

[4/]

11.06.2025 12:09 - 👍 1    🔁 0    💬 1    📌 0

The underlying StochasTok algorithm is extremely simple!

1️⃣ Simply tokenize text with ANY base tokenizer,
2️⃣ Then, stochastically split some of those tokens into equivalent token pairs.

That’s basically it! Repeat step 2 for the desired granularity.

[3/]
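
The two steps above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the toy vocabulary `VOCAB`, the function names `possible_splits` and `stochastic_split`, the per-token split probability `p`, and the `rounds` parameter are all assumptions made for the example. The input list `base` stands in for the output of whatever base tokenizer is used in step 1.

```python
import random


def possible_splits(token, vocab):
    """Return every way to write `token` as two pieces that are both in `vocab`."""
    return [
        (token[:i], token[i:])
        for i in range(1, len(token))
        if token[:i] in vocab and token[i:] in vocab
    ]


def stochastic_split(tokens, vocab, p=0.5, rounds=1, rng=None):
    """Randomly replace tokens with an equivalent pair of in-vocabulary tokens.

    Assumption for this sketch: each splittable token is split independently
    with probability `p`, and the pass is repeated `rounds` times.
    """
    if rng is None:
        rng = random.Random()
    for _ in range(rounds):
        out = []
        for tok in tokens:
            splits = possible_splits(tok, vocab)
            if splits and rng.random() < p:
                out.extend(rng.choice(splits))  # swap in an equivalent pair
            else:
                out.append(tok)  # keep the token unchanged
        tokens = out
    return tokens


# Hypothetical toy vocabulary; a real tokenizer's vocabulary would be used instead.
VOCAB = {"book", "cook", "bo", "co", "ok", "b", "c", "o", "k"}

base = ["book", "cook"]  # stand-in for step 1 (base tokenization)
once = stochastic_split(base, VOCAB, p=1.0, rounds=1)
twice = stochastic_split(base, VOCAB, p=1.0, rounds=2)
print(once)   # ['bo', 'ok', 'co', 'ok']
print(twice)  # ['b', 'o', 'o', 'k', 'c', 'o', 'o', 'k']
```

Because every split replaces a token with pieces that concatenate back to it, the underlying text is unchanged; only the token sequence the model sees differs. Running more rounds gives finer-grained sequences, which is one way to read "repeat step 2 for the desired granularity".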

11.06.2025 12:09 - 👍 0    🔁 0    💬 1    📌 0

🤔 The problem: Standard tokenization gives distinct token IDs for each token - making it unnecessarily hard to learn, e.g., ‘book’=3092 and ‘cook’=171691 differ by a single letter.

🎉 The solution: Allow LLMs to naturally 'see inside' tokens via alternative tokenizations!

[2/]

11.06.2025 12:09 - 👍 0    🔁 0    💬 1    📌 0

🚀 Introducing “StochasTok: Improving Fine-Grained Subword Understanding in LLMs”! 🚀

LLMs are incredible but still struggle disproportionately with subword tasks, e.g., for character counts, wordplay, multi-digit numbers, fixing typos… Enter StochasTok, led by Anya Sims!

[1/]

11.06.2025 12:09 - 👍 4    🔁 2    💬 1    📌 1

It was an honor to be on Quirks and Quarks (the CBC science show) with @cong-ml.bsky.social talking about The AI Scientist and the impact of AI on science.

Science is being transformed by the AI revolution
cbc.ca/listen/live-...

14.02.2025 22:26 - 👍 8    🔁 2    💬 1    📌 0

Introducing Automated Capability Discovery!

ACD automatically identifies surprising new capabilities and failure modes in foundation models, via "self-exploration" (models exploring their own abilities).

Led by @cong-ml.bsky.social & @shengranhu.bsky.social
🔬🤖🧠🔎 [1/9]

12.02.2025 06:59 - 👍 19    🔁 3    💬 1    📌 0

It's an honor that The AI Scientist is #1 on this list!

www.linkedin.com/feed/update/...

Congrats @chris-lu.bsky.social @cong-ml.bsky.social @RobertTLange @hardmaru.bsky.social @jfoerst.bsky.social

08.01.2025 18:50 - 👍 23    🔁 3    💬 0    📌 0

Lots of interest in ADAS! Thanks everyone, and congrats Shengran Hu and @cong-ml.bsky.social! 🚀🚀🚀

16.12.2024 18:19 - 👍 10    🔁 3    💬 0    📌 0

Honored to receive this award for ADAS!!

16.12.2024 21:33 - 👍 5    🔁 0    💬 0    📌 0

Our in-progress work Quality-Diversity Self-Play (w/ @cong-ml.bsky.social and @jeffclune.com) will have a poster presentation at #NeurIPS2024 workshops (@IMOLNeurIPS2024 Sunday West meeting room 217 - 219 and OpenworldAgents Sunday East Meeting Room 1-3, Foyer). Please come visit us!

14.12.2024 18:59 - 👍 9    🔁 1    💬 0    📌 1

Our work Automated Design of Agentic Systems (w/ Shengran Hu & @cong-ml.bsky.social) will have ✨two orals✨ @ #NeurIPS2024 workshops (LanGame Sat 10:20, OWA Sun 4:50). Please come visit us 😃

We would also love to chat about open-endedness, LLM agents, etc. Come by if you want to meet!

10.12.2024 21:49 - 👍 12    🔁 2    💬 0    📌 0

Interested in robust model-based offline RL algorithms? Come check out Anya Sims presenting our new paper investigating the edge of reach problem in offline MBRL!

📍 East Exhibit Hall A-C #4603

#NeurIPS2024

12.12.2024 00:34 - 👍 1    🔁 0    💬 0    📌 0

A new golden age of discovery
In this essay, we take a tour of how AI is transforming scientific disciplines from genomics to computer science to weather forecasting. Some scientists are training their own AI models, while...

A great new essay on AI for Science from our colleagues here:

deepmind.google/public-polic...

26.11.2024 13:35 - 👍 22    🔁 5    💬 0    📌 1

The RL (and some non-RL folks) starter pack is almost full. Pretty clear that the academic move here has succeeded
go.bsky.app/3WPHcHg

18.11.2024 20:30 - 👍 104    🔁 32    💬 12    📌 3

Now that @jeffclune.bsky.social and @joelbot3000.bsky.social are here, time for an Open-Endedness starter pack.

go.bsky.app/MdVxrtD

20.11.2024 07:08 - 👍 105    🔁 32    💬 16    📌 5
