Allen Nie

@allenanie.bsky.social

Stanford CS PhD working on RL and LLMs with Emma Brunskill and Chris Piech. Co-creator of Trace. Prev @GoogleDeepMind @MicrosoftResearch. Specifically: Offline RL, In-context RL, Causality. https://anie.me/about Unverified hot takes go to this account.

2,334 Followers  |  445 Following  |  33 Posts  |  Joined: 14.11.2024  |  1.9134

Latest posts by allenanie.bsky.social on Bluesky

Check out Tianwei's latest work on using an unlikelihood objective to distill search traces back into the base model to boost the reasoning capabilities of LLMs!

23.04.2025 23:12 - 👍 2    🔁 0    💬 0    📌 0

For all the RL PhDs and people interested in Planning and MDPs, there's a summer internship opportunity at AWS Science that specializes in LLM post-training, RLHF, LLM agents, and benchmarks like WebArena. Interested students can send their CV to fakoor@amazon.com

07.02.2025 19:52 - 👍 3    🔁 0    💬 0    📌 0

For education and psychometrics people, this dataset is very useful!

11.12.2024 07:52 - 👍 1    🔁 0    💬 0    📌 0

I credit Omar @lateinteraction.bsky.social for this beautiful summary of the difference 🤣

11.12.2024 02:14 - 👍 1    🔁 0    💬 0    📌 0

Hi Tim, Trace can optimize the control flow, whereas DSPy optimizes the modules in a fixed control flow (for now) 🙂 I would use DSPy for a supervised learning setup and Trace for an RL-like task (when there's a clear definition of reward and feedback).

11.12.2024 02:13 - 👍 1    🔁 0    💬 1    📌 0

Trace performs inference-time optimization: it does not directly update the weights of the underlying neural network. It updates the agentic workflow (Python functions, prompts to LLMs, etc.).

11.12.2024 00:53 - 👍 0    🔁 0    💬 1    📌 0

People say Ching-an and I are indistinguishable...is that true 🤣

10.12.2024 23:15 - 👍 1    🔁 0    💬 0    📌 0

Come check us out near the Tesla Booth in West Exhibition Hall A, 3-5pm! Come and claim your mug 🤣 We have an identity crisis: people keep thinking we are from IBM for some reason...

10.12.2024 23:05 - 👍 3    🔁 0    💬 0    📌 0

We are happy to give a talk or have a 1:1 chat if you are interested in learning what Trace is and/or how to use it! Trace has already been presented at the UW Robotics Colloquium and ServiceNow. #foundermode for Open-Source Software! Time to build 🔧 and ship 🚀!

10.12.2024 19:52 - 👍 2    🔁 0    💬 0    📌 0
GitHub - microsoft/Trace: End-to-end Generative Optimization for AI Agents

This open-source project is a joint effort with @chinganc_rl and Adith, the MSR RL group. We are presenting Trace at the NeurIPS Expo Demo this afternoon 3pm-5pm PT. We have MUGs, T-SHIRTs, and STICKERs!

🌐 microsoft.github.io/Trace/
👨‍💻 github.com/microsoft/Tr...

10.12.2024 19:52 - 👍 0    🔁 0    💬 1    📌 0

Once you build an agent with Trace, you can use ANY LLM optimizer you want. With the release of Trace 0.1.3, we introduce TextGrad (github.com/microsoft/Tr...) as an optimizer for the RL agent, along with OPRO and OptoPrime.

10.12.2024 19:52 - 👍 0    🔁 0    💬 1    📌 0

What enables Trace to be an RL-style agentic library? We use **Generative Optimization** techniques (LLM as an optimizer) to derive an analog to RL's policy gradient algorithm. The agent makes a move, receives feedback/reward, and updates its parameters.

10.12.2024 19:52 - 👍 0    🔁 0    💬 1    📌 0
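
The loop described in the post above (the agent acts, receives feedback, and updates its parameters) can be sketched in plain Python. This is a toy illustration only and does not use Trace's API: `Agent`, `environment_feedback`, `toy_optimizer_step`, and `run` are hypothetical names, and the rule-based update stands in for the LLM call that would read the feedback and propose a new parameter value.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Agent:
    # The "parameter" is an ordinary Python value, not a neural network weight.
    guess: int = 50

    def act(self) -> int:
        return self.guess


def environment_feedback(action: int, target: int) -> str:
    """Return textual feedback, as a user or environment might."""
    if action < target:
        return "too low"
    if action > target:
        return "too high"
    return "correct"


def toy_optimizer_step(agent: Agent, feedback: str, low: int, high: int) -> Tuple[int, int]:
    """Hypothetical stand-in for an LLM optimizer: read the feedback and
    propose a new parameter value (here, by narrowing a search interval)."""
    if feedback == "too low":
        low = agent.guess + 1
    elif feedback == "too high":
        high = agent.guess - 1
    agent.guess = (low + high) // 2
    return low, high


def run(target: int, max_steps: int = 20) -> Tuple[int, int]:
    """Act, get feedback, update; return (final action, steps used)."""
    agent = Agent()
    low, high = 0, 100
    for step in range(1, max_steps + 1):
        action = agent.act()
        feedback = environment_feedback(action, target)
        if feedback == "correct":
            return action, step
        low, high = toy_optimizer_step(agent, feedback, low, high)
    return agent.act(), max_steps


if __name__ == "__main__":
    print(run(73))
```

The point of the sketch is the shape of the loop: the update is driven by feedback on the agent's own behavior rather than by a fixed workflow, and what changes is the agent's parameter, not model weights.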

In Trace, you define an Agent with declarative Python functions using Trace primitives. Trace provides flexible ways to mark what you want to change -- for example, we mark two prompts and two functions below as trainable.

10.12.2024 19:52 - 👍 0    🔁 0    💬 1    📌 0
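
The image attached to the post above is not reproduced here. As a rough sketch of the idea of marking two prompts and two functions as trainable, the plain-Python example below uses hypothetical names throughout (`Param`, `mark_trainable`, `ToyAgent`); it is an assumption about the general pattern, not Trace's actual primitives or syntax.

```python
from dataclasses import dataclass
from typing import List

# Names of functions an optimizer would be allowed to rewrite (hypothetical registry).
TRAINABLE_FUNCTIONS: List[str] = []


@dataclass
class Param:
    """A named value; `trainable` says whether an optimizer may change it."""
    name: str
    value: str
    trainable: bool = False


def mark_trainable(fn):
    """Hypothetical decorator: record that this function's code may be optimized."""
    TRAINABLE_FUNCTIONS.append(fn.__name__)
    return fn


class ToyAgent:
    def __init__(self) -> None:
        # Two prompts marked as trainable, one fixed value left untouched.
        self.system_prompt = Param("system_prompt", "You are a careful player.", trainable=True)
        self.plan_prompt = Param("plan_prompt", "Think step by step.", trainable=True)
        self.agent_id = Param("agent_id", "player-1")

    @mark_trainable
    def select_move(self, options: List[str]) -> str:
        # Placeholder policy: pick the first option.
        return options[0]

    @mark_trainable
    def parse_response(self, text: str) -> str:
        # Placeholder parser: strip surrounding whitespace.
        return text.strip()

    def trainable_parameters(self) -> List[str]:
        """List the names of parameters an optimizer may update."""
        return [p.name for p in vars(self).values()
                if isinstance(p, Param) and p.trainable]


if __name__ == "__main__":
    agent = ToyAgent()
    print(agent.trainable_parameters(), TRAINABLE_FUNCTIONS)
```

Keeping the "what may change" markers next to the definitions is the design choice being illustrated: the agent stays ordinary Python, and the optimizer only touches what was explicitly marked.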

True RL agents learn online -- continuously changing themselves to improve based on feedback (reward) from a user or an environment. Why haven't people done this in the LLM "Agentic" libraries? We wondered the same and developed Trace -- a true *RL-style* agentic framework.

10.12.2024 19:52 - 👍 0    🔁 0    💬 1    📌 0
Trace Overview, by Allen Nie on Vimeo

Unveiling Trace v0.1.3 at NeurIPS 2024, a library for building an RL-style AI Agent that learns from the environment and human feedback. Today's LLM Agent libraries are not RL agents. They specify a workflow, and it remains unchanged regardless of user feedback. #NotRL vimeo.com/1036224270

10.12.2024 19:52 - 👍 4    🔁 0    💬 2    📌 0

An honor to have you here!! Welcome 🙏🙏

30.11.2024 04:35 - 👍 1    🔁 0    💬 0    📌 0
Anytime Acceleration of Gradient Descent: This work investigates stepsize-based acceleration of gradient descent with anytime convergence guarantees. For smooth (non-strongly) convex optimization, we propose a stepsize schedule that all...

arxiv.org/abs/2411.17668 Our postdoc Zihan slays another COLT open problem! proceedings.mlr.press/v247/kornows...

27.11.2024 13:03 - 👍 68    🔁 11    💬 1    📌 3

For people who like RL theory, this is a must follow!

26.11.2024 17:08 - 👍 2    🔁 0    💬 0    📌 0

📌

25.11.2024 14:38 - 👍 0    🔁 0    💬 0    📌 0

Can I get added? Not NLP but still working with LLMs on the RL side.

25.11.2024 02:19 - 👍 1    🔁 0    💬 0    📌 0

Hello...world?

Trying to reconstruct my academic networks over here :) Follow me if we know each other or if you're interested in machine learning for healthcare/social equity! Please retweet, or resky, or whatever they call it over here.

23.11.2024 16:46 - 👍 73    🔁 9    💬 3    📌 0

📌

24.11.2024 01:15 - 👍 0    🔁 0    💬 0    📌 0

Totally, it's a great list 😊

23.11.2024 18:24 - 👍 1    🔁 0    💬 0    📌 0

Here is a list of ML OSS & Open Source / Science enthusiasts I found on Bluesky 🦋

go.bsky.app/8MFcfXd

Let me know if you find such people here!

I'm still new here and the list probably misses many must-add people, so let's build it together 💪

21.11.2024 05:19 - 👍 114    🔁 49    💬 41    📌 4

Hi, I'm one of the main maintainers of Trace: github.com/microsoft/Tr... and will use this platform to promote it and engage with the OSS community 🫡

23.11.2024 14:19 - 👍 1    🔁 0    💬 1    📌 0

This is kinda cool honestly

23.11.2024 01:55 - 👍 0    🔁 0    💬 1    📌 0

I see...well...hope they'll include it soon 😕

23.11.2024 01:48 - 👍 1    🔁 0    💬 0    📌 0

How to save/bookmark posts on 🦋?

23.11.2024 01:38 - 👍 1    🔁 0    💬 4    📌 0

Filled up so fast 😫 but I saw some friends who made it onto the list, so I'm happy for them instead 🥳

21.11.2024 22:38 - 👍 0    🔁 0    💬 0    📌 0

I wanted to contribute to "Starter Pack Season" with one for Stanford NLP+HCI: go.bsky.app/VZBhuJ5

Here are some other great starter packs:

- CSS: go.bsky.app/GoEyD7d + go.bsky.app/CYmRvcK
- NLP: go.bsky.app/SngwGeS + go.bsky.app/JgneRQk
- HCI: go.bsky.app/p3TLwt
- Women in AI: go.bsky.app/LaGDpqg

15.11.2024 19:20 - 👍 25    🔁 10    💬 2    📌 2

@allenanie is following 20 prominent accounts