Karen Ullrich (s/h) ✈️ Neurips @karen-ullrich

YouTube video by Dr. Karen Ullrich - Machine Learning Research Starting with LLM Agents: Installing and Running OpenApps

If “getting started with agents” feels like setup hell — same.
So we made a starter tutorial:
First agent running in <14 minutes, no Docker/AWS.
Laptop + API key only.
👇 www.youtube.com/watch?v=gzNW...

10.12.2025 20:44 — 👍 2 🔁 0 💬 0 📌 0

Want to teach AI agents to use apps like humans? Get started with digital agents research using OpenApps, our new Python-based environment.

10.12.2025 15:44 — 👍 2 🔁 3 💬 1 📌 0

Start with OpenApps

⭐️ facebookresearch.github.io/OpenApps/
📖 arxiv.org/abs/2511.20766

🛠️ Want to build new apps or tasks? Contributions welcome.

Let’s push UI agents forward — together. 🚀

10.12.2025 20:20 — 👍 0 🔁 0 💬 0 📌 0

This isn’t just a benchmark — it’s a generator. 🏭

With simple YAML configs, you can spawn thousands of variants of 6 fully functional apps.

Swap themes, scramble layouts, inject “challenging” fonts…

If your agent really understands UIs, this will show it.

10.12.2025 20:20 — 👍 0 🔁 0 💬 1 📌 0

Why OpenApps?

✅ Unlimited data for evaluation and training
✅ Lightweight — single CPU, no Docker, no OS emulators
✅ Reliable rewards from ground-truth app state
✅ Plug-and-play with any RL Gym or multimodal LLM agent

10.12.2025 20:20 — 👍 0 🔁 0 💬 1 📌 0

Release Day 🎉

Meet OpenApps — a pure-Python, open-source ecosystem for stress-testing UI agents at scale.

Runs on a single CPU. Generates thousands of unique UI variations. And it reveals just how fragile today’s SOTA agents are.

(Yes, even GPT-4 and Claude struggle.)

10.12.2025 20:20 — 👍 5 🔁 0 💬 1 📌 0

Excited to be attending NeurIPS 2025 — my 10-year anniversary since I first walked into NeurIPS back in 2015, just before starting my PhD in Amsterdam.

Grateful for a decade of learning, growing, and being part of this incredible community.

If you’re around this year, would love to meet.

03.12.2025 00:27 — 👍 6 🔁 0 💬 0 📌 0

One can manipulate LLM rankings to put any model in the lead—only by modifying the single character separating demonstration examples. Learn more in our new paper arxiv.org/abs/2510.05152
w/ Jingtong Su, Jianyu Zhang, @karen-ullrich.bsky.social , and Léon Bottou.
🧵

09.10.2025 14:31 — 👍 31 🔁 3 💬 1 📌 2

Y’all, I am at #COLM this week, very excited to learn, and meet old and new friends. Please reach out on Whova!

06.10.2025 22:40 — 👍 5 🔁 0 💬 0 📌 0

Check out the full paper here: www.arxiv.org/pdf/2506.17052 🎓 Work by Jingtong Su, @kempelab.bsky.social, @nyudatascience.bsky.social , @aiatmeta.bsky.social

08.07.2025 13:49 — 👍 1 🔁 0 💬 0 📌 0

Plus, we generate importance maps showing where in the transformer the concept is encoded — providing interpretable insights into model internals.

08.07.2025 13:49 — 👍 0 🔁 0 💬 1 📌 0

SAMI: Diminishes or amplifies these modules to control the concept's influence

With SAMI, we can scale the importance of these modules — either amplifying or suppressing specific concepts.

08.07.2025 13:49 — 👍 0 🔁 0 💬 1 📌 0

SAMD: Finds the attention heads most correlated with a concept

Using SAMD, we find that only a few attention heads are crucial for a wide range of concepts—confirming the sparse, modular nature of knowledge in transformers.

08.07.2025 13:49 — 👍 0 🔁 0 💬 1 📌 0

How would you make an LLM "forget" the concept of dog — or any other arbitrary concept? 🐶❓

We introduce SAMD & SAMI — a novel, concept-agnostic approach to identify and manipulate attention modules in transformers.

08.07.2025 13:49 — 👍 14 🔁 1 💬 1 📌 1

Aligned Multi-Objective Optimization (A-🐮) has been accepted at #ICML2025! 🎉
We explore optimization scenarios where objectives align rather than conflict, introducing new scalable algorithms with theoretical guarantees. #MachineLearning #AI #Optimization

01.05.2025 19:19 — 👍 9 🔁 2 💬 0 📌 0

Screenshot of arxiv paper "EXACT BYTE-LEVEL PROBABILITIES FROM TOKENIZED LANGUAGE MODELS FOR FIM-TASKS AND MODEL ENSEMBLES."

🎉🎉 Our paper just got accepted to #ICLR2025! 🎉🎉

Byte-level LLMs without training and guaranteed performance? Curious how? Dive into our work! 📚✨

Paper: arxiv.org/abs/2410.09303
Github: github.com/facebookrese...

22.01.2025 20:57 — 👍 12 🔁 0 💬 0 📌 0

NeurIPS Poster Mission Impossible: A Statistical Perspective on Jailbreaking LLMsNeurIPS 2024

Thursday is busy:
9-11am I will be at the Meta AI Booth
12.30-2pm
Mission Impossible: A Statistical Perspective on Jailbreaking LLMs (neurips.cc/virtual/2024...)
OR
End-To-End Causal Effect Estimation from Unstructured Natural Language Data (neurips.cc/virtual/2024...)

12.12.2024 16:29 — 👍 3 🔁 0 💬 0 📌 0

Starting with Fei-Fei Li’s talk 2.30, after that I will mostly be meeting people and wonder the poster sessions.

11.12.2024 19:39 — 👍 5 🔁 0 💬 0 📌 0

Folks, I am posting my NeurIPS schedule daily in hopes to see folks, thanks @tkipf.bsky.social for the idea ;)

11-12.30 WiML round tables
1.30-4 Beyond Decoding, Tutorial

10.12.2024 19:39 — 👍 5 🔁 0 💬 1 📌 0

I will be at #Neurips2024 next week to talk about these two papers and host a workshop on #NeuralCompression.

06.12.2024 16:54 — 👍 3 🔁 0 💬 0 📌 0

next one on the list is Yury Polyanskiy's "Information Theory: From Coding to Learning" which will hopefully hit the shelfs in February... can not wait

28.11.2024 15:49 — 👍 3 🔁 0 💬 0 📌 0

Pro-tip: Use massive black Friday deals at scientific publishing houses to for example buy a copy of @jmtomczak.bsky.social
book on generative modeling (long overdue)

28.11.2024 15:49 — 👍 10 🔁 0 💬 1 📌 0

🫠

20.11.2024 20:57 — 👍 0 🔁 0 💬 1 📌 0

Me

18.11.2024 18:10 — 👍 2 🔁 0 💬 0 📌 0

Me

18.11.2024 11:21 — 👍 0 🔁 0 💬 1 📌 0

What do you think do we need to sharpen our understanding of tokenization? Or will we soon be rid of it by developing models such as "MegaByte" by
Yu et al?
And add more paper to the threat!

30.10.2024 18:29 — 👍 4 🔁 0 💬 2 📌 0

Phan et al, found a method to mitigate some of the tokenization problems Karpathy mentioned by projecting tokens into byte space. The key to their method is to develop a map between statistically equivalent token and byte-level models.

30.10.2024 18:29 — 👍 3 🔁 1 💬 0 📌 0

In "The Foundations of Tokenization:
Statistical and Computational Concerns", Gastaldi et al. try to make first steps towards defining what a tokenizer should be and define properties it ought to have.

30.10.2024 18:27 — 👍 4 🔁 0 💬 0 📌 0

In "Toward a Theory of Tokenization in LLMs" Rajaraman et al., the authors discuss why we can think of tokenization to cause lower perplexity/ a better entropy bound.

30.10.2024 18:27 — 👍 4 🔁 0 💬 0 📌 0

A must watch entry point is @karpathy.bsky.social hy's "Let's build the GPT Tokenizer" video, where he discusses some tokenization problems.

30.10.2024 18:27 — 👍 4 🔁 0 💬 0 📌 0

Karen Ullrich (s/h) ✈️ Neurips

Latest posts by karen-ullrich.bsky.social on Bluesky

@karen-ullrich is following 19 prominent accounts