Prithviraj "Raj" Ammanabrolu

@rajammanabrolu.bsky.social

AI, RL, NLP, Games. Asst Prof at UCSD. Research Scientist at Nvidia. Lab: http://pearls.ucsd.edu Personal: prithvirajva.com

4,122 Followers  |  244 Following  |  171 Posts  |  Joined: 28.04.2023

Posts by Prithviraj "Raj" Ammanabrolu (@rajammanabrolu.bsky.social)

TALES: Text Adventure Learning Environment Suite Reasoning is an essential skill to enable Large Language Models (LLMs) to interact with the world. As tasks become more complex, they demand increasingly sophisticated and diverse reasoning capabiliti...

My former MS student Chris Cui (now PhD student with @rajammanabrolu.bsky.social) motivates Text Adventure Games as testbeds for reasoning. Provides a new benchmark suite of text games. Observes that Zork still kicks LLMs' butts despite training on walkthroughs arxiv.org/abs/2504.14128

03.12.2025 00:42 - 👍 21    🔁 6    💬 0    📌 0

Ziyi Zhang, Shengqi Li (on PhD app market!) for multi agent D&D sim creation + RL openreview.net/pdf?id=3Op7k...

Me mostly if you want boba and beach recs (or NVIDIA full time RS roles I guess, I'm hiring a few ppl there but not UCSD)

24.11.2025 22:10 - 👍 1    🔁 0    💬 0    📌 0

Jenny Shen for pluralistic alignment + human feedback arxiv.org/abs/2510.01167

Chris Cui for text sims + RL + scalable oversight of reasoning models arxiv.org/abs/2504.14128

Lucas (on industry market) for how reasoning emerges from mid training/SFT to RL lucasdino.github.io/assets/files...

24.11.2025 22:10 - 👍 2    🔁 0    💬 1    📌 0

Bosung Kim for all things VLA, embodied AI, and long context memory arxiv.org/abs/2505.16928

Ruiyi Wang for multi turn agentic RL and all the RL infra in and outs arxiv.org/abs/2510.01132

24.11.2025 22:10 - 👍 1    🔁 0    💬 1    📌 0

My entire PEARLS Lab, and many NVIDIA colleagues, will be at #neurips2025 to chat about their latest. Some papers accepted to the conf are already outdated so just reach out. Thread 🧵

24.11.2025 22:10 - 👍 1    🔁 0    💬 1    📌 0

Yay congrats, Mark! Well deserved! It's def a required reading for all things comp storytelling (and creativity!)

12.11.2025 17:22 - 👍 1    🔁 0    💬 0    📌 0

I am extremely honored and humbled to have been awarded a Test-of-Time award for my 2005 paper "From Linear Story Generation to Branching Story Graphs" with R. Michael Young

12.11.2025 16:08 - 👍 72    🔁 5    💬 5    📌 1
YouTube video by IVADO: Navigating the Safety-Capability Spectrum when Teaching Agents with Feedback - Prithviraj Ammanabrolu

I've done a few versions of this talk but this is the first that's been recorded publicly, thanks to IVADO Montreal

A good overview of things my lab has been up to in the last year or so at least in balancing safety/capabilities of (embodied) AI Agents

www.youtube.com/watch?v=S-kV...

03.11.2025 23:04 - 👍 8    🔁 1    💬 0    📌 0

🔥 Excited to share our new work: "A Practitioner's Guide to Multi-turn Agentic Reinforcement Learning"!

We study what actually works for agentic multi-turn RL with varying 🌎 Environment, 🤖 Policy, and ⭐ Reward.

We conduct various ablations and empirical analysis on 🧩 TextWorld, 🧙 ALFWorld, and 🧑‍💻 SWE-Gym.

26.10.2025 21:36 - 👍 9    🔁 2    💬 1    📌 0
In-context Ranking Preference Optimization Recent developments in Direct Preference Optimization (DPO) allow large language models (LLMs) to function as implicit ranking models by maximizing the margin between preferred and non-preferred respo...

My students will be presenting IRPO arxiv.org/abs/2504.15477 and a paper on Personalized RLHF arxiv.org/abs/2504.07070 on Wed onwards

01.10.2025 18:24 - 👍 1    🔁 0    💬 0    📌 0

My students will be presenting papers next Wed/Thursday so be sure to check those out too

01.10.2025 17:22 - 👍 3    🔁 0    💬 1    📌 0

I'll be at #CoLM2025 and the IVADO agents workshop right before in Montreal. My students will be presenting two papers in the main conf. I'll also do a ws keynote where I'll talk about some of our latest. Come by and say hi next week!

01.10.2025 17:21 - 👍 8    🔁 0    💬 1    📌 0

I'm probably mostly going to stop posting on this site. There's close to no engagement and it's not worth the effort to cross post for the amount of time that takes. Find me elsewhere / email me

29.06.2025 20:54 - 👍 5    🔁 0    💬 4    📌 1

I recently left Mosaic/Databricks Research. It's been a ride building out the RL team from <4 ppl to 20+ across two companies & an acquisition + figuring out RL as a Service in prod. Mosaic had insane talent density

Some "relaxation" while I put out Prof fires for a smol bit then new adventures!

19.06.2025 15:48 - 👍 7    🔁 0    💬 0    📌 0
/call_for_papers Official website for the Wordplay Workshop at EMNLP 2025. Exploring interactive narratives, text-adventure games, and AI agents in language-based environments. Join us in Suzhou, China, November 5th-9...

If you work in the intersection of NLP and games/narrative, then this workshop is for you! wordplay-workshop.github.io/cfp/

Organized by the amazing @laramartin.net and @rajammanabrolu.bsky.social (among others)

17.06.2025 16:47 - 👍 10    🔁 5    💬 0    📌 0

The thing that feels so off about the core tech world is that every convo is very transactional. Maybe true elsewhere too. "Oh you're an expert in RL, can you answer questions about my new startup?"

Every single (Bay) party. No I do not want to consult. I just wanna hang out.

16.06.2025 05:54 - 👍 9    🔁 0    💬 1    📌 0

Of all the labeling startups out there to acquihire, this was... an interesting choice. Says a lot actually

10.06.2025 17:34 - 👍 0    🔁 0    💬 1    📌 0

. @bosungkim.bsky.social will be at #CVPR2025 in Nashville this week to present this and just generally talk about scaling memory for embodied agents!

Catch her at the poster sessions and also the Foundation Models meets Embodied Agents Workshop on Wed

09.06.2025 16:56 - 👍 2    🔁 0    💬 0    📌 0

Yes AI for edu is a thing but almost all vanilla LLMs just railroad students into answers. Complete cognitive offload is not useful for improving learning outcomes

09.06.2025 15:24 - 👍 2    🔁 0    💬 0    📌 0

I've heard this personally from multiple PMs at AI companies. Students are one of the biggest demographics and they need to "break in" and have even more usage to improve their metrics. Classic corporate economic incentives

09.06.2025 15:24 - 👍 7    🔁 4    💬 1    📌 0

Tis the era of bringing back every AI benchmark ever but this time by the LLM people and for the LLMs

06.06.2025 22:57 - 👍 2    🔁 1    💬 0    📌 0
talks.cam : Language Technology Lab Seminars

Had a fun little visit to Cambridge LTL where I talked about a bunch of my lab's latest papers including some still not public with the key takeaway that "RL can absolutely learn new things and is not just resurfacing knowledge"
talks.cam.ac.uk/show/archive...

05.06.2025 17:51 - 👍 4    🔁 1    💬 0    📌 0

That's fair, I guess I should rephrase to "regardless of a possible common prior, it's nearly impossible for different providers to have the same representations pop out of their post trained LLM"

03.06.2025 04:50 - 👍 0    🔁 0    💬 1    📌 0

The moral of the story here is basically that who is making your LLM really matters. Internal use cases critical to their businesses will always influence data distributions and everything downstream of that. This is in contrast to things like Platonic Representation Hypothesis

03.06.2025 04:05 - 👍 5    🔁 0    💬 1    📌 0
YouTube video by Dwarkesh Patel: Xi Jinping's paranoid approach to AGI, debt crisis, & Politburo politics - Victor Shih

Interesting tidbit from UCSD's Victor Shih on a podcast talking about Chinese AGI efforts is that DeepSeek is good at Chinese govt doc understanding cause that's what affects stock prices most and DS is a hedge fund.
www.youtube.com/watch?v=b1Te...

03.06.2025 04:05 - 👍 5    🔁 0    💬 1    📌 0
JEE Advanced 2025 topper list out, Rajit Gupta secures AIR 1 with 332 marks: Check complete list of toppers IIT Kanpur has released the topper list for JEE Advanced 2025. This year, Rajit Gupta achieved AIR 1 by scoring 332 out of 360 marks. Devdutta Majhi is the leading female candidate with AIR 16. Scroll...

The top two scores are 332 not 322 but other than the typo the rest of this list seems legit and consistent across multiple sources
www.indiatvnews.com/education/hi...

x.com/RejaullahmdM...

02.06.2025 06:55 - 👍 0    🔁 0    💬 0    📌 0

Looks like Gemini gets AIR 6 in #JEE2025 with a score of 323

Only 5 high schoolers in all India do better than an LLM in the single most important exam of their lives to get into the IITs

The legacy edu selection systems are now worse than useless

02.06.2025 06:54 - 👍 4    🔁 0    💬 1    📌 1

I get prepping for worst case scenarios but a lot of AI Safety debates I somehow end up in these days boil down to "assume you have Machine God in a box, now tell me how to align it"

I could rant for hours but seriously y'all this isn't productive

01.06.2025 01:47 - 👍 8    🔁 0    💬 1    📌 0

Here for the afternoon shift!

24.05.2025 21:06 - 👍 1    🔁 0    💬 0    📌 0
Beyond Needle(s) in the Embodied Haystack: Environment, Architecture, and Training Considerations for Long Context Reasoning We introduce $\infty$-THOR, a new framework for long-horizon embodied tasks that advances long-context understanding in embodied AI. $\infty$-THOR provides: (1) a generation framework for synthesizing...

Paper: arxiv.org/abs/2505.16928
Website/code/data:
pearls-lab.github.io/infini-thor/

Led by @bosungkim.bsky.social who has done a fantastic job on this in the last bit. Full stack from Unity gamedev to Big Model Scaler. Watch out for her in the embodied agent space!

23.05.2025 16:07 - 👍 2    🔁 0    💬 0    📌 0