Prithviraj "Raj" Ammanabrolu (@rajammanabrolu) — Bluesky Profile

3 months ago

TALES: Text Adventure Learning Environment Suite Reasoning is an essential skill to enable Large Language Models (LLMs) to interact with the world. As tasks become more complex, they demand increasingly sophisticated and diverse reasoning capabiliti...

My former MS student Chris Cui (now PhD student with @rajammanabrolu.bsky.social) motivates Text Adventure Games as testbeds for reasoning. Provides a new benchmark suite of text games. Observes that Zork still kicks LLMs' butts despite training on walkthroughs arxiv.org/abs/2504.14128

3 months ago

Ziyi Zhang, Shengqi Li (on PhD app market!) for multi agent D&D sim creation + RL openreview.net/pdf?id=3Op7k...

Me mostly if you want boba and beach recs (or NVIDIA full time RS roles I guess, I'm hiring a few ppl there but not UCSD)

3 months ago

Jenny Shen for pluralistic alignment + human feedback arxiv.org/abs/2510.01167

Chris Cui for text sims + RL + scalable oversight of reasoning models arxiv.org/abs/2504.14128

Lucas (on industry market) for how reasoning emerges from mid training/SFT to RL lucasdino.github.io/assets/files...

3 months ago

Bosung Kim for all things VLA, embodied AI, and long context memory arxiv.org/abs/2505.16928

Ruiyi Wang for multi turn agentic RL and all the RL infra ins and outs arxiv.org/abs/2510.01132

3 months ago

My entire PEARLS Lab, and many NVIDIA colleagues, will be at #neurips2025 to chat about their latest. Some papers accepted to the conf are already outdated so just reach out. Thread 🧵

3 months ago

Yay congrats, Mark! Well deserved! It's def a required reading for all things comp storytelling (and creativity!)

3 months ago

I am extremely honored and humbled to have been awarded a Test-of-Time award for my 2005 paper "From Linear Story Generation to Branching Story Graphs" with R. Michael Young

4 months ago

Navigating the Safety-Capability Spectrum when Teaching Agents with Feedback -Prithviraj Ammanabrolu YouTube video by IVADO

I've done a few versions of this talk but this is the first that's been recorded publicly, thanks to IVADO Montreal

A good overview of things my lab has been up to in the last year or so at least in balancing safety/capabilities of (embodied) AI Agents

www.youtube.com/watch?v=S-kV...

4 months ago

🔥Excited to share our new work: "A Practitioner's Guide to Multi-turn Agentic Reinforcement Learning"!

We study what actually works for agentic multi-turn RL with varying 🌎Environment, 🤖Policy, and ⭐Reward.

We conduct various ablations and empirical analysis on 🧩TextWorld, 🧙ALFWorld, and 🧑‍💻SWE-Gym.

5 months ago

In-context Ranking Preference Optimization Recent developments in Direct Preference Optimization (DPO) allow large language models (LLMs) to function as implicit ranking models by maximizing the margin between preferred and non-preferred respo...

My students will be presenting IRPO arxiv.org/abs/2504.15477 and a paper on Personalized RLHF arxiv.org/abs/2504.07070 on Wed onwards

5 months ago

My students will be presenting papers next Wed/Thursday so be sure to check those out too

5 months ago

I'll be at #CoLM2025 and the IVADO agents workshop right before in Montreal. My students will be presenting two papers in the main conf. I'll also do a ws keynote where I'll talk about some of our latest. Come by and say hi next week!

8 months ago

I'm probably mostly going to stop posting on this site. There's close to no engagement and it's not worth the effort to cross post for the amount of time that takes. Find me elsewhere / email me

8 months ago

I recently left Mosaic/Databricks Research. It's been a ride building out the RL team from <4 ppl to 20+ across two companies & an acquisition + figuring out RL as a Service in prod. Mosaic had insane talent density

Some "relaxation" while I put out Prof fires for a smol bit then new adventures!

8 months ago

Official website for the Wordplay Workshop at EMNLP 2025. Exploring interactive narratives, text-adventure games, and AI agents in language-based environments. Join us in Suzhou, China, November 5th-9...

If you work in the intersection of NLP and games/narrative, then this workshop is for you! wordplay-workshop.github.io/cfp/

Organized by the amazing @laramartin.net and @rajammanabrolu.bsky.social (among others)

8 months ago

The thing that feels so off about the core tech world is that every convo is very transactional. Maybe true elsewhere too. "Oh you're an expert in RL, can you answer questions about my new startup?"

Every single (Bay) party. No I do not want to consult. I just wanna hang out.

9 months ago

Of all the labeling startups out there to acquihire, this was... an interesting choice. Says a lot actually

9 months ago

. @bosungkim.bsky.social will be at #CVPR2025 in Nashville this week to present this and just generally talk about scaling memory for embodied agents!

Catch her at the poster sessions and also the Foundation Models meets Embodied Agents Workshop on Wed

9 months ago

Yes AI for edu is a thing but almost all vanilla LLMs just railroad students into answers. Complete cognitive offload is not useful for improving learning outcomes

9 months ago

I've heard this personally from multiple PMs at AI companies. Students are one of the biggest demographics and they need to "break in" and have even more usage to improve their metrics. Classic corporate economic incentives

9 months ago

Tis the era of bringing back every AI benchmark ever but this time by the LLM people and for the LLMs

9 months ago

talks.cam : Language Technology Lab Seminars

Had a fun little visit to Cambridge LTL where I talked about a bunch of my lab's latest papers including some still not public with the key takeaway that "RL can absolutely learn new things and is not just resurfacing knowledge"
talks.cam.ac.uk/show/archive...

9 months ago

That's fair, I guess I should rephrase to "regardless of a possible common prior, it's nearly impossible for different providers to have the same representations pop out of their post trained LLM"

9 months ago

The moral of the story here is basically that who is making your LLM really matters. Internal use cases critical to their businesses will always influence data distributions and everything downstream of that. This is in contrast to things like the Platonic Representation Hypothesis

9 months ago

Xi Jinping’s paranoid approach to AGI, debt crisis, & Politburo politics — Victor Shih YouTube video by Dwarkesh Patel

Interesting tidbit from UCSD's Victor Shih on a podcast talking about Chinese AGI efforts is that DeepSeek is good at Chinese govt doc understanding cause that's what affects stock prices most and DS is a hedge fund.
www.youtube.com/watch?v=b1Te...

9 months ago

JEE Advanced 2025 topper list out, Rajit Gupta secures AIR 1 with 332 marks: Check complete list of toppers IIT Kanpur has released the topper list for JEE Advanced 2025. This year, Rajit Gupta achieved AIR 1 by scoring 332 out of 360 marks. Devdutta Majhi is the leading female candidate with AIR 16. Scroll...

The top two scores are 332 not 322 but other than the typo the rest of this list seems legit and consistent across multiple sources
www.indiatvnews.com/education/hi...

x.com/RejaullahmdM...

9 months ago

Looks like Gemini gets AIR 6 in #JEE2025 with a score of 323

Only 5 high schoolers in all of India do better than an LLM in the single most important exam of their lives to get into the IITs

The legacy edu selection systems are now worse than useless

9 months ago

I get prepping for worst case scenarios but a lot of the AI Safety debates I somehow end up in these days boil down to "assume you have Machine God in a box, now tell me how to align it"

I could rant for hours but seriously y'all this isn't productive

9 months ago

Here for the afternoon shift!

9 months ago

Beyond Needle(s) in the Embodied Haystack: Environment, Architecture, and Training Considerations for Long Context Reasoning We introduce $\infty$-THOR, a new framework for long-horizon embodied tasks that advances long-context understanding in embodied AI. $\infty$-THOR provides: (1) a generation framework for synthesizing...

Paper: arxiv.org/abs/2505.16928
Website/code/data:
pearls-lab.github.io/infini-thor/

Led by @bosungkim.bsky.social who has done a fantastic job on this in the last bit. Full stack from Unity gamedev to Big Model Scaler. Watch out for her in the embodied agent space!