Kunal Jha

@kjha02.bsky.social

CS PhD Student @University of Washington, CSxPhilosophy @Dartmouth College. Interested in MARL, Social Reasoning, and Collective Decision Making in people, machines, and other organisms. kjha02.github.io

104 Followers  |  351 Following  |  31 Posts  |  Joined: 21.11.2024

Latest posts by kjha02.bsky.social on Bluesky

Modeling Others’ Minds as Code How can AI quickly and accurately predict the behaviors of others? We show an AI which uses Large Language Models to synthesize agent behavior into Python programs, then Bayesian Inference to reason a...

Sorry, try kjha02.github.io/publication/...

04.10.2025 19:43 - 👍 0    🔁 0    💬 0    📌 0

Sorry, the emoji text got messed up. The URL is kjha02.github.io/publication/...

03.10.2025 23:32 - 👍 0    🔁 0    💬 1    📌 0

Correctly tagging @aydanhuang265.bsky.social !

03.10.2025 17:28 - 👍 2    🔁 0    💬 0    📌 0

For more analyses and insights, check out the paper and code: shorturl.at/siUYI

Can’t thank my collaborators @aydan_huang265, @EricYe29011995, @natashajaques.bsky.social, @maxkw.bsky.social enough for all the help and support!!!

03.10.2025 02:27 - 👍 2    🔁 0    💬 1    📌 0

The big takeaway: framing behavior prediction as a program synthesis problem is an accurate, scalable, and efficient path to human-compatible AI!

It allows multi-agent systems to rapidly and accurately anticipate others' actions for more effective collaboration.

03.10.2025 02:26 - 👍 1    🔁 0    💬 1    📌 0

ROTE doesn’t sacrifice accuracy for speed!

While initial program generation takes time, the inferred code can be executed rapidly, making it orders of magnitude more efficient than other LLM-based methods for long-horizon predictions.

03.10.2025 02:26 - 👍 0    🔁 0    💬 1    📌 0

What explains this performance gap? ROTE handles complexity better. It excels with intricate tasks like cleaning and interacting with objects (e.g., turning items on/off) in Partnr, while baselines only showed success with simpler navigation and object manipulation.

03.10.2025 02:26 - 👍 0    🔁 0    💬 1    📌 0

We scaled up to the embodied robotics simulator Partnr, a complex, partially observable environment with goal-directed LLM-agents.

ROTE still significantly outperformed all LLM-based and behavior cloning baselines for high-level action prediction in this domain!

03.10.2025 02:25 - 👍 1    🔁 0    💬 1    📌 0

A key strength of code: zero-shot generalization.

Programs inferred from one environment transfer to new settings more effectively than all other baselines. ROTE's learned programs transfer without needing to re-incur the cost of text generation.

03.10.2025 02:25 - 👍 0    🔁 0    💬 1    📌 0

Can scripts model nuanced, real human behavior?

We collected human gameplay data and found ROTE not only outperformed all baselines but also achieved human-level performance when predicting the trajectories of real people!

03.10.2025 02:25 - 👍 0    🔁 0    💬 1    📌 0

Introducing ROTE (Representing Others’ Trajectories as Executables)!

We use LLMs to generate Python programs 💻 that model observed behavior, then use Bayesian inference to select the most likely ones. The result: A dynamic, composable, and analyzable predictive representation!

03.10.2025 02:24 - 👍 2    🔁 0    💬 1    📌 0
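
To make the "generate programs, then select with Bayesian inference" idea above concrete, here is a minimal sketch. It is not the paper's implementation: the two candidate programs are hand-written stand-ins for LLM output, and the uniform prior, the `eps` noise model, and the names `posterior` and `predict` are all assumptions made for illustration.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

# A "program" is any callable mapping an observation to a predicted action.
Program = Callable[[int], str]
Step = Tuple[int, str]  # (observation, observed action)


# Hypothetical hand-written candidates standing in for LLM-generated scripts.
def always_right(obs: int) -> str:
    return "right"


def alternate(obs: int) -> str:
    return "right" if obs % 2 == 0 else "left"


def posterior(programs: Sequence[Program], trajectory: Sequence[Step],
              eps: float = 0.1, n_actions: int = 2) -> List[float]:
    """Posterior over programs under a uniform prior and an eps-noisy likelihood."""
    log_liks = []
    for prog in programs:
        ll = 0.0
        for obs, action in trajectory:
            # Assumed noise model: the program's own action gets probability
            # 1 - eps, and the remaining eps is split over the other actions.
            p = 1.0 - eps if prog(obs) == action else eps / (n_actions - 1)
            ll += math.log(p)
        log_liks.append(ll)
    # Normalize in log space for numerical stability.
    top = max(log_liks)
    weights = [math.exp(ll - top) for ll in log_liks]
    total = sum(weights)
    return [w / total for w in weights]


def predict(programs: Sequence[Program], post: Sequence[float], obs: int) -> str:
    """Posterior-weighted vote over the actions the programs propose."""
    votes: Dict[str, float] = {}
    for prog, weight in zip(programs, post):
        action = prog(obs)
        votes[action] = votes.get(action, 0.0) + weight
    return max(votes, key=lambda a: votes[a])


programs = [always_right, alternate]
trajectory = [(0, "right"), (1, "left"), (2, "right"), (3, "left")]
post = posterior(programs, trajectory)
next_action = predict(programs, post, obs=5)
```

In this toy trajectory the observed actions alternate, so most of the posterior mass lands on `alternate`. Once the weights are computed, each new prediction only requires running the cached programs, with no further text generation.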

Traditional AI is stuck! Predicting behavior is either brittle (Behavior Cloning) or too slow with endless belief space enumeration (Inverse Planning).

How can we avoid mental state dualism while building scalable, robust predictive models?

03.10.2025 02:24 - 👍 1    🔁 0    💬 2    📌 0

Forget modeling every belief and goal! What if we represented people as following simple scripts instead (e.g., "cross the crosswalk")?

Our new paper shows AI which models others’ minds as Python code 💻 can quickly and accurately predict human behavior!

shorturl.at/siUYI%F0%9F%...

03.10.2025 02:24 - 👍 36    🔁 14    💬 3    📌 3
Person standing next to poster titled "When Empowerment Disempowers"

Still catching up on my notes after my first #cogsci2025, but I'm so grateful for all the conversations and new friends and connections! I presented my poster "When Empowerment Disempowers" -- if we didn't get the chance to chat or you would like to chat more, please reach out!

06.08.2025 22:31 - 👍 16    🔁 3    💬 0    📌 1
Evolving general cooperation with a Bayesian theory of mind | PNAS Theories of the evolution of cooperation through reciprocity explain how unrelated self-interested individuals can accomplish more together than th...

Our new paper is out in PNAS: "Evolving general cooperation with a Bayesian theory of mind"!

Humans are the ultimate cooperators. We coordinate on a scale and scope no other species (nor AI) can match. What makes this possible? 🧡

www.pnas.org/doi/10.1073/...

22.07.2025 06:03 - 👍 92    🔁 36    💬 2    📌 2

Really pumped for my Oral presentation on this work today!!! Come check out the RL session from 3:30-4:30pm in West Ballroom B

You can also swing by our poster from 4:30-7pm in West Exhibition Hall B2-B3 # W-713

See you all there!

15.07.2025 14:46 - 👍 4    🔁 1    💬 0    📌 0

I'll be at ICML next week! If anyone wants to chat about single/multi-agent RL, continual learning, cognitive science, or something else, shoot me a message!!!

08.07.2025 13:09 - 👍 2    🔁 0    💬 0    📌 0

Oral @icmlconf.bsky.social !!! Can't wait to share our work and hear the community's thoughts on it, should be a fun talk!

Can't thank my collaborators enough: @cogscikid.bsky.social @liangyanchenggg @simon-du.bsky.social @maxkw.bsky.social @natashajaques.bsky.social

09.06.2025 16:32 - 👍 9    🔁 2    💬 0    📌 0
Cross-environment Cooperation Enables Zero-shot Multi-agent Coordination How can AI develop the ability to cooperate with novel people on novel problems? We show AI learning to cooperate in “self-play” with one partner on many environments helps agents meta-learn to cooper...

For the paper and code, see kjha02.github.io/publication/...

Thank you so much to my collaborators @cogscikid.bsky.social @liangyanchenggg @simon-du.bsky.social @maxkw.bsky.social @natashajaques.bsky.social for making the first publication of my PhD a fun one!!!

19.04.2025 00:11 - 👍 2    🔁 0    💬 0    📌 0

The big takeaway: Environment diversity > Partner diversity

Training across diverse tasks teaches agents how to cooperate, not just whom to cooperate with. This enables zero-shot coordination with novel partners in novel environments, a critical step toward human-compatible AI.

19.04.2025 00:09 - 👍 1    🔁 0    💬 1    📌 0
GitHub - wcarvalho/nicewebrl: Python library for easily making web Apps to compare humans and AI

Our work used NiceWebRL, a Python-based package we helped develop for evaluating Human, Human-AI, and Human-Human gameplay on Jax-based RL environments!

This tool makes crowdsourcing data for CS and CogSci studies easier than ever!

Learn more: github.com/wcarvalho/ni...

19.04.2025 00:09 - 👍 3    🔁 0    💬 1    📌 1

Why do humans prefer CEC agents? They collide less and adapt better to human behavior.
This increased adaptability reflects general norms for cooperation learned across many environments, not just memorized strategies.

19.04.2025 00:09 - 👍 0    🔁 0    💬 1    📌 0

Human studies confirm our findings! CEC agents achieve higher success rates with human partners than population-based methods like FCP. Humans also rate them as better collaborators than the SOTA approach (E3T), even though the CEC agents never saw the level during training.

19.04.2025 00:08 - 👍 1    🔁 0    💬 1    📌 0

Using empirical game theory analysis, we show CEC agents emerge as the dominant strategy in a population of different agent types during Ad-hoc Teamplay!

When diverse agents must collaborate, the CEC-trained agents are selected for their adaptability and cooperative skills.

19.04.2025 00:08 - 👍 3    🔁 0    💬 1    📌 0
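
One common way to study which agent type gets selected in a mixed population is replicator dynamics over an empirical payoff matrix. The sketch below illustrates that general technique only. Whether the paper's analysis uses exactly this procedure is an assumption, and the payoff numbers, the three agent types, and the name `replicator_step` are hypothetical, not results from the paper.

```python
from typing import List


def replicator_step(shares: List[float], payoff: List[List[float]],
                    dt: float = 0.1) -> List[float]:
    """One Euler step of replicator dynamics on population shares."""
    n = len(shares)
    # Expected payoff of each type against the current population mix.
    fitness = [sum(payoff[i][j] * shares[j] for j in range(n)) for i in range(n)]
    average = sum(shares[i] * fitness[i] for i in range(n))
    # Types that beat the population average grow, the others shrink.
    updated = [shares[i] + dt * shares[i] * (fitness[i] - average) for i in range(n)]
    total = sum(updated)
    return [x / total for x in updated]


# Hypothetical payoffs (row type's reward when paired with column type).
# These numbers are made up for illustration and are NOT from the paper.
payoff = [
    [3.0, 3.0, 3.0],  # "generalist": cooperates equally well with every partner
    [2.0, 4.0, 1.0],  # "specialist A": strong only with copies of itself
    [2.0, 1.0, 4.0],  # "specialist B": strong only with copies of itself
]

shares = [1 / 3, 1 / 3, 1 / 3]  # start from an even mix of the three types
for _ in range(500):
    shares = replicator_step(shares, payoff)
```

With these made-up payoffs and an even starting mix, the generalist's share grows toward 1 because it is the only type that does well against every partner. A different starting mix or payoff matrix can produce a different outcome.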

The result? CEC agents significantly outperform baselines when collaborating zero-shot with novel partners in novel environments.

Even more impressive: CEC agents outperform methods that were specifically trained on the test environment but struggle to adapt to new partners!

19.04.2025 00:08 - 👍 2    🔁 0    💬 1    📌 0

We built a Jax-based procedural generator creating billions of solvable Overcooked challenges.

Unlike prior work studying only 5 layouts, we can now study cooperative skill transfer at unprecedented scale (1.16e17 possible environments)!

Code available at: shorturl.at/KxAjW

19.04.2025 00:07 - 👍 1    🔁 0    💬 1    📌 0

We introduce Cross-Environment Cooperation (CEC), where agents learn through self-play across procedurally generated environments.

CEC teaches robust task representations rather than memorized strategies, enabling zero-shot coordination with humans and other AIs!

19.04.2025 00:06 - 👍 1    🔁 0    💬 1    📌 0

Current AI cooperation algos form brittle strategies by focusing on partner diversity in fixed tasks.

E.g., they might learn a specific handshake but fail when greeted with a fist bump.

How can AI learn general norms that work across contexts and partners?

19.04.2025 00:06 - 👍 2    🔁 0    💬 1    📌 0

Our new paper (first one of my PhD!) on cooperative AI reveals a surprising insight: Environment Diversity > Partner Diversity.

Agents trained in self-play across many environments learn cooperative norms that transfer to humans on novel tasks.

shorturl.at/fqsNN%F0%9F%...

19.04.2025 00:06 - 👍 25    🔁 7    💬 1    📌 5

That's true! I think the significance of not assigning the same meaning to symbols may only matter when we're interacting with the agent, but definitely room to explore what we mean by "understand" here and how anyone can learn to grasp the full affordances of objects with underspecified properties!

07.01.2025 05:03 - 👍 0    🔁 0    💬 0    📌 0
