Feel free to check out my new LessWrong post for a high-level summary of this work! www.lesswrong.com/posts/dP8J6v...
04.12.2025 12:41
@anayebi.bsky.social
Assistant Professor of Machine Learning, Carnegie Mellon University (CMU). Building a Natural Science of Intelligence. Prev: ICoN Postdoctoral Fellow @MIT, PhD @Stanford NeuroAILab. Personal Website: https://cs.cmu.edu/~anayebi
Matt's slides on Interactive World Models: www.cs.cmu.edu/~mgormley/co...
My slides on the Science of AI Alignment: www.cs.cmu.edu/~mgormley/co...
...and that's a wrap for Fall 2025! In the final lecture of the semester, Matt Gormley & I covered bleeding-edge research topics in Generative AI, namely Interactive World Models + Science of AI Alignment.
Next semester we plan to have our recordings publicly available on YouTube -- stay tuned!
The 2nd paper circumvents the first paper's main "no free lunch" barrier of encoding "all human values", by identifying small value sets that yield the *first* formal guarantees on corrigibility.
In the AAAI Machine Ethics Workshop (W37) Proceedings:
bsky.app/profile/anay...
We have 2 papers accepted to #AAAI2026 this year!
The first paper, on intrinsic barriers to alignment (establishing no free lunch theorems for encoding "all human values" & the inevitability of reward hacking), will appear as an *oral* presentation at the Special Track on AI Alignment.
Slides: www.cs.cmu.edu/~mgormley/co...
Full course info: bsky.app/profile/anay...
In today's Generative AI lecture, we cover code generation & autonomous agents, discussing how GitHub Copilot works, diving into multimodal agents (like Gemini 3 Pro!), and ending on AI scientists & AI for science. Lots more to explore in this rapidly growing space!
19.11.2025 21:21
Join us December 5th at University of Toronto (in-person and online) for a special seminar by Dr. Aran Nayebi on reverse-engineering the brain and building neuroscience-inspired artificial intelligence.
#neuroAI #compneuro @anayebi.bsky.social @utoronto.ca @uoftcompsci.bsky.social
Slides: www.cs.cmu.edu/~mgormley/co...
Full course info: bsky.app/profile/anay...
In today's Generative AI lecture, we dive into reasoning models by dissecting how DeepSeek-R1 works (GRPO vs. PPO: GRPO removes the need for a separate value network and trains with a simpler rule-based reward), and end on mechanistic interpretability to better understand those reasoning traces.
10.11.2025 20:46
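As a rough illustration (not taken from the lecture slides), here is a minimal sketch of the group-relative baseline that GRPO substitutes for PPO's learned value network; the function name and reward values are hypothetical.

```python
import numpy as np

def group_relative_advantages(rewards):
    """GRPO-style advantages: score each sampled completion against the
    mean/std of its own group of samples, rather than a learned critic."""
    rewards = np.asarray(rewards, dtype=np.float64)
    return (rewards - rewards.mean()) / (rewards.std() + 1e-8)

# Example: 4 completions sampled for one prompt, scored by a rule-based
# reward (e.g. 1.0 if the final answer is correct, else 0.0).
print(group_relative_advantages([1.0, 0.0, 0.0, 1.0]))  # ~[1, -1, -1, 1]
```

These advantages then plug into a PPO-style clipped objective, with no separate value network to train.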
Finally, we briefly discuss Querying Transformers for text-image alignment, as a hold-over from last lecture on multimodal foundation models!
23.10.2025 13:44
We also discuss data quality & quantity (where a smaller model trained on many more tokens can outperform a larger one), how to get good data depending on your application, and Moravec's paradox for robotics foundation models.
23.10.2025 13:44
In today's Generative AI lecture, we primarily discuss scaling laws and the key factors that go into building large-scale foundation models.
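As a back-of-the-envelope illustration of that compute/data trade-off (the constants and the C ~ 6*N*D compute rule below are placeholders loosely modeled on published Chinchilla-style fits, not figures from the lecture):

```python
# Hypothetical parametric loss L(N, D) = E + A/N^alpha + B/D^beta,
# with N = parameters and D = training tokens; constants are illustrative.
def predicted_loss(N, D, E=1.69, A=406.4, B=410.7, alpha=0.34, beta=0.28):
    return E + A / N**alpha + B / D**beta

# Fix a training-compute budget (roughly C ~ 6*N*D FLOPs) and compare model
# sizes: a smaller model trained on many more tokens can beat a larger one.
C = 1e21
for N in (1e8, 1e9, 1e10, 1e11):
    D = C / (6 * N)
    print(f"N={N:.0e} params, D={D:.1e} tokens -> loss ~ {predicted_loss(N, D):.2f}")
```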
Slides: www.cs.cmu.edu/~mgormley/co...
Full course info: bsky.app/profile/anay...
Full paper (to appear in NeurIPS 2025!) here: arxiv.org/abs/2506.00138
21.10.2025 02:41
Congratulations to my Ph.D. student Reece Keller for winning the best talk award at #CRSy25 on our project building the first task-optimized autonomous agent that predicts whole-brain data! Check out the post below for other cool talks!!
Detailed summary: bsky.app/profile/reec...
Congrats to this year's Nobel Prize winners!
Philippe's seminal work is in fact what our recent closed-form AI-capability threshold for UBI builds on: bsky.app/profile/anay...
Thanks @undo-hubris.bsky.social for the invite & for hosting!
Slides: anayebi.github.io/files/slides...
Paper 1 (alignment barriers): arxiv.org/abs/2502.05934
Paper 1 summary: bsky.app/profile/anay...
Paper 2 (corrigibility): arxiv.org/abs/2507.20964
Paper 2 summary: bsky.app/profile/anay...
My ILIAD '25 talk, "Intrinsic Barriers & Pathways to Alignment": why "aligning to all human values" provably can't work, why reward hacking is inevitable in large state spaces, & how small value sets bypass "no free lunch" limits to yield formal corrigibility.
www.youtube.com/watch?v=Oajq...
A nice application of our NeuroAI Turing Test! Check out @ithobani.bsky.social's thread for more details on comparing brains to machines!
Academic paper: bsky.app/profile/anay...
05.10.2025 15:23
Honored to be quoted in this @newsweek.com article discussing how AI could accelerate the need for UBI.
Read more here: www.newsweek.com/ai-taking-jo...
Next time we discuss how to optimize these reward models via DPO/policy gradients!
Slides: www.cs.cmu.edu/~mgormley/co...
Full course info: bsky.app/profile/anay...
Specifically, we cover methods that don't involve parameter updates, e.g. In-Context Learning / Prompt Engineering / Chain-of-Thought Prompting, through to methods that do, such as Instruction Fine-Tuning (IFT) and building on IFT to perform full-fledged Reinforcement Learning from Human Feedback (RLHF).
01.10.2025 19:46
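For concreteness, one common way the reward model behind RLHF is fit is a pairwise (Bradley-Terry-style) loss over human preference data; the sketch below is illustrative, with made-up scores rather than anything from the course materials.

```python
import torch
import torch.nn.functional as F

def pairwise_reward_loss(r_chosen, r_rejected):
    """Push the reward model's score for the human-preferred response
    above its score for the rejected one: -log sigmoid(r_chosen - r_rejected)."""
    return -F.logsigmoid(r_chosen - r_rejected).mean()

# Hypothetical scalar scores from a reward head over three response pairs.
r_chosen = torch.tensor([1.3, 0.2, 0.9])
r_rejected = torch.tensor([0.4, 0.5, -0.1])
print(pairwise_reward_loss(r_chosen, r_rejected))
```

That fitted reward model is then what the policy gets optimized against, whether via policy gradients or sidestepped by DPO.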
In today's Generative AI lecture, we talk about all the different ways to take a giant auto-complete engine like an LLM and turn it into a useful chat assistant.
01.10.2025 19:46
Slides: www.cs.cmu.edu/~mgormley/co...
Full course info: bsky.app/profile/anay...
In today's Generative AI lecture, we discuss the 4 primary approaches to Parameter-Efficient Fine-Tuning (PEFT): subset fine-tuning (updating only part of the existing weights), adapters, Prefix/Prompt Tuning, and Low-Rank Adaptation (LoRA).
We show each of these amounts to finetuning a different aspect of the Transformer.
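To make the LoRA case concrete, here is a minimal sketch (illustrative, not the course's reference implementation) of wrapping a frozen linear layer with a trainable low-rank update:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Freeze a pretrained linear layer and learn a low-rank update:
    y = W x + (alpha / r) * B A x, training only A and B."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # pretrained weights stay frozen
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: no change at start
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

# Wrapping a 768x768 projection adds only 2 * r * 768 trainable parameters.
layer = LoRALinear(nn.Linear(768, 768))
print(sum(p.numel() for p in layer.parameters() if p.requires_grad))  # 12288
```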
6/6 I close with reflections on AI safety and alignment, and the Q&A explores open questions: from building physically accurate (not just photorealistic) world models to the role of autoregression and scale.
Watch here: www.youtube.com/watch?v=5deM...
Slides: anayebi.github.io/files/slides...
5/6 I also touch on the Contravariance Principle/Platonic Representation Hypothesis, our proposed NeuroAI Turing Test, and why embodied agents are essential for building not just more capable, but also more reliable, autonomous systems.
29.09.2025 14:02
4/6 This journey culminates in our first task-optimized "NeuroAgent", integrating advances in visual and tactile perception (including our NeurIPS '25 oral), mental simulation, memory, and intrinsic curiosity.
29.09.2025 14:02
3/6 By grounding agents in perception, prediction, planning, memory, and intrinsic motivation, and validating them against large-scale neural data from rodents, primates, and zebrafish, we show how neuroscience and machine learning can form a unified *science of intelligence*.
29.09.2025 14:02