I'm excited about Andy's work -- generating scenarios that force LLMs to choose between conflicting values, allowing us to see which values they prioritize. Might be used for training in the future! We also show the importance of open-ended (vs multiple choice) evaluation.
03.10.2025 16:13
The IVADO #Bootcamp marked the launch of the Thematic Semester on Autonomous #LLM Agents last week at the MIL Campus of @umontreal.ca. Over 4 days, researchers, experts, and #AI enthusiasts gathered for conferences, tutorials, and rich discussions, laying the groundwork for our next two workshops.
19.08.2025 14:37
2) RepoST.
We automatically create executable environments from real GitHub repos, allowing us to train and evaluate models for function generation in real-world contexts.
Presenting at the CODEML workshop on Fri Jul 18th.
Also accepted to COLM, upcoming!
16.07.2025 18:33
1) Agent Workflow Memory.
Allow agents to adapt online to carry out new tasks more accurately by inducing workflows for common sub-tasks.
Today (Wed 7/16), 4:30-7pm, West Exhibition Hall B2-B3 (W-202).
Also at the CUA workshop, morning of Sat 7/19.
16.07.2025 18:33
Excited to be presenting two of our papers at #ICML2025 and workshops, today through Saturday! Topics are memory for agents, and constructing coding environments for training & evaluation. See links below:
16.07.2025 18:30
PragLM @ COLM '25
IMPORTANT DATES
Happy to announce the first workshop on Pragmatic Reasoning in Language Models: PragLM @ COLM 2025!
How do LLMs engage in pragmatic reasoning, and what core pragmatic capacities remain beyond their reach?
sites.google.com/berkeley.edu/praglm/
Submit by June 23rd
28.05.2025 18:21
Congrats Lucy!!
10.05.2025 20:11
Wisconsin-Madison's tree-filled campus, next to a big shiny lake
A computer render of the interior of the new computer science, information science, and statistics building. A staircase crosses an open atrium with visibility across multiple floors
I'm joining Wisconsin CS as an assistant professor in fall 2026!! There, I'll continue working on language models, computational social science, & responsible AI. Apply to be my PhD student!
Before then, I'll postdoc for a year in the NLP group at another UW in the Pacific Northwest
05.05.2025 19:54
Inaugurating new acct to share work from my PhD student!
Wayne et al. have been running a live eval platform, Copilot Arena: a VSCode extension serving code completions from AI systems to real developers. See the thread for findings and preprint
Excited to be evaluating human-AI *workflows* holistically!
05.03.2025 17:01
What if AI agents did software engineering like humans, seeing the screen & using any developer tool?
Introducing Programming with Pixels: an SWE environment where agents control VSCode via screen perception, typing & clicking to tackle diverse tasks.
programmingwithpixels.com
26.02.2025 17:17
Interested in knowing more about LLM agents and in contributing to this topic?
We're thrilled to announce REALM: The first Workshop for Research on Agent Language Models at #ACL2025NLP in Vienna
We have an exciting lineup of speakers
Submit your work by *March 1st*
@aclmeeting.bsky.social
23.01.2025 14:29
Congrats Mohit!!
15.01.2025 17:07
Thrilled to announce our new work TestGenEval, a benchmark that measures unit test generation and test completion capabilities. This work was done in collaboration with the FAIR CodeGen team.
Preprint: arxiv.org/abs/2410.00752
Leaderboard: testgeneval.github.io/leaderboard....
19.12.2024 20:59
So sorry to hear this, what a loss - such a kind and fun guy and his work is so creative.
02.01.2025 23:53
Announcement #1: our call for papers is up!
colmweb.org/cfp.html
And excited to announce the COLM 2025 program chairs @yoavartzi.com @eunsol.bsky.social @ranjaykrishna.bsky.social and @adtraghunathan.bsky.social
17.12.2024 15:48
IVADO is a research, training, and knowledge mobilization consortium whose mission is to build and promote robust, reasoning, and responsible AI.
dynomight.net
space invasion
Social science and other distractions. Old posts get deleted pretty quick.
https://kieranhealy.org/
https://theordinalsociety.com
asst prof @ NTU, ex principal scientist @ autodesk, phd mit 2019.
I make programming more communicative
Assistant Prof. at Georgia Tech | NVIDIA AI | Making robots smarter
Ph.D. student at University of Washington CSE. NLP. IBM Ph.D. fellow (2022-2023). Meta student researcher (2023-).
CS Ph.D. at CMU. Building Copilot Arena. Editor at http://blog.ml.cmu.edu
Research in generative AI for **human** creativity in music + more.
Assistant professor at CMU CSD, leading the G-CLef lab. Part-time research scientist at Google DeepMind on the Magenta team (views my own)
The world's leading venue for collaborative research in theoretical computer science. Follow us at http://YouTube.com/SimonsInstitute.
PhD Student @ltiatcmu.bsky.social. Working on reasoning, code-gen agents and test-time compute.
Final year PhD Student in Computer Science @Stanford
Work on:
- Compositionality, syntax (language structure)
- Web Agents: Synthetic data, tree search, exploration (language interpretation)
a mediocre combination of a mediocre AI scientist, a mediocre physicist, a mediocre chemist, a mediocre manager and a mediocre professor.
see more at https://kyunghyuncho.me/
Research Scientist at Ai2, PhD in NLP, UofA. Ex GoogleDeepMind, MSFTResearch, MilaQuebec
https://nouhadziri.github.io/
AI safety at Anthropic, on leave from a faculty job at NYU.
Views not employers'.
I think you should join Giving What We Can.
cims.nyu.edu/~sbowman
ai research @ thinking machines. realtime video+voice. i like trains and bikes. sometimes I climb rocks and throw pottery.
Blog: https://argmin.substack.com/
Webpage: https://people.eecs.berkeley.edu/~brecht/
asst prof @Stanford linguistics | director of social interaction lab | bluskies about computational cognitive science & language
Full professor at University of A Coruña. Member of LyS group, CITIC. Researcher in natural language processing/computational linguistics. http://www.grupolys.org/~cgomezr
Cognitive scientist at Stanford. Open science advocate. Symbolic Systems Program director. Bluegrass picker, slow runner, dad. http://langcog.stanford.edu