Daniel Fried @daniel-fried - Bluesky Profile

I'm excited about Andy's work -- generating scenarios that force LLMs to choose between conflicting values, allowing us to see which values they prioritize. Might be used for training in the future! We also show the importance of open-ended (vs multiple choice) evaluation.

03.10.2025 16:13 — 👍 3 🔁 0 💬 0 📌 0

The IVADO #Bootcamp marked the launch of the Thematic Semester on Autonomous #LLM Agents last week at the MIL Campus of @umontreal.ca. Over 4 days, researchers, experts, and #AI enthusiasts gathered for conferences, tutorials, and rich discussions, laying the groundwork for our next two workshops.

19.08.2025 14:37 — 👍 1 🔁 2 💬 1 📌 1

RepoST: Scalable Repository-Level Coding Environment Construction with Sandbox Testing
We present RepoST, a scalable method to construct environments that provide execution feedback for repository-level code generation for both training and evaluation. Unlike existing works that aim to ...

Paper: arxiv.org/abs/2503.07358
Code: github.com/yiqingxyq/Re...

Work led by Yiqing Xie, with Alex Xie, Divyanshu Sheth, Pengfei Liu, @daniel-fried.bsky.social and @carolynrose.bsky.social

16.07.2025 18:33 — 👍 1 🔁 1 💬 0 📌 0

2) RepoST.
We automatically create executable environments from real GitHub repos, allowing us to train and evaluate models for function generation in real-world contexts.

Presenting at the CODEML workshop on Fri Jul 18th.
Also accepted to COLM, upcoming!

16.07.2025 18:33 — 👍 0 🔁 0 💬 1 📌 0

Agent Workflow Memory
Despite the potential of language model-based agents to solve real-world tasks such as web navigation, current methods still struggle with long-horizon tasks with complex action trajectories. In contr...

Paper: arxiv.org/abs/2409.07429
Code: github.com/zorazrw/agen...

Work led by @zorazrw.bsky.social, with Jiayuan Mao and @gneubig.bsky.social

16.07.2025 18:33 — 👍 0 🔁 0 💬 1 📌 0

1) Agent Workflow Memory.
Allow agents to adapt online to carry out new tasks more accurately by inducing workflows for common sub-tasks.

Today (Wed 7/16), 4:30-7pm, West Exhibition Hall B2-B3, W-202.
Also at the CUA workshop, morning of Sat 7/19.

16.07.2025 18:33 — 👍 0 🔁 0 💬 1 📌 0

Excited to be presenting two of our papers at #ICML2025 and workshops, today through Saturday! Topics are memory for agents, and constructing coding environments for training & evaluation. See links below:

16.07.2025 18:30 — 👍 1 🔁 0 💬 1 📌 0

PragLM @ COLM '25 IMPORTANT DATES

Happy to announce the first workshop on Pragmatic Reasoning in Language Models — PragLM @ COLM 2025! 🎉
How do LLMs engage in pragmatic reasoning, and what core pragmatic capacities remain beyond their reach?
🌐 sites.google.com/berkeley.edu/praglm/
📅 Submit by June 23rd

28.05.2025 18:21 — 👍 41 🔁 18 💬 1 📌 4

Congrats Lucy!!

10.05.2025 20:11 — 👍 4 🔁 0 💬 0 📌 0

Wisconsin-Madison's tree-filled campus, next to a big shiny lake

A computer render of the interior of the new computer science, information science, and statistics building. A staircase crosses an open atrium with visibility across multiple floors

I'm joining Wisconsin CS as an assistant professor in fall 2026!! There, I'll continue working on language models, computational social science, & responsible AI. 🌲🧀🚣🏻‍♀️ Apply to be my PhD student!

Before then, I'll postdoc for a year in the NLP group at another UW 🏔️ in the Pacific Northwest

05.05.2025 19:54 — 👍 145 🔁 14 💬 16 📌 3

Inaugurating new acct to share work from my PhD student!

Wayne et al. have been running a live eval platform, Copilot Arena - a VSCode extension serving code completions from AI systems to real developers. See 🧵 for findings and preprint

Excited to be evaluating human-AI *workflows* holistically!

05.03.2025 17:01 — 👍 10 🔁 3 💬 0 📌 0

What if AI agents did software engineering like humans—seeing the screen & using any developer tool?

Introducing Programming with Pixels: an SWE environment where agents control VSCode via screen perception, typing & clicking to tackle diverse tasks.

programmingwithpixels.com

🧵

26.02.2025 17:17 — 👍 8 🔁 4 💬 1 📌 1

Interested in knowing more about LLM agents and in contributing to this topic?🚀

📢We're thrilled to announce REALM: The first Workshop for Research on Agent Language Models 🤖 #ACL2025NLP in Vienna 🎻
We have an exciting lineup of speakers

🗓️ Submit your work by *March 1st*
@aclmeeting.bsky.social

23.01.2025 14:29 — 👍 13 🔁 4 💬 1 📌 1

Congrats Mohit!!

15.01.2025 17:07 — 👍 6 🔁 0 💬 1 📌 0

Thrilled to announce our new work TestGenEval, a benchmark that measures unit test generation and test completion capabilities. This work was done in collaboration with the FAIR CodeGen team.

Preprint: arxiv.org/abs/2410.00752
Leaderboard: testgeneval.github.io/leaderboard....

19.12.2024 20:59 — 👍 17 🔁 7 💬 1 📌 1

CMU LTI Language Technology for All Internship 2025 - Language Technologies Institute - School of Computer Science - Carnegie Mellon University
The LTI is currently seeking applicants for the summer 2025 Language Technology for All Internship

CMU LTI is hosting predoc interns this summer, centered around "Language Technologies for All"! Please apply and circulate! lti.cs.cmu.edu/news-and-eve...

07.01.2025 22:42 — 👍 19 🔁 8 💬 1 📌 0

Natural Language to Code Translation with Execution
Generative models of code, pretrained on large corpora of programs, have shown great success in translating natural language to code (Chen et al., 2021; Austin et al., 2021; Li et al., 2022, inter ali...

You can execute each generated function on a set of possible inputs to the function, group the functions according to the outputs, then choose the largest group: arxiv.org/abs/2204.11454 and Sec 4.6 of arxiv.org/abs/2203.07814, although I'm not sure what was done in these plots

06.01.2025 04:07 — 👍 7 🔁 0 💬 0 📌 0
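
The selection idea in the reply above (run every generated function on the same inputs, group functions by their outputs, keep the largest group) can be sketched roughly as follows. This is a minimal illustration of the general idea only, not the exact procedure from either linked paper; the names `select_by_execution` and `_safe_run` are hypothetical, and treating an exception as just another kind of output is an assumption made here for simplicity.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List, Sequence, Tuple


def _safe_run(fn: Callable[[Any], Any], x: Any) -> Tuple[str, str]:
    """Run one candidate on one input; record either its result or its error type."""
    try:
        return ("ok", repr(fn(x)))
    except Exception as exc:  # assumption: a crash counts as an observable output
        return ("error", type(exc).__name__)


def select_by_execution(
    candidates: Sequence[Callable[[Any], Any]],
    inputs: Sequence[Any],
) -> Tuple[Callable[[Any], Any], int]:
    """Group candidates by their outputs on `inputs` and return a member of the
    largest group, together with that group's size."""
    groups: Dict[Tuple[Tuple[str, str], ...], List[Callable[[Any], Any]]] = defaultdict(list)
    for fn in candidates:
        # The tuple of outputs over all inputs acts as the function's behavioral signature.
        signature = tuple(_safe_run(fn, x) for x in inputs)
        groups[signature].append(fn)
    # Ties are broken in favor of the group whose first member appeared earliest.
    best = max(groups.values(), key=len)
    return best[0], len(best)
```

For example, given three candidates that double their input and one that squares it, the doubling group wins with size 3, and any member of that group can be returned as the final answer.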

So sorry to hear this, what a loss - such a kind and fun guy and his work is so creative.

02.01.2025 23:53 — 👍 1 🔁 0 💬 0 📌 0

Announcement #1: our call for papers is up! 🎉
colmweb.org/cfp.html
And excited to announce the COLM 2025 program chairs @yoavartzi.com @eunsol.bsky.social @ranjaykrishna.bsky.social and @adtraghunathan.bsky.social

17.12.2024 15:48 — 👍 66 🔁 24 💬 0 📌 1
