Sara Fish @sarafish - Bluesky Profile

Latest posts by sarafish.bsky.social on Bluesky

Scalable oversight / debate, to an extent

13.05.2025 16:02 — 👍 0 🔁 0 💬 0 📌 0

More details, including statistical significance, in the paper.

joint w/ Julia Shephard, Minkai Li, @yannaigonch.bsky.social , @ranshorrer.bsky.social

Paper: arxiv.org/abs/2503.18825
Code: github.com/sara-fish/ec... 6/6

04.04.2025 15:47 — 👍 1 🔁 0 💬 0 📌 0

In addition to the EconEvals benchmarks, in the EconEvals “litmus tests”, we quantify tendencies of LLMs and LLM agents when faced with tradeoffs for which there is no objectively correct choice: for example efficiency vs. equality. 5/6

04.04.2025 15:47 — 👍 1 🔁 0 💬 1 📌 0

(And a score of 70% on each of our benchmarks has a specific economic meaning. For example, 70% at pricing corresponds to capturing 70% of total possible profits. Very different from 70% accuracy at a closed-ended Q&A benchmark!) 4/6

04.04.2025 15:47 — 👍 1 🔁 0 💬 1 📌 0

To forestall saturation, we can scale the difficulty of our benchmark questions by scaling parameters of the economic environment. Our HARD difficulty level is challenging: no LLM we test, including o3-mini, scores above 70%. (Low scores of o3-mini possibly driven by underexploration.) 3/6

04.04.2025 15:47 — 👍 1 🔁 0 💬 1 📌 0

In EconEvals benchmarks, LLM agents repeatedly take actions in an economic environment, and must learn optimal actions via trial and error (a capability SoTA LLMs struggle with!) 2/6

04.04.2025 15:47 — 👍 1 🔁 0 💬 1 📌 0

Screenshot of first page of "EconEvals: Benchmarks and Litmus Tests for LLM Agents in Unknown Environments"

New paper: "EconEvals: Benchmarks and Litmus Tests for LLM Agents in Unknown Environments"

We construct economic environments to measure the capabilities and tendencies of LLMs and LLM agents in pricing, procurement, task allocation and more. 1/6

04.04.2025 15:47 — 👍 1 🔁 0 💬 1 📌 0

This is part of a super interesting line of work. For this paper I got to help as a member of an AI-led team, the most fun treatment 🙂 (I may be biased). Current LLMs often "get stuck" on complex tasks like reproducing a paper, but we can expect some of these limitations to go away with time.

22.01.2025 20:59 — 👍 4 🔁 1 💬 0 📌 0

🚨Major new version🚨
Algorithmic Collusion by Large Language Models
Joint w/ @sarafish.bsky.social & @ranshorrer.bsky.social

LLMs are automating many business decisions. Pricing might be next (or is already).
What if multiple firms, in good faith, to use off-the-shelf-LLMs for pricing? 1/3
#EconSky

28.11.2024 02:14 — 👍 74 🔁 18 💬 5 📌 3

@sarafish is following 20 prominent accounts

Institute for Replication
@i4replication

The Institute for Replication (I4R) works to improve the credibility of science by promoting and conducting reproductions and replications. i4replication.org

Rohit Lamba
@rohitlamba

Economist, Himalayan boy, poetry enthusiast, foodie. Amor fati :) www.rohitlamba.com

Stanislav Fort
@stanislavfort

AI + security | Stanford PhD in AI & Cambridge physics | techno-optimism + alignment + progress + growth | 🇺🇸🇨🇿

Danilo J. Rezende
@danilojrezende

VP of AI Research, Principal Scientist @ EIT Oxford | ex-Director @ DeepMind Building models to accelerate fundamental sciences and medicine. Opinions my own. https://danilorezende.com/

Tom Everitt
@tom4everitt

AGI safety researcher at Google DeepMind, leading causalincentives.com Personal website: tomeveritt.se

Michael Tschannen
@mtschannen

Research Scientist @GoogleDeepMind. Representation learning for multimodal understanding and generation. mitscha.github.io

Arthur Conmy
@arthurconmy

Aspiring 10x reverse engineer at Google DeepMind

Jon Barron
@jonbarron

AI researcher at Google DeepMind. Synthesized views are my own. 📍SF Bay Area 🔗 http://jonbarron.info This feed is a partial mirror of https://twitter.com/jon_barron

Mike Mozer
@mcmozer

Research Scientist, Google DeepMind

Eric Todd @ COLM 2025
@ericwtodd

CS PhD Student, Northeastern University - Machine Learning, Interpretability https://ericwtodd.github.io

David Bau
@davidbau

Interpretable Deep Networks. http://baulab.info/ @davidbau

Vaishnavh Nagarajan
@vaishnavh

Foundations of AI. I like simple and minimal examples and creative ideas. I also like thinking about the next token 🧮🧸 Google Research | PhD, CMU | https://arxiv.org/abs/2504.15266 | https://arxiv.org/abs/2403.06963 vaishnavh.github.io

Martin Wattenberg
@wattenberg

Human/AI interaction. ML interpretability. Visualization as design, science, art. Professor at Harvard, and part-time at Google DeepMind.

Andrew Lee
@ajyl

Post-doc @ Harvard. PhD UMich. Spent time at FAIR and MSR. ML/NLP/Interpretability

Tim Kostolansky
@thkostolansky

modeling modeling prev @MIT tim0120.github.io

Eric J. Michaud
@ericjmichaud

PhD student at MIT. Trying to make deep neural networks among the best understood objects in the universe. 💻🤖🧠👽🔭🚀 ericjmichaud.com

Caden
@cadentj

Clément Dumas
@butanium

Master student at ENS Paris-Saclay / aspiring AI safety researcher / improviser Prev research intern @ EPFL w/ wendlerc.bsky.social and Robert West MATS Winter 7.0 Scholar w/ neelnanda.bsky.social https://butanium.github.io

Hidenori Tanaka
@hidenori8tanaka

Group Leader, CBS-NTT "Physics of Intelligence" Program at Harvard website: https://sites.google.com/view/htanaka/home

Ben Edelman
@benedelman

Thinking about how/why AI works/doesn't, and how to make it go well for us. Currently: AI Agent Security @ US AI Safety Institute benjaminedelman.com