Sebastian Farquhar @sebfar - Bluesky Profile

By default, LLM agents with long action sequences use early steps to undermine your evaluation of later steps; a big alignment risk.

Our new paper mitigates this, keeps the ability for long-term planning, and doesn't assume you can detect the undermining strategy. 👇

23.01.2025 15:47 — 👍 13 🔁 1 💬 0 📌 0

Reducing unnecessary action *does* drive growth. We are all more productive when we achieve the same things with fewer inputs; wasting citizens' time makes the whole country less productive. Create slack in people's lives and watch what they create with it!

21.01.2025 14:25 — 👍 1 🔁 0 💬 1 📌 0

Interesting analogy, because of course the Dreadnoughts were mostly militarily useless and were obsoleted by changing strategic considerations before they were ever deployed.

17.12.2024 09:24 — 👍 4 🔁 0 💬 0 📌 0

I desperately want to know what experience made you try out this prompt. Who hurt you?

10.12.2024 17:48 — 👍 2 🔁 0 💬 1 📌 0

Interesting. I guess I'm surprised that oil prices would have such a big effect on total fossil fuel CO2 emissions (presumably mostly coal over the period?). But maybe substitutability links them enough.

04.12.2024 20:29 — 👍 1 🔁 0 💬 1 📌 0

Actually just zoomed in on the data viewer. It does look like 1973 is the break point. Still curious about why the effect was so persistent.

04.12.2024 12:46 — 👍 1 🔁 0 💬 1 📌 0

Why did land use emissions shrink lots between 196-70 and then stop shrinking?

Why did the oil price shock lead to sustained flat per capita fossil fuel emissions? It was short. Also it started after the trend breaks.

04.12.2024 12:44 — 👍 0 🔁 0 💬 1 📌 0

I'm surprised that the per capita global emissions look like they are trending pretty flat from 1950ish, much earlier than I would have guessed. Presumably many people greatly increased their energy consumption after then? Do you know what is driving this?

03.12.2024 20:39 — 👍 1 🔁 0 💬 1 📌 0

Updated! Keep 'em coming.

26.11.2024 09:07 — 👍 2 🔁 0 💬 0 📌 0

@maosbot.bsky.social what do you think, do you belong on this list? I think most of your research isn't quite in this area but not sure how you self-identify on research focus at the moment.

25.11.2024 18:27 — 👍 0 🔁 0 💬 0 📌 0

Weak signal perhaps, but you are one of two accounts on Twitter that I genuinely miss here. If you did make the leap that would be lovely :D

25.11.2024 14:38 — 👍 2 🔁 0 💬 0 📌 0

Help me grow this starter pack for technical researchers working on AGI safety! go.bsky.app/D6P44sC Some flex, but aiming for mostly technical research rather than governance/strategy. Who am I missing?

25.11.2024 14:04 — 👍 28 🔁 9 💬 15 📌 1

Agreed. I basically don't believe the result at all. Seems like the memetic strength is it lets you feel well informed.

24.11.2024 00:57 — 👍 2 🔁 0 💬 0 📌 0

You too! Just DMed you :D

22.11.2024 18:21 — 👍 1 🔁 0 💬 0 📌 0

Strongly agree. On a cold winter day they are basically a pure comfort upgrade. Also great for hayfever.

22.11.2024 18:12 — 👍 2 🔁 0 💬 1 📌 0

The fact that every field that has tried to have a reproducibility crisis has been able to have one suggests that the way journals have done it for decades underinvests in finding critical flaws in papers, and that retractions are too rare and too late to depend on.

21.11.2024 10:36 — 👍 0 🔁 0 💬 0 📌 0

I've seen at least a couple cases where a very high effort public review identified a significant flaw that the reviewers had missed. Losing that would be a real cost.

20.11.2024 20:53 — 👍 0 🔁 0 💬 1 📌 0

Jakob N. Foerster - How To ML Paper

Little hat-tip to www.jakobfoerster.com/how-to-ml-pa... from Jakob Foerster and jsteinhardt.stat.berkeley.edu/blog/advice-... from Jacob Steinhardt who have excellent advice as well.

18.11.2024 20:09 — 👍 1 🔁 0 💬 0 📌 0

How to Write ML Papers This doc is aimed at students learning to write ML papers as well as more experienced writers. It isn’t about how to do the research itself, but about how to present it in a way that makes it impactfu...

Starting to prepare yourself to submit to ICML? Here are my tips on how to write well for an ML research audience. sebastianfarquhar.com/on-research/...

18.11.2024 20:06 — 👍 15 🔁 2 💬 2 📌 0

timhunkin/illegal engineering

Entertaining essay about how the decline in practical engineering education has been devastating for *checks notes* professional criminal safe crackers. (Ok, mostly just a fun history of safe cracking.) www.timhunkin.com/94_illegal_e...

15.11.2024 17:14 — 👍 1 🔁 0 💬 0 📌 0

And for readers! Twitter has been getting gradually more boring. Turns out this whole hyperlink thing is a big deal for the internet.

13.11.2024 16:02 — 👍 2 🔁 0 💬 0 📌 0

Something I loved most about the internet in the 2000s was the idiosyncratic personal webpages that some people had put a crazy amount of time and effort into.

These pages must still exist, right? What are the best ones you know of?

13.11.2024 10:40 — 👍 6 🔁 1 💬 1 📌 0
