
Matt Beane

@mattbeane.bsky.social

Studying work involving intelligent machines, especially robots. @MITSloan PhD, @Ucsb Asst Prof, @Stanford and @MIT Digital Fellow, @Tedtalks @Thinkers50

496 Followers  |  131 Following  |  68 Posts  |  Joined: 22.06.2023

Latest posts by mattbeane.bsky.social on Bluesky

Computer-mediated carcinisation

20.05.2025 01:19 | 👍 1  🔁 0  💬 1  📌 0

This includes many of my papers, too. The point I am making is that the findings in careful academic research likely represent a lower bound on AI capabilities at this point.

15.05.2025 22:16 | 👍 51  🔁 4  💬 3  📌 1

I can't

i just ...

i can't

www.404media.co/anthropic-cl...

04.02.2025 13:30 | 👍 993  🔁 308  💬 33  📌 91

I bet if someone *has* succeeded, it's via spinning up an elicitation-GPT that just drilled you for critical intel, wouldn't let you weasel out via under/overspecified output, then dumped it all back to you in standardized format so you could think faster - basically exporting your extraction algo.

30.01.2025 20:34 | 👍 1  🔁 0  💬 0  📌 0
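A minimal sketch of what such an elicitation loop might look like, assuming the OpenAI Python client; the model name, prompt wording, and "done" stopping rule are illustrative guesses, not anyone's actual setup:

    # Hypothetical "elicitation-GPT": drills for specifics, pushes back on
    # vague answers, then returns everything in a standardized format.
    # Requires OPENAI_API_KEY in the environment (pip install openai).
    from openai import OpenAI

    client = OpenAI()

    SYSTEM_PROMPT = (
        "You are an elicitation interviewer. Ask one pointed question at a "
        "time. If my answer is vague, underspecified, or overspecified, push "
        "back and re-ask until it is concrete. When I say 'done', stop asking "
        "and return a structured summary with the headings: Goal, "
        "Constraints, Key Facts, Open Questions."
    )

    # The opening topic is a placeholder; swap in whatever you're drafting.
    messages = [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": "Let's begin. My topic: AI and skill."},
    ]

    while True:
        reply = client.chat.completions.create(model="gpt-4o", messages=messages)
        text = reply.choices[0].message.content
        print(text)
        messages.append({"role": "assistant", "content": text})

        answer = input("> ")
        messages.append({"role": "user", "content": answer})
        if answer.strip().lower() == "done":
            # One final call so the model emits the standardized summary.
            final = client.chat.completions.create(model="gpt-4o", messages=messages)
            print(final.choices[0].message.content)
            break

The push-back instruction in the system prompt is what keeps the bot from letting you weasel out via under- or overspecified answers; the fixed summary headings are the "standardized format" dump.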

Exactly. If we overheard Dario, Sam, and Demis chatting about certain well-known AI critics, I'd be willing to bet they'd be expressing gratitude. Proving a grouch wrong is a real motivator.

29.01.2025 19:05 | 👍 0  🔁 0  💬 0  📌 0

Hi Everyone!

We're hosting our Wharton AI and the Future of Work Conference on 5/21-22. Last year was a great event with some of the top papers on AI and work.

Paper submission deadline is 3/3. Come join us! Submit papers here: forms.gle/ozJ5xEaktXDE...

29.01.2025 18:46 | 👍 16  🔁 15  💬 2  📌 2

Exciting new hobby project in the offing related to AI and skill. Involves a childhood passion, a wild leap into the unknown, made real via an order from Amazon just now. Will be 100% cool, I will be documenting things, sharing eventually. Feels like April 2023 again!

15.01.2025 05:07 | 👍 2  🔁 0  💬 0  📌 0

Silo is so good. Just superb. This generation's answer to the BSG remake.

13.01.2025 01:44 | 👍 2  🔁 0  💬 0  📌 0

My hobby horse. You can simulate a rocket all you want, and use more energy on computation than the actual rocket would, but you won't get to orbit until you ignite rocket fuel. What if all the energy we are spending on simulating learning is not the juice we really need to make intelligence?

09.01.2025 08:49 | 👍 58  🔁 11  💬 8  📌 0

Here's my end-of-year review of things we learned about LLMs in 2024 - we learned a LOT of things simonwillison.net/2024/Dec/31/...

Table of contents:

    The GPT-4 barrier was comprehensively broken
    Some of those GPT-4 models run on my laptop
    LLM prices crashed, thanks to competition and increased efficiency
    Multimodal vision is common, audio and video are starting to emerge
    Voice and live camera mode are science fiction come to life
    Prompt driven app generation is a commodity already
    Universal access to the best models lasted for just a few short months
    "Agents" still haven't really happened yet
    Evals really matter
    Apple Intelligence is bad, Apple's MLX library is excellent
    The rise of inference-scaling "reasoning" models
    Was the best currently available LLM trained in China for less than $6m?
    The environmental impact got better
    The environmental impact got much, much worse
    The year of slop
    Synthetic training data works great
    LLMs somehow got even harder to use
    Knowledge is incredibly unevenly distributed
    LLMs need better criticism
    Everything tagged "llms" on my blog in 2024

31.12.2024 18:10 | 👍 653  🔁 148  💬 28  📌 47

In 2024 we learned a lot about how AI is impacting work. People report that they're saving 30 minutes a day using AI (aka.ms/nfw2024), and randomized controlled trials reveal theyโ€™re creating 10% more documents, reading 11% fewer e-mails, and spending 4% less time on e-mail (aka.ms/productivity...).

31.12.2024 19:39 | 👍 16  🔁 4  💬 1  📌 0

Independent evaluations of OpenAI's o3 suggest that it passed math & reasoning benchmarks that were previously considered far out of reach for AI, including achieving a score on ARC-AGI that was associated with actually achieving AGI (though the creators of the benchmark don't think o3 is AGI).

20.12.2024 18:26 | 👍 141  🔁 30  💬 13  📌 8

Just *one* of the reasons that Blindsight was ahead of its time. Way ahead.

20.12.2024 16:36 | 👍 1  🔁 0  💬 1  📌 0

Massive congrats!! So excited to check it out.

14.12.2024 14:42 | 👍 3  🔁 0  💬 3  📌 1

Wow!

10.12.2024 20:54 | 👍 0  🔁 0  💬 0  📌 0

Join me by the fireside this Friday with Matt Beane as we dive into one of today's biggest workforce challenges: upskilling at scale. 📈

Link below to hear the full discussion on Friday, December 13 at 11 am EST!

linktr.ee/RitaMcGrath

@mattbeane.bsky.social

09.12.2024 18:45 | 👍 4  🔁 2  💬 1  📌 0

I propose a workshop.

Most engineers/CS working on AI presume away well established, profound brakes on AI diffusion.

Most social scientists presume away how AI use could reshape those brakes.

Let's gather these groups, examine these brakes 1-by-1, make grounded predictions.

07.12.2024 19:12 | 👍 2  🔁 0  💬 0  📌 0

Models like o1 suggest that people won't generally notice AGI-ish systems that are better than humans at most intellectual tasks, but which are not autonomous or self-directed.

Most folks don't regularly have a lot of tasks that bump up against the limits of human intelligence, so won't see it.

07.12.2024 00:49 | 👍 155  🔁 26  💬 8  📌 2

Grateful for the opportunity to visit and learn from the professionals at the L&DI conference. And very glad to hear you found my talk so valuable, Garth! Means a lot.

04.12.2024 14:02 | 👍 1  🔁 1  💬 2  📌 0

I made an HRI Starter Pack!

If you are a Human-Robot Interaction or Social Robotics researcher and I missed you while scrolling through bsky's suggestions, just ping me and I'll add ya.

go.bsky.app/CsnNn3s

03.12.2024 18:37 | 👍 42  🔁 14  💬 11  📌 2
The Avatar Economy: Are remote workers the brains inside tomorrow's robots?

Wrote a little something on this in 2012, though I didn't anticipate the main reason for hiring such workers - training data.

www.technologyreview.com/2012/07/18/1...

03.12.2024 13:23 | 👍 1  🔁 0  💬 0  📌 0

Ohmydeargod.

03.12.2024 10:55 | 👍 0  🔁 0  💬 0  📌 0

David Meyer (v.) /หˆdeษชvษชd หˆmaษช.ษ™r/

To attribute complex, intentional design or deeper meaning to simple emergent behaviors of large language models, especially when such behaviors are more likely explained by straightforward technical constraints or training artifacts.

03.12.2024 10:53 | 👍 2  🔁 0  💬 0  📌 0

They did NOT. Wow. Sign of the times.

And I can verify on your rule! I was so flabbergasted and honored. Your feedback was rich and so helpful. Remain grateful.

03.12.2024 01:19 | 👍 1  🔁 0  💬 0  📌 0

I remember *treasuring* the previews. I'd fight to get there on time. Was part of the thrill.

But ads? F*ck that noise. Seriously, straight up evil.

30.11.2024 20:04 | 👍 0  🔁 0  💬 1  📌 0

Never occurred to me there'd be an algo under the hood that could reliably learn to provide content I'd value more than a straight read of my hand-curated list of people. My solution has been following people if they post high signal stuff all the time.

30.11.2024 18:12 | 👍 2  🔁 0  💬 1  📌 0

I have never used the feed page. What a horror, can't quite understand why folks would try.

Only/ever the "following" page. Even there, things got pretty intolerable around the election; it's settled down now.

30.11.2024 17:51 | 👍 1  🔁 0  💬 1  📌 0
Kurt Vonnegut, Joe Heller, and How to Think Like a Mensch This story remains my favorite Thanksgiving message; it reminds me to be grateful for what I have and of the evils of jealousy and destructive competition. I first posted it on my work matters blog mo...

My Thanksgiving post. A Kurt Vonnegut poem. He talks with Joe Heller (of Catch-22 fame) about a billionaire. Key part:

Joe said, "I've got something he can never have"

And I said, "What on earth could that be, Joe?"

And Joe said, "The knowledge that I've got enough"

www.linkedin.com/pulse/kurt-v...

27.11.2024 19:40 | 👍 12  🔁 2  💬 0  📌 1

Oh my dear god this is an incredible study.

27.11.2024 19:04 | 👍 0  🔁 0  💬 0  📌 0

I think there's likely an effect there!

25.11.2024 22:13 | 👍 0  🔁 0  💬 0  📌 0
