Max Little

@maxal.bsky.social

Academic mathematician/computer scientist, University of Birmingham, UK. AI and machine learning, causal inference, signal processing, applied mathematics, computational statistics. Ex Oxford PhD, MIT postdoc fellow.

115 Followers  |  325 Following  |  16 Posts  |  Joined: 05.09.2024  |  2.026

Latest posts by maxal.bsky.social on Bluesky

Predictions Scorecard, 2025 January 01 – Rodney Brooks

Every Jan 1 I post a scorecard on predictions I made, with dates, on Jan 1, 2018 on cars (self-driving), robots, AI, & ML, and on human spaceflight. Besides telling which turned out right and which wrong in the last year I also talk a lot of smack about these topics. rodneybrooks.com/predictions-...

01.01.2025 07:37 - 👍 110    🔁 45    💬 8    📌 12

Arithmetic Without Algorithms: Language Models Solve Math With a Bag of Heuristics
Do large language models (LLMs) solve reasoning tasks by learning robust generalizable algorithms, or do they memorize training data? To investigate this question, we use arithmetic reasoning as a rep...

LLMs can also be seen as big bags of heuristics: arxiv.org/abs/2410.21272

21.12.2024 17:46 - 👍 8    🔁 1    💬 0    📌 0

U.S. science funding agencies roll out policies on free access to journal articles
NIH and DOE are first to act, with implementation by all set to begin by end of 2025

www.science.org/content/arti...

21.12.2024 11:56 - 👍 13    🔁 8    💬 1    📌 1

๐—ข๐Ÿฏ ๐˜„๐—ฎ๐˜€ ๐˜๐—ฟ๐—ฎ๐—ถ๐—ป๐—ฒ๐—ฑ ๐—ผ๐—ป ๐Ÿณ๐Ÿฑ% ๐—ผ๐—ณ ๐˜๐—ต๐—ฒ ๐—ฝ๐˜‚๐—ฏ๐—น๐—ถ๐—ฐ ๐˜€๐—ฒ๐˜ ๐—ณ๐—ผ๐—ฟ ๐—”๐—ฅ๐—–-๐—”๐—š๐—œ.

OpenAI did not disclose this in the video. Sam said they didn’t target the test.

Never trust a staged demo.
Never trust a product you haven’t tried.
Never trust OpenAI.

21.12.2024 22:06 - 👍 395    🔁 56    💬 26    📌 15

o3, AGI, the art of the demo, and what you can expect in 2025
OpenAI’s new model was revealed yesterday; its most fervent believers think AGI has already arrived. Here’s what you should pay attention to in the coming year.

o3, AGI, and the art of the demo. Long read on what OpenAI didn’t tell you yesterday. garymarcus.substack.com/p/o3-agi-the...

21.12.2024 15:31 - 👍 64    🔁 12    💬 9    📌 3

Likewise, a simple adversarial strategy beats "superhuman" Go-playing algorithms: goattack.far.ai It's wise to remember that there is no scientific consensus on what "intelligence" actually is.

21.12.2024 18:31 - 👍 5    🔁 1    💬 0    📌 0

Just for those who don't know: the vast majority of open problems in maths are not numerical in nature.

21.12.2024 11:55 - 👍 1    🔁 0    💬 0    📌 0

The questions have numerical answers, so it is easy to check whether it gets them right.

21.12.2024 09:17 - 👍 1    🔁 1    💬 1    📌 1

How many times do we have to see this same movie, where an AI beats some benchmark and influencers gleefully shout “It’s So Over” without even trying out the AI and then on careful inspection the AI turns out to not be robust or reliable?

Thousands?

(It’s already been hundreds.)

21.12.2024 00:59 - 👍 75    🔁 9    💬 7    📌 1

It seems that OpenAI's latest model, o3, can solve 25% of problems on a database called FrontierMath, created by EpochAI, where previous LLMs could only solve 2%. On Twitter I am quoted as saying, "Getting even one question right would be well beyond what we can do now, let alone saturating them."

20.12.2024 23:15 - 👍 87    🔁 8    💬 8    📌 1

Scholars Are Supposed to Say When They Use AI. Do They?
Journals have policies about disclosing ChatGPT writing, but enforcing them is another matter, according to a new study.

It's widely agreed that scholars are supposed to say when they use ChatGPT. Yet phrases like "I am an AI language model", with no disclosure, are popping up in papers.

I wrote about how journals seemingly aren't enforcing their AI policies, according to a new study: www.chronicle.com/article/scho...

18.12.2024 21:02 - 👍 52    🔁 22    💬 1    📌 5

Is AI progress slowing down?
Making sense of recent technology trends and claims

This seems like a pretty balanced commentary. They certainly get this right: "connection between capability improvements & AI’s social or economic impacts is extremely weak. The bottlenecks for impact are the pace of product development and the rate of adoption" www.aisnakeoil.com/p/is-ai-prog...

18.12.2024 20:46 - 👍 18    🔁 4    💬 1    📌 1

Good reporting here, but sadly, these tragedies were predictable. Those of us who actually work on machine learning know that deep-learning based computer vision simply isn't reliable enough for safety-critical applications such as self-driving cars. @garymarcus.bsky.social @filippie509.bsky.social

17.12.2024 16:09 - 👍 8    🔁 1    💬 0    📌 0

When does generative AI qualify for fair use?

The late Suchir Balaji’s blog post on AI, copyright and fair use, reposted in his memory.

suchir.net/fair_use.html

14.12.2024 06:07 - 👍 125    🔁 37    💬 4    📌 4

The bootstrap can be used to generate a new random sample from an existing random sample. Its validity can be guaranteed by the Glivenko-Cantelli theorem, which demonstrates how the empirical cumulative distribution function (CDF, top panel) converges on the CDF of the distribution from which the sample was drawn (bottom panel).

14.12.2024 12:08 - 👍 0    🔁 1    💬 0    📌 0
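
As a rough illustration of the bootstrap post above: the Glivenko-Cantelli theorem says the empirical CDF of an i.i.d. sample converges uniformly to the CDF of the distribution it was drawn from, which is what makes resampling from the sample a reasonable stand-in for resampling from that distribution. The sketch below is not taken from the post; the standard normal distribution, the seed, and all function names are assumptions chosen for illustration.

```python
import bisect
import math
import random


def ecdf(sample):
    """Return the empirical CDF F_n of `sample` as a function of x."""
    xs = sorted(sample)
    n = len(xs)
    return lambda x: bisect.bisect_right(xs, x) / n


def bootstrap_resample(sample, rng):
    """Draw len(sample) points from `sample` uniformly, with replacement."""
    return [rng.choice(sample) for _ in sample]


def normal_cdf(x):
    """CDF of the standard normal (the 'true' distribution assumed here)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def sup_distance(sample, cdf):
    """sup_x |F_n(x) - cdf(x)|, which is attained at or just before a jump of F_n."""
    xs = sorted(sample)
    n = len(xs)
    return max(
        max(abs((i + 1) / n - cdf(x)), abs(i / n - cdf(x)))
        for i, x in enumerate(xs)
    )


rng = random.Random(0)  # fixed seed, chosen arbitrarily
for n in (100, 1_000, 10_000):
    sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
    resample = bootstrap_resample(sample, rng)
    # Both distances should tend to shrink as n grows.
    print(n, sup_distance(sample, normal_cdf), sup_distance(resample, normal_cdf))
```

The printed distances should generally get smaller as n increases, for both the original sample and its bootstrap resample, though any single run is only suggestive.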

For an increasing function f: ℝ → ℝ, max(f(a), f(b)) = f(max(a, b)). An important special case is f(x) = x + c, for which we obtain max(a + c, b + c) = c + max(a, b).

14.12.2024 00:23 - 👍 0    🔁 1    💬 0    📌 0
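
The identity in the post above follows in one step: if a ≤ b and f is increasing, then f(a) ≤ f(b), so both sides equal f(b). A small check on integers might look like the sketch below; the helper name `commutes_with_max`, the constant c = 3, and the test range are assumptions made for illustration.

```python
import itertools


def commutes_with_max(f, values):
    """True if max(f(a), f(b)) == f(max(a, b)) for every pair in `values`."""
    return all(
        max(f(a), f(b)) == f(max(a, b))
        for a, b in itertools.product(values, repeat=2)
    )


values = range(-5, 6)
c = 3  # arbitrary constant for the shift f(x) = x + c

print(commutes_with_max(lambda x: x + c, values))   # True: the special case in the post
print(commutes_with_max(lambda x: x ** 3, values))  # True: another increasing function
print(commutes_with_max(lambda x: -x, values))      # False: a decreasing function breaks it
```

The last line shows why monotonicity matters: for f(x) = -x, max(f(1), f(2)) = -1 while f(max(1, 2)) = -2.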

I believe GM came to exactly this realization and decided (likely very wisely, in my opinion) not to throw more good money after bad.

14.12.2024 01:09 - 👍 0    🔁 0    💬 0    📌 0

Since 2016 Waymo raised ~$25B, so they burn ~$3B/y, or a little over $8M/day. With ~700 cars, assuming they operate each car every day, it costs them over $11k to operate each of their cars per day. $11k PER DAY per CAR. If you don't find this ridiculous IDK what else to say.

14.12.2024 00:04 - 👍 8    🔁 3    💬 2    📌 0
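
The back-of-envelope arithmetic in the post above can be reproduced as follows. The inputs are the post's own rough figures (about $25B raised, about 700 cars), not verified data. The 8-year window and the function name `burn_per_car_per_day` are assumptions introduced here.

```python
def burn_per_car_per_day(raised_usd, years, cars, days_per_year=365):
    """Return (annual burn, daily burn, daily burn per car) in USD."""
    per_year = raised_usd / years
    per_day = per_year / days_per_year
    per_car_per_day = per_day / cars
    return per_year, per_day, per_car_per_day


# Post's figures: ~$25B raised since 2016, ~700 cars.
# years=8 (2016 to 2024) is an assumption; it gives roughly the post's ~$3B/y.
per_year, per_day, per_car = burn_per_car_per_day(25e9, years=8, cars=700)
print(f"~${per_year / 1e9:.1f}B per year")
print(f"~${per_day / 1e6:.1f}M per day")
print(f"~${per_car / 1e3:.1f}k per car per day")
```

With these inputs the result lands slightly above the post's rounded numbers, since $25B over 8 years is a little more than $3B per year. It also treats all money raised as already spent on operations, as the post does.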

Suchir Balaji was a good young man. I spoke to him six weeks ago. He had left OpenAI and wanted to make the world a better place. This is tragic.

14.12.2024 00:19 - 👍 162    🔁 46    💬 8    📌 4

Very proud of the Birmingham HDRUK PhDs!

13.12.2024 23:16 - 👍 0    🔁 0    💬 0    📌 0

Health Data Research UK PhD meet! Work from Ant Lee and Jianqiao Mao (latter with @maxal.bsky.social)

13.12.2024 11:26 - 👍 2    🔁 1    💬 0    📌 1

Apple "Intelligence". @garymarcus.bsky.social

13.12.2024 22:54 - 👍 2    🔁 0    💬 0    📌 0

And, not usually mentioned is just how many "non-driver" human roles Waymo are heavily relying upon, e.g. teleoperation, stuck vehicle retrieval, repairs, maintenance, cleaning, passenger support etc. @rodneyabrooks.bsky.social

10.12.2024 23:09 - 👍 0    🔁 0    💬 0    📌 0

As first predicted some 10 years ago, that is how "self driving cars" will end: as glorified driver assistance features. The graveyard of autonomous vehicle efforts is pretty crowded already, with pretty much only Waymo remaining, until life support from the Google mothership ends.

10.12.2024 21:39 - 👍 9    🔁 2    💬 1    📌 0

What if all the hype just didn’t turn out to be true?

Evidence of productivity gains is mixed - yet hypey takes continue to dominate in the media.

09.12.2024 19:53 - 👍 43    🔁 9    💬 7    📌 0

Don’t Ride This Bike! Generative AI’s persistent trouble with compositionality and parts
When the text-to-image AI generation system DALL-E2 was released in April 2022, the two of us, together with Scott Aaronson, ran some informal experiments to probe its abilities.

Don’t Ride This Bike! Generative AI’s persistent trouble with compositionality and parts, by Gary Marcus @garymarcus.bsky.social and Ernest Davis / Marcus on AI - Substack garymarcus.substack.com/p/dont-ride-...

08.12.2024 23:57 - 👍 11    🔁 3    💬 0    📌 0

Most of these sorts of algorithms are just AI snake oil: they don't work because there is no way to quantify these sorts of 'social variables'. They are never actually tested to any level of scientific rigour.

06.12.2024 18:38 - 👍 3    🔁 0    💬 0    📌 0

Not quite: AI got people excited about interpolation, it seems. Numerical analysts suddenly feel seen.

02.12.2024 07:07 - 👍 3    🔁 0    💬 0    📌 0

@garymarcus.bsky.social

29.11.2024 12:49 - 👍 0    🔁 0    💬 0    📌 0

Fully-funded PhD position available. If you are interested in machine learning for signal processing of biosignals, please do get in touch.

27.11.2024 13:32 - 👍 1    🔁 0    💬 0    📌 0
