
Adam Santoro

@adamsantoro.bsky.social

Scientist @ Google DeepMind

1,196 Followers  |  157 Following  |  8 Posts  |  Joined: 19.11.2024

Latest posts by adamsantoro.bsky.social on Bluesky

Research Engineer, Generative AI London, UK

My team is hiring! PRISM🌈 (Planning, Reasoning, Inference & Structured Models), which I co-lead with @theophane.bsky.social, is hiring a Research Engineer to help us develop planning and reasoning capabilities in Gemini. Apply here: boards.greenhouse.io/deepmind/job...

14.11.2024 15:14 · 👍 38    🔁 9    💬 0    📌 2

It's your unique toolset that will make your music authentic. So follow your own path, and be driven by achievements that you personally find valuable, whether or not that includes technical mastery on the piano.

23.11.2024 16:20 · 👍 0    🔁 0    💬 1    📌 0
Stripe Press · The Art of Doing Science and Engineering: A groundbreaking treatise by one of the great mathematicians of our time, who argues that highly effective thinking can be learned.

You've probably read Hamming's essay "You and Your Research", but perhaps not his book, The Art of Doing Science and Engineering. The edition by Stripe Press is absolutely stunning:

press.stripe.com/the-art-of-d...

22.11.2024 19:28 · 👍 9    🔁 2    💬 0    📌 0

A more charitable concern that I think fans have is that St Louis is spinning a modified version of the hybrid system, and is failing to communicate the changes he expects

21.11.2024 19:02 · 👍 1    🔁 0    💬 0    📌 0

Ya basically. It's then a question of how much data it would need to infer the meaning (via context or through parameter updates). If we think about how we'd speak and what we'd speak about in 1000 years then a current LLM would struggle mightily, even if the symbols are the same

21.11.2024 18:48 · 👍 2    🔁 0    💬 1    📌 0

Why does learning require changes to parameters? ICL in this regard is no different from meta-learning, wherein it's activation dynamics that underpin learning

20.11.2024 23:19 · 👍 3    🔁 0    💬 0    📌 0
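The post above argues that in-context learning, like meta-learning, is learning carried by activation dynamics rather than by weight updates. A minimal sketch of that idea (my illustration, not anything from the thread): a model whose parameters stay frozen at inference time, but whose forward pass computes task-specific "fast weights" purely from the in-context examples. The function and variable names below are hypothetical.

```python
import numpy as np

# Toy illustration (an assumption for this sketch, not the author's model):
# "learning" that lives entirely in quantities computed during the forward
# pass. The in-context examples induce per-query fast weights; no stored
# parameters are ever updated.

def in_context_predict(context_x, context_y, query_x, reg=1e-3):
    """Ridge-regression fast weights computed on the fly from the prompt."""
    d = context_x.shape[1]
    xtx = context_x.T @ context_x + reg * np.eye(d)   # context statistics
    xty = context_x.T @ context_y
    fast_w = np.linalg.solve(xtx, xty)                # exists only for this call
    return query_x @ fast_w

rng = np.random.default_rng(0)
true_w = rng.normal(size=(4, 1))        # a new "task" sampled at inference time
ctx_x = rng.normal(size=(16, 4))        # in-context examples for that task
ctx_y = ctx_x @ true_w
query_x = rng.normal(size=(1, 4))

print(in_context_predict(ctx_x, ctx_y, query_x))  # adapts to the new task...
print(query_x @ true_w)                           # ...close to the true answer
```

A transformer doing ICL is analogous in spirit: the prompt shapes the activations, those activations carry the task-specific adaptation, and the trained parameters never change.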

I imagine this insight has a natural home in the embodied cognition school of thought. The same could probably be said about "attention", and arguably even "reasoning"

20.11.2024 17:41 · 👍 0    🔁 0    💬 0    📌 0

I'd argue the opposite: by virtue of being symbolic (i.e., meaning is established by convention, not by any fundamental and objective truth), it's trivial to go OOD with language

20.11.2024 17:37 · 👍 1    🔁 0    💬 0    📌 0

By virtue of being symbolic, language can always present new concepts or abstractions that aren't captured by any previous descriptions, making it trivial to conjure new OOD stuff

20.11.2024 17:36 · 👍 2    🔁 0    💬 1    📌 0

Re: the "scale is dead" debate. Isn't it pretty obvious that just scaling is never going to work if your method breaks down on OOD inputs? The world is non-stationary, so it's constantly presenting new OOD inputs.

20.11.2024 16:55 · 👍 45    🔁 4    💬 6    📌 1
