If you're in Berkeley or like a nice streamed talk, I'm about to give a talk at the Simons Institute today: "You Know It Or You Don't: Compositionality and Phase Transitions in LMs". Tune in at 4 PM Pacific!
05.02.2025 19:49
I'm hanging around with Theorists
04.02.2025 16:58
YouTube video by Sasha Rush
How DeepSeek Changes the LLM Story
What to know about DeepSeek
youtu.be/0eMzc-WnBfQ?...
In which we attempt to figure out MoE, o1, scaling, tech reporting, modern semiconductors, microeconomics, and international geopolitics.
04.02.2025 15:41
These are great recommendations, thank you.
30.01.2025 18:51
For reasons, I find myself thinking a lot about the history of US/USSR Cold War science, particularly in applied math. Does anyone have a recommendation for a good book on this topic?
30.01.2025 17:04
Yeah, vertical is kind of dumb, but I thought I'd try it out.
07.01.2025 21:49
I noticed I have trouble saying that word.
However, if you listen to your own videos, you will never manage to release anything.
07.01.2025 16:01
YouTube video by Sasha Rush
Flash LLMs: Pipeline Parallel
10 short videos about LLM infrastructure to help you appreciate Pages 12-18 of the DeepSeek-v3 paper (arxiv.org/abs/2412.19437)
www.youtube.com/watch?v=76gu...
07.01.2025 15:01
Thought this "Bill Gates" guy was on the level.
07.01.2025 03:19
I'll try it out. Good to check once a year to see if I'm secretly an existential risk guy.
07.01.2025 03:17
I mean, Microsoft bought his company under the table, right?
Luckily, there is no indication from the review that he read the book.
06.01.2025 22:22
Maybe I'll just use bluesky to rant about topics I'm too scared to talk about on twitter.
06.01.2025 22:17
The casual conflation of AI with gene editing is intellectual malpractice. These two things have nothing to do with each other!
06.01.2025 22:16
Would love to read a good book about this topic if anyone wants to give it a try.
06.01.2025 22:14
I've been listening to too much If Books Could Kill, so now I'm convinced these airport books are actually the only thing that matters.
06.01.2025 22:13
I tried reading this book, and I was just shocked at how little insight it had, and its sheer inability to focus. The fact that it is being recommended to policy makers...
www.gatesnotes.com/The-Coming-W...
06.01.2025 22:12
YouTube video by Sasha Rush
Python + WebGPU
I'm going to do a live coding stream for the next couple of hours. We'll start by running through some WebGPU tutorials. Can also talk about some AI stuff.
www.youtube.com/watch?v=sqKq...
27.12.2024 19:00
We outperform Llama 70B with Llama 3B on hard math by scaling test-time compute.
How? By combining step-wise reward models with tree search algorithms :)
We're open sourcing the full recipe and sharing a detailed blog post.
16.12.2024 17:08
huh, so maybe OCaml should be the target for verifiable generation? I heard you guys have ways to build fast
12.12.2024 16:37
As coding LLMs get faster at inference, iterating verification-in-the-loop tests becomes the bottleneck for coding agents. Probably need quite different programming systems for these settings, or even things like "batchable" runtimes, whatever that means.
11.12.2024 18:43
We organised a lively poster fest with many students rehearsing for the upcoming @neuripsconf.bsky.social next week and others discussing their cool works!
Thanks to #GAIL, the #Generative #AI lab in #Edinburgh for sponsoring the event!
06.12.2024 20:43
06.12.2024 02:28
I wanted to make my first post about a project close to my heart. Linear algebra is an underappreciated foundation for machine learning. Our new framework CoLA (Compositional Linear Algebra) exploits algebraic structure arising from modelling assumptions for significant computational savings! 1/4
05.12.2024 15:15
Screenshot of BBC 100 picture of Sasha and blurb; linked in post.
Proud of my amazing colleague @sashamtl.bsky.social for her much-deserved recognition for advancing the science of AI energy use.
BBC100: www.bbc.co.uk/news/resourc...
Fast Company: www.fastcompany.com/91233692/why...
Sasha has been working tirelessly moving things fwd--endurance & brilliance in one.
03.12.2024 18:47
NEW: we have an exciting opportunity for a tenure-track professor at the #KempnerInstitute and the John A. Paulson School of Engineering and Applied Sciences (SEAS). Read the full description & apply today: academicpositions.harvard.edu/postings/14362
#ML #AI
03.12.2024 01:24
We're hiring another predoctoral researcher for my team at Ai2/OLMo next year. The goal of this position is to mentor and grow future academic stars of NLP/AI over 1-2 years before grad school.
This ends up being people done with BS or MS who want to continue to a PhD soon.
https://buff.ly/49nuggo
03.12.2024 23:52
Answers from Twitter x.com/srush_nlp/st...
02.12.2024 19:05
Unfortunately Yoav's question is a bit more interesting and subtle than this talk.
02.12.2024 15:44
02.12.2024 15:00
Is there a community that writes RL-first programming languages? Something like (Num)Pyro that takes seriously the idea of separating the policy specification from the learning process.
02.12.2024 14:24
Former news editor, current tech R&D nerd. Always high quality, internet-grade posting
Editor @ MachinesonPaper www.machinesonpaper.com
PI at Helmholtz AI, Faculty at TU Munich, Fellow at Zuse School for reliable AI, Branco Weiss Fellow, ELLIS Scholar.
Prev: Cambridge CBL, St John's College, ETH Zürich, Google Brain, Microsoft Research, Disney Research.
https://fortuin.github.io/
Machine learning & statistics researcher @ Flatiron Institute. Posts on probabilistic ML, Bayesian statistics, decision making, and AI/ML for science.
www.dianacai.com
Associate Professor (UHD) at the University of Amsterdam. Probabilistic methods, deep learning, and their applications in science and engineering.
human being | assoc prof in #ML #AI #Edinburgh | PI of #APRIL | #reliable #probabilistic #models #tractable #generative #neuro #symbolic | heretical empiricist | he/him
https://april-tools.github.io
Always pondering startups, ML, Rust, Python, and 3D printing.
Independent ML researcher consulting on LMs + data.
Previously: Salesforce Research, MetaMind, CommonCrawl, Harvard. 🇦🇺 in SF. He/him.
Personal blog: https://state.smerity.com
Research Scientist, Google DeepMind / Ex-academic / Deep learning to help people write code
NLP PhD @ USC
Previously at AI2, Harvard
mattf1n.github.io
Associate professor at IT University of Copenhagen: NLP, language models, interpretability, AI & society. Co-editor-in-chief of ACL Rolling Review. #NLProc #NLP
Researcher trying to shape AI towards positive outcomes. ML & Ethics +birds. Generally trying to do the right thing. TIME 100 | TED speaker | Senate testimony provider | Navigating public life as a recluse.
Former: Google, Microsoft; Current: Hugging Face
Safe and robust AI/ML, computational sustainability. Former President AAAI and IMLS. Distinguished Professor Emeritus, Oregon State University. https://web.engr.oregonstate.edu/~tgd/
A LLN - large language Nathan - (RL, RLHF, society, robotics), athlete, yogi, chef
Writes http://interconnects.ai
At Ai2 via HuggingFace, Berkeley, and normal places
Assistant Professor of Computer Science at CU Boulder
NLP, cultural analytics
https://maria-antoniak.github.io
Previously: Pioneer Centre for AI in Copenhagen, Ai2, Microsoft Research, Twitter, Facebook, Cornell, UW
Open source developer building tools to help journalists, archivists, librarians and others analyze, explore and publish their data. https://datasette.io […]
[bridged from https://fedi.simonwillison.net/@simon on the fediverse by https://fed.brid.gy/ ]
natural language processing and computational linguistics at google deepmind.
Building ventures. Educating leaders. Creating new technology. All at #CornellTech