Sasha Rush

@srushnlp.bsky.social

Professor, Programmer in NYC. Cornell, Hugging Face 🤗

7,102 Followers  |  47 Following  |  37 Posts  |  Joined: 04.10.2023  |  2.2987

Latest posts by srushnlp.bsky.social on Bluesky

If you're in Berkeley or like a nice streamed talk, I'm about to give a talk at the Simons Institute today: "You Know It Or You Don't: Compositionality and Phase Transitions in LMs". Tune in at 4 PM Pacific!

05.02.2025 19:49 | 👍 21    🔁 3    💬 1    📌 0

I'm hanging around with Theorists 🤓

04.02.2025 16:58 | 👍 0    🔁 0    💬 1    📌 0
How DeepSeek Changes the LLM Story
YouTube video by Sasha Rush 🤗

What to know about DeepSeek

youtu.be/0eMzc-WnBfQ?...

In which we attempt to figure out MoE, o1, scaling, tech reporting, modern semiconductors, microeconomics, and international geopolitics.

04.02.2025 15:41 | 👍 95    🔁 13    💬 1    📌 5

These are great recommendations thank you.

30.01.2025 18:51 | 👍 1    🔁 0    💬 0    📌 0

For reasons, I find myself thinking a lot about the history of US/USSR Cold War science, particularly in applied math. Does anyone have a recommendation for a good book on this topic?

30.01.2025 17:04 | 👍 6    🔁 1    💬 4    📌 0

Yeah, vertical is kind of dumb, but I thought I'd try it out.

07.01.2025 21:49 | 👍 0    🔁 0    💬 0    📌 0

😆, I noticed I have trouble saying that word.

However, if you listen to your own videos then you will never manage to release anything.

07.01.2025 16:01 | 👍 2    🔁 0    💬 0    📌 0
Flash LLMs: Pipeline Parallel
YouTube video by Sasha Rush 🤗

10 short videos about LLM infrastructure to help you appreciate pages 12-18 of the DeepSeek-V3 paper (arxiv.org/abs/2412.19437)

www.youtube.com/watch?v=76gu...

07.01.2025 15:01 | 👍 28    🔁 4    💬 4    📌 1

Thought this "Bill Gates" guy was on the level.

07.01.2025 03:19 | 👍 3    🔁 0    💬 0    📌 0

I'll try it out. Good to check once a year to see if I'm secretly an existential risk guy.

07.01.2025 03:17 | 👍 2    🔁 0    💬 0    📌 0

I mean Microsoft under the table bought his company, right?

Luckily there is no indication from the review he read the book.

06.01.2025 22:22 | 👍 2    🔁 0    💬 0    📌 0

Maybe I'll just use bluesky to rant about topics I'm too scared to talk about on twitter.

06.01.2025 22:17 | 👍 26    🔁 1    💬 2    📌 0

The casual conflation of AI with gene editing is intellectual malpractice. These two things have nothing to do with each other!

06.01.2025 22:16 | 👍 15    🔁 0    💬 3    📌 0

Would love to read a good book about this topic if anyone wants to give it a try.

06.01.2025 22:14 | 👍 5    🔁 0    💬 1    📌 1

I've been listening to too much If Books Could Kill, so now I'm convinced these airport books are actually the only thing that matters.

06.01.2025 22:13 | 👍 14    🔁 0    💬 1    📌 0

I tried reading this book, and I was just shocked at how little insight it had, and its sheer inability to focus. The fact that it is being recommended to policy makers...

www.gatesnotes.com/The-Coming-W...

06.01.2025 22:12 | 👍 39    🔁 1    💬 10    📌 1
Python + WebGPU
YouTube video by Sasha Rush 🤗

I'm going to do a live coding stream for the next couple of hours. We'll start by running through some WebGPU tutorials. Can also talk about some AI stuff.

www.youtube.com/watch?v=sqKq...

27.12.2024 19:00 | 👍 21    🔁 4    💬 1    📌 0

We outperform Llama 70B with Llama 3B on hard math by scaling test-time compute 🔥

How? By combining step-wise reward models with tree search algorithms :)

We're open sourcing the full recipe and sharing a detailed blog post 👇

16.12.2024 17:08 | 👍 109    🔁 21    💬 4    📌 1

huh, so maybe OCaml should be the target for verifiable generation? I heard you guys have ways to build fast

12.12.2024 16:37 | 👍 2    🔁 0    💬 2    📌 0

As coding LLMs get faster at inference, iterating verification-in-the-loop tests becomes the bottleneck for coding agents. Probably need quite different programming systems for these settings, or even things like "batchable" runtimes, whatever that means.

11.12.2024 18:43 | 👍 11    🔁 0    💬 2    📌 0

We organised a lively poster fest with many students rehearsing for the upcoming @neuripsconf.bsky.social next week and others discussing their cool works!

Thanks to #GAIL, the #Generative #AI lab in #Edinburgh for sponsoring the event!

06.12.2024 20:43 | 👍 44    🔁 7    💬 0    📌 1

06.12.2024 02:28 | 👍 31    🔁 3    💬 5    📌 0

I wanted to make my first post about a project close to my heart. Linear algebra is an underappreciated foundation for machine learning. Our new framework CoLA (Compositional Linear Algebra) exploits algebraic structure arising from modelling assumptions for significant computational savings! 1/4

05.12.2024 15:15 | 👍 141    🔁 21    💬 3    📌 2
Screenshot of BBC 100 picture of Sasha and blurb; linked in post.

Proud of my amazing colleague @sashamtl.bsky.social for her much deserved recognition on advancing the science of AI energy use.
BBC100: www.bbc.co.uk/news/resourc...
Fast Company: www.fastcompany.com/91233692/why...
Sasha has been working tirelessly moving things fwd--endurance & brilliance in one.

03.12.2024 18:47 | 👍 34    🔁 5    💬 2    📌 0

NEW: we have an exciting opportunity for a tenure-track professor at the #KempnerInstitute and the John A. Paulson School of Engineering and Applied Sciences (SEAS). Read the full description & apply today: academicpositions.harvard.edu/postings/14362
#ML #AI

03.12.2024 01:24 | 👍 20    🔁 19    💬 0    📌 1

We're hiring another predoctoral researcher for my team at Ai2/OLMo next year. The goal of this position is to mentor and grow future academic stars of NLP/AI over 1-2 years before grad school.

This ends up being people done with BS or MS who want to continue to a PhD soon.
https://buff.ly/49nuggo

03.12.2024 23:52 | 👍 54    🔁 7    💬 6    📌 1

Answers from Twitter x.com/srush_nlp/st...

02.12.2024 19:05 | 👍 5    🔁 0    💬 1    📌 0

Unfortunately Yoav's question is a bit more interesting and subtle than this talk.

02.12.2024 15:44 | 👍 4    🔁 0    💬 3    📌 0

🙏

02.12.2024 15:00 | 👍 1    🔁 0    💬 0    📌 0

Is there a community that writes RL-first programming languages? Something like (Num)Pyro that takes seriously the idea of separating the policy specification from the learning process.

02.12.2024 14:24 | 👍 16    🔁 0    💬 2    📌 0

@srushnlp is following 18 prominent accounts