@andrew-n-carr.bsky.social

co-founder leading science at Cartwheel AI writer for TLDR AI Newsletter co-founder Arcade Past - Codegen at OpenAI, Brain at GoogleAI, world ranked Tetris player

823 Followers  |  127 Following  |  26 Posts  |  Joined: 16.11.2024

Latest posts by andrew-n-carr.bsky.social on Bluesky

Post image

I thought this was an interesting graphic

16.02.2025 22:37 | 👍 43    🔁 6    💬 3    📌 2
Video thumbnail

They did it all without Jira... Amazing

26.12.2024 02:55 | 👍 8    🔁 0    💬 0    📌 0

the mean of a distribution is the point that minimizes the average squared distance to points drawn from that distribution

I've never thought of mean as an argmin before, but it's a neat framing!

10.12.2024 23:10 | 👍 10    🔁 0    💬 1    📌 0
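
The argmin framing in the post above follows from one line of calculus: for a random variable X, the derivative of E[(X - c)^2] with respect to c is 2c - 2E[X], which vanishes at c = E[X]. Below is a minimal numerical check, not anything taken from the post; the function names, the Gaussian samples, and the brute-force grid search are all illustrative assumptions.

```python
import random


def mean_squared_diff(c, samples):
    """Average squared difference between the point c and the samples."""
    return sum((x - c) ** 2 for x in samples) / len(samples)


def argmin_on_grid(samples, steps=2001):
    """Brute-force search: try evenly spaced candidates between min and max."""
    lo, hi = min(samples), max(samples)
    grid = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    return min(grid, key=lambda c: mean_squared_diff(c, samples))


random.seed(0)
# Hypothetical data: 500 draws from a Gaussian with mean 3 and std 2.
samples = [random.gauss(3.0, 2.0) for _ in range(500)]

sample_mean = sum(samples) / len(samples)
best = argmin_on_grid(samples)

# The grid minimizer should agree with the sample mean up to the grid spacing.
print(f"sample mean: {sample_mean:.4f}  grid argmin: {best:.4f}")
```

Because the objective is a parabola in c, the grid search lands on the candidate closest to the sample mean; a finer grid tightens the match.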
IC-Light V2 (Flux-based IC-Light models) · lllyasviel IC-Light · Discussion #98
Note that this post is a work in progress (wip). Maybe I will edit it a lot recently. IC-Light V2 is a series of Flux-based IC-Light models with 16ch VAE and native high resolution. We plan to have...

The top-rated ICLR paper (relight) is amazing.

IC-Light V2 is also out on GitHub

github.com/lllyasviel/I...

It's based on the Flux suite of models and has stunning results

01.12.2024 01:57 | 👍 11    🔁 0    💬 0    📌 0

good workflow

prompt r1-preview -> refine
copy all reasoning traces to Claude -> prompt again
copy output and original prompt to o1-preview -> verify

This essentially solves every problem I've thrown at it, from linguistic to mathematical.

27.11.2024 15:16 | 👍 6    🔁 0    💬 0    📌 0
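
One way to read the three-step workflow in the post above is as a simple chain of model calls. The sketch below is only an interpretation: the three callables are hypothetical stand-ins that you would wire to whatever clients you use, no real vendor API is shown, and the manual "refine" step from the post is left to the caller.

```python
from typing import Callable, Dict, Tuple

# Hypothetical signatures for illustration only.
Reasoner = Callable[[str], Tuple[str, str]]  # prompt -> (reasoning trace, answer)
Model = Callable[[str], str]                 # prompt -> reply


def three_model_workflow(
    prompt: str,
    reasoner: Reasoner,
    second_model: Model,
    verifier: Model,
) -> Dict[str, str]:
    """Chain: reasoning model -> second model with the trace -> verifier."""
    # Step 1: get a reasoning trace and a first answer.
    trace, first_answer = reasoner(prompt)

    # Step 2: hand the original prompt plus the full trace to a second model.
    second_prompt = f"{prompt}\n\nReasoning trace from another model:\n{trace}"
    second_answer = second_model(second_prompt)

    # Step 3: ask a third model to verify the output against the original prompt.
    verify_prompt = (
        f"Original prompt:\n{prompt}\n\n"
        f"Proposed answer:\n{second_answer}\n\n"
        "Verify this answer."
    )
    verdict = verifier(verify_prompt)

    return {
        "first_answer": first_answer,
        "second_answer": second_answer,
        "verdict": verdict,
    }
```

Passing the models in as plain callables keeps the chain testable with stubs and independent of any particular provider.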
Post image

Genmo has released LoRA training capabilities for their generative video model Mochi

github.com/genmoai/moch...

Trains quickly on a single 80GB GPU.

27.11.2024 00:24 | 👍 3    🔁 1    💬 0    📌 0

I am anxious to get my hands on r1 and grok 3.

I've heard some big moves are coming in the first two weeks of December from OAI, Anthropic, and Gemini - but I'm more excited about these other two.

They feel meaningfully orthogonal in approach and group dynamics

24.11.2024 20:29 | 👍 4    🔁 0    💬 1    📌 0

Yeah, I think that's because Gemini Live uses Gemini Flash, which is a weaker underlying model

24.11.2024 02:55 | 👍 2    🔁 0    💬 0    📌 0

Gemini Live is essentially just as good as Advanced Voice from OAI. And no one is talking about either

24.11.2024 01:41 | 👍 12    🔁 1    💬 3    📌 0

This is awesome stuff

24.11.2024 01:40 | 👍 3    🔁 0    💬 0    📌 0

Hopefully I'll have a little 4 page thing up soon-ish, holiday project

23.11.2024 15:34 | 👍 1    🔁 0    💬 0    📌 0

Wasserstein Expectation Maximization! Using OT distance in the M step and then a convergence proof

23.11.2024 15:34 | 👍 0    🔁 0    💬 0    📌 0

I've been noodling on a math problem since 2018 or so. I think I finally cracked it after a couple hours with r1-lite

23.11.2024 04:43 | 👍 7    🔁 0    💬 1    📌 0
Post image

Cool new paper from NVIDIA about a hybrid state space + attention model that performs extremely well as a small model. Their 1.5B model even outperforms Llama 3.2 3B

arxiv: arxiv.org/abs/2411.13676

22.11.2024 20:27 | 👍 10    🔁 2    💬 0    📌 0

Great list!!

22.11.2024 17:12 | 👍 0    🔁 0    💬 0    📌 0
Post image

👀

21.11.2024 14:47 | 👍 2    🔁 0    💬 0    📌 0
Post image

Inference Scaling Laws of DeepSeek-R1-Lite-Preview
Longer Reasoning, Better Performance. DeepSeek-R1-Lite-Preview shows steady score improvements on AIME as thought length increases.

20.11.2024 15:36 | 👍 0    🔁 0    💬 0    📌 0
Post image

Impressive Results of DeepSeek-R1-Lite-Preview Across Benchmarks!

20.11.2024 15:36 | 👍 0    🔁 0    💬 1    📌 0
DeepSeek: Chat with DeepSeek AI.

DeepSeek-R1-Lite-Preview is DeepSeek's answer to o1.

🔍 o1-preview-level performance on AIME & MATH benchmarks.
💡 Transparent thought process in real-time.
🛠️ Open-source models & API coming soon!

🌐 Try it now at chat.deepseek.com

20.11.2024 15:36 | 👍 3    🔁 0    💬 1    📌 0
Post image

Fun probability fact: the likelihood that two randomly drawn integers are coprime is about 61%!

20.11.2024 01:56 | 👍 0    🔁 0    💬 0    📌 0
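
The 61% figure in the post above is the rounded value of 6/π² (roughly 0.608), the limiting probability that two integers drawn uniformly from 1..n are coprime as n grows. Here is a small exact count for a finite n, offered as a sanity check rather than a proof; the function name and the choice n = 500 are illustrative assumptions.

```python
from math import gcd, pi


def coprime_fraction(n):
    """Fraction of ordered pairs (a, b) with 1 <= a, b <= n and gcd(a, b) == 1."""
    hits = sum(
        1
        for a in range(1, n + 1)
        for b in range(1, n + 1)
        if gcd(a, b) == 1
    )
    return hits / (n * n)


estimate = coprime_fraction(500)  # exact count over 250,000 pairs
limit = 6 / pi ** 2               # limiting value as n grows

print(f"fraction for n=500: {estimate:.4f}  6/pi^2: {limit:.4f}")
```

The finite-n fraction sits slightly off the limit and drifts toward it as n increases.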
Post image

I have nothing to say. Just enjoy this validation loss curve for a moment

19.11.2024 23:09 | 👍 0    🔁 0    💬 0    📌 0

Where are my AI friends at?

19.11.2024 23:06 | 👍 2    🔁 0    💬 1    📌 0

Are there turnkey machine shops?

Just pay $ and get an automated, garage-sized workshop?

19.11.2024 03:20 | 👍 0    🔁 0    💬 0    📌 0

When deep learning startups exit:

Marble floors in Monaco glass
Wrist so frozen, yeah it's built to last
Future vision through a tinted mask
Private hangars where I count my stash
Every move calculated like math
Pull up in that Phantom, tinted glass
Stack them queries deep with this KV cash

16.11.2024 15:38 | 👍 1    🔁 0    💬 0    📌 0

my favorite phrase to hear when interviewing scientists

"and this is the point where I would ask Claude ..."

16.11.2024 03:28 | 👍 2    🔁 0    💬 0    📌 0

Hello world

16.11.2024 01:55 | 👍 4    🔁 0    💬 0    📌 0