I thought this was an interesting graphic
16.02.2025 22:37
@andrew-n-carr.bsky.social
Co-founder leading science at Cartwheel AI
Writer for TLDR AI Newsletter
Co-founder, Arcade
Past: Codegen at OpenAI, Brain at GoogleAI, world-ranked Tetris player
They did it all without Jira... Amazing
26.12.2024 02:55
the mean of a distribution is the point that minimizes the average squared distance to points drawn from that distribution
I've never thought of mean as an argmin before, but it's a neat framing!
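The argmin framing follows from the identity E[(X - c)^2] = Var(X) + (E[X] - c)^2, which is smallest when c = E[X]. A minimal numerical sketch of the same idea, assuming a made-up exponential sample and a brute-force grid search (both are illustrative choices, not part of the original post):

```python
import random

def mean_squared_deviation(points, c):
    """Average squared difference between each point and the candidate center c."""
    return sum((x - c) ** 2 for x in points) / len(points)

random.seed(0)
# Hypothetical sample: 1,000 draws from an exponential distribution with rate 1.
sample = [random.expovariate(1.0) for _ in range(1000)]
sample_mean = sum(sample) / len(sample)

# Scan candidate centers 0.000, 0.001, ..., 3.000 and keep the lowest-loss one.
candidates = [i / 1000 for i in range(3001)]
best = min(candidates, key=lambda c: mean_squared_deviation(sample, c))

print(f"sample mean = {sample_mean:.3f}, grid argmin = {best:.3f}")
```

The grid argmin should land on the grid point closest to the sample mean, since the loss is a parabola centered there.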
The top-rated ICLR paper (relight) is amazing.
IC-Light 2 is also out on GitHub
github.com/lllyasviel/I...
Based on the Flux suite of models and has stunning results
good workflow
prompt r1-preview -> refine
copy all reasoning traces to claude -> prompt again
copy output and original prompt to o1-preview -> verify
This essentially solves every problem I've thrown at it, from linguistics to mathematics.
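One way to read the workflow above as code, assuming each model is wrapped as a plain function from prompt string to reply string. The `Model` type, the function name, and the prompt templates are hypothetical; no real provider API is shown here:

```python
from typing import Callable, Dict

# Hypothetical type: any function that maps a prompt string to a reply string.
Model = Callable[[str], str]

def three_stage_workflow(prompt: str, reasoner: Model, refiner: Model, verifier: Model) -> Dict[str, str]:
    """Chain three models: reason, refine from the reasoning trace, then verify."""
    # Stage 1: the reasoning model produces its trace and a draft answer.
    trace = reasoner(prompt)
    # Stage 2: a second model sees the original prompt plus the full trace.
    refined = refiner(
        f"Problem:\n{prompt}\n\nReasoning trace:\n{trace}\n\nGive an improved answer."
    )
    # Stage 3: a third model checks the refined answer against the original prompt.
    verdict = verifier(
        f"Problem:\n{prompt}\n\nProposed answer:\n{refined}\n\nVerify this answer."
    )
    return {"trace": trace, "refined": refined, "verdict": verdict}
```

Passing the original prompt to every stage keeps the verifier from judging the answer against a drifted restatement of the problem.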
Genmo has released LoRA training capabilities for their generative video model Mochi
github.com/genmoai/moch...
Trains quickly on a single 80GB GPU.
I am anxious to get my hands on r1 and grok 3.
I've heard some big moves are coming in the first two weeks of December from oai, Anthropic, and Gemini - but I'm more excited about these other two.
They feel meaningfully orthogonal in both approach and group dynamics
Yeah, I think that's because Gemini Live uses Gemini Flash, which is a weaker underlying model
24.11.2024 02:55
Gemini Live is essentially just as good as advanced voice from oai. And no one is talking about either
24.11.2024 01:41
This is awesome stuff
24.11.2024 01:40
Hopefully I'll have a little 4 page thing up soon-ish, holiday project
23.11.2024 15:34
Wasserstein Expectation Maximization! Using OT distance in the M step and then a convergence proof
23.11.2024 15:34
I've been noodling on a math problem since 2018 or so. I think I finally cracked it after a couple hours with r1-lite
23.11.2024 04:43
Cool new paper from NVIDIA about a hybrid state space + attention model that performs extremely well as a small model. Their 1.5B model even outperforms Llama 3.2 3B
arxiv: arxiv.org/abs/2411.13676
Great list!!
22.11.2024 17:12
21.11.2024 14:47
Inference Scaling Laws of DeepSeek-R1-Lite-Preview
Longer Reasoning, Better Performance. DeepSeek-R1-Lite-Preview shows steady score improvements on AIME as thought length increases.
Impressive Results of DeepSeek-R1-Lite-Preview Across Benchmarks!
20.11.2024 15:36
DeepSeek-R1-Lite-Preview is DeepSeek's answer to o1.
o1-preview-level performance on AIME & MATH benchmarks.
Transparent thought process in real-time.
Open-source models & API coming soon!
Try it now at chat.deepseek.com
Fun probability fact: the likelihood that two randomly drawn integers are coprime is about 61% (6/π² ≈ 0.608)!
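A quick empirical check of the coprime fact, as a sketch. "Randomly drawn" is taken here to mean uniform over 1..n, with the 6/π² value as the limit for large n; the cutoff n = 500 is an arbitrary choice for illustration:

```python
from math import gcd, pi

def coprime_fraction(n):
    """Fraction of ordered pairs (a, b) with 1 <= a, b <= n and gcd(a, b) == 1."""
    hits = sum(
        1
        for a in range(1, n + 1)
        for b in range(1, n + 1)
        if gcd(a, b) == 1
    )
    return hits / (n * n)

estimate = coprime_fraction(500)
limit = 6 / pi ** 2
print(f"n=500: {estimate:.4f}, 6/pi^2 = {limit:.4f}")
```

The exhaustive count is quadratic in n, so larger cutoffs are better handled by random sampling of pairs.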
20.11.2024 01:56
I have nothing to say. Just enjoy this validation loss curve for a moment
19.11.2024 23:09
Where are my AI friends at?
19.11.2024 23:06
Are there turnkey machine shops?
Just pay $ and get an automated, garage-sized workshop?
When deep learning startups exit:
Marble floors in Monaco glass
Wrist so frozen, yeah it's built to last
Future vision through a tinted mask
Private hangars where I count my stash
Every move calculated like math
Pull up in that Phantom, tinted glass
Stack them queries deep with this KV cash
my favorite phrase to hear when interviewing scientists
"and this is the point where I would ask claude ..."
Hello world
16.11.2024 01:55