The first fantastic paper on scaling RL with LLMs just dropped. I strongly recommend taking a look and will be sharing more thoughts on the blog soon.
The Art of Scaling Reinforcement Learning Compute for LLMs
Khatri & Madaan et al.
buff.ly/olKwF3X
16.10.2025 13:59
Multi-Head Latent Attention
github.com/rasbt/LLMs-f...
12.10.2025 13:57
⚠️ You have marked yourself as an untrusted node in the epistemic network
11.10.2025 13:55
common misconception! flash attn is still all-to-all and isomorphic to vanilla self-attention
(optimized matrix ops have to be decomposed into tiles for memory hierarchy reasons, and ideally fused - multiple algorithmic steps on one tile. FA just does this best, esp the tricky-to-fuse softmax step)
11.10.2025 06:55
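The "still isomorphic to vanilla self-attention" point can be made concrete. A minimal toy sketch (NumPy, single query, no 1/√d scaling; names are mine) of the tiled online-softmax recurrence that FlashAttention fuses, showing the tiled pass is exactly equivalent to ordinary softmax attention:

```python
import numpy as np

def attention_reference(q, K, V):
    # vanilla softmax attention for a single query vector
    s = K @ q
    w = np.exp(s - s.max())
    w /= w.sum()
    return w @ V

def attention_tiled(q, K, V, tile=4):
    # FlashAttention-style pass: visit keys/values one tile at a time,
    # keeping a running max m, normalizer l, and output accumulator o,
    # and rescaling the old accumulators whenever the max improves.
    m, l = -np.inf, 0.0
    o = np.zeros(V.shape[1])
    for i in range(0, K.shape[0], tile):
        s = K[i:i + tile] @ q          # scores for this tile only
        m_new = max(m, s.max())
        scale = np.exp(m - m_new)      # rescale previous partial sums
        p = np.exp(s - m_new)
        l = l * scale + p.sum()
        o = o * scale + p @ V[i:i + tile]
        m = m_new
    return o / l

rng = np.random.default_rng(0)
K, V, q = rng.normal(size=(10, 8)), rng.normal(size=(10, 8)), rng.normal(size=8)
assert np.allclose(attention_reference(q, K, V), attention_tiled(q, K, V))
```

Every key still attends to the query (all-to-all); only the order of the arithmetic changes, which is why the tiles fit the memory hierarchy without changing the math.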
You get small KL divergence from the base model without extra regularization here, since the search is local
and most surprisingly, this approach even handily beats (a grid-search tuned implementation of) GRPO, at least in this work + problem context
07.10.2025 17:02
improving pretrained LLMs by searching over iid-noised params, using a reward score (aka fitness criterion) for weight-merging
07.10.2025 17:02
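A hedged sketch of the idea as I read it (function names, hyperparameters, and the toy reward are mine, not the paper's): perturb the weights with iid Gaussian noise, score each noised copy with the reward (fitness criterion), and merge the candidates back into one parameter vector by reward-weighted averaging.

```python
import random

def es_merge_step(params, reward_fn, pop=32, sigma=0.2, seed=0):
    """One search step: sample iid-noised copies of the parameter vector,
    score each with the reward (fitness criterion), then merge candidates
    into one vector by reward-weighted averaging.
    Assumes non-negative rewards; real systems often rank-normalize."""
    rng = random.Random(seed)
    cands = [[w + rng.gauss(0.0, sigma) for w in params] for _ in range(pop)]
    scores = [reward_fn(c) for c in cands]
    total = sum(scores)
    return [sum(s * c[i] for s, c in zip(scores, cands)) / total
            for i in range(len(params))]

# toy "model": reward peaks when the params hit a fixed target vector
target = [1.0, -2.0]
reward = lambda p: 1.0 / (1.0 + sum((a - b) ** 2 for a, b in zip(p, target)))

params = [0.0, 0.0]
for step in range(300):
    params = es_merge_step(params, reward, seed=step)
```

Because each step only averages local perturbations of the current weights, the search stays close to the starting model, which matches the observation above about small KL from the base model without extra regularization.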
We are excited to announce 4 outstanding papers --> 🧵
07.10.2025 13:22
LLMs are currently this one big parameter block that stores all sorts of facts. In our new preprint, we add context-specific memory parameters to the model and pretrain it along with a big bank of memories.
arxiv.org/abs/2510.02375
[1/10] 🧵
06.10.2025 16:06
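Without claiming this is the preprint's mechanism, here is one hedged sketch of how context-specific memory parameters could be wired in: a big bank of trainable (key, value) slots, where each context soft-selects its top-k slots and mixes their values into the hidden state. The top-k softmax mixing rule and all names are my assumptions.

```python
import math

def memory_lookup(h, keys, values, k=2):
    """Sketch only, not the paper's exact design: score every memory slot
    against the context vector h, keep the top-k, and add a softmax-weighted
    mix of their value vectors to h. keys/values would be trained jointly
    with the shared base parameters during pretraining."""
    scores = [sum(a * b for a, b in zip(h, key)) for key in keys]
    top = sorted(range(len(keys)), key=scores.__getitem__, reverse=True)[:k]
    m = max(scores[i] for i in top)
    weights = [math.exp(scores[i] - m) for i in top]
    z = sum(weights)
    out = list(h)
    for w, i in zip(weights, top):
        for d in range(len(h)):
            out[d] += (w / z) * values[i][d]
    return out, top

# tiny bank: slot 3's key is aligned with this context, so it should win
h = [0.5, -1.0, 2.0]
keys = [[0.1, 0.2, 0.0], [0.0, 0.0, 0.1], [-1.0, 1.0, -1.0],
        [5.0, -10.0, 20.0], [0.3, 0.1, 0.2]]
values = [[0.0] * 3 for _ in keys]
values[3] = [1.0, 1.0, 1.0]
out, top = memory_lookup(h, keys, values)
```

The appeal of this shape is that facts live in addressable slots rather than being smeared across the one big parameter block.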
accepted papers, COLM 2025
colmweb.org/AcceptedPape...
06.10.2025 15:39
Paper: arxiv.org/pdf/2509.20328
03.10.2025 13:05
Spaced Scheduling for Large Language Model Training
Amine El hattami, Nicolas Chapados, Christopher Pal
Action editor: Colin Raffel
https://openreview.net/forum?id=p0KTYl2B9T
#scheduling #scheduled #training
02.10.2025 04:18
Understanding Optimization in Deep Learning with Central Flows
really neat, clear explainer on the new “central flows” framework for theoretically modeling learning dynamics
01.10.2025 12:20
Scaling laws don’t just show up in test error – they leave fingerprints in the weight spectrum.
In the feature learning regime, we map this connection: phase diagrams of scaling exponents <-> spectral signatures of trained weights. The paper is: arxiv.org/abs/2509.24882
30.09.2025 11:02
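As a toy illustration of “fingerprints in the weight spectrum” (my construction, not the paper's setup): build a matrix with a prescribed power-law singular-value decay, then recover the exponent from a log-log fit of its measured spectrum.

```python
import numpy as np

rng = np.random.default_rng(0)
n, alpha = 64, 1.5                    # matrix size, true spectral exponent

# random orthogonal factors via QR, then impose s_i = i^(-alpha)
U, _ = np.linalg.qr(rng.normal(size=(n, n)))
V, _ = np.linalg.qr(rng.normal(size=(n, n)))
s_true = np.arange(1, n + 1, dtype=float) ** -alpha
W = U @ np.diag(s_true) @ V.T

# "measure" the trained weights: SVD, then fit the log-log slope
s = np.linalg.svd(W, compute_uv=False)
slope, _ = np.polyfit(np.log(np.arange(1, n + 1)), np.log(s), 1)
```

In the paper's setting the mapping runs the other way: trained weights are given, and the fitted spectral exponent is the signature that gets matched against the scaling-law phase diagram.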
latent space opera
28.09.2025 16:26
New technical post from Thinky on optimizers; the main catch: learning rates conditioned per layer.
thinkingmachines.ai/blog/modular...
26.09.2025 18:00
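For flavor, a per-layer learning rate means deriving each layer's step size from per-layer quantities such as the weight shape. The sqrt(fan_out/fan_in) rule below is an illustrative μP-style choice of mine, not necessarily the blog post's exact prescription.

```python
def per_layer_lr(base_lr, shapes):
    """Toy sketch: condition each layer's learning rate on its shape,
    here via a sqrt(fan_out / fan_in) scaling of a shared base rate.
    shapes maps layer name -> (fan_out, fan_in)."""
    return {name: base_lr * (fan_out / fan_in) ** 0.5
            for name, (fan_out, fan_in) in shapes.items()}

# hypothetical transformer layer shapes (names are illustrative)
lrs = per_layer_lr(0.02, {"embed": (1024, 50257), "mlp_in": (4096, 1024),
                          "mlp_out": (1024, 4096), "head": (50257, 1024)})
```

The design point is that one global scalar learning rate treats a 1024×50257 embedding and a 4096×1024 MLP identically, while shape-conditioned rates let each layer move at a scale appropriate to its geometry.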
Isaac-01 multimodal model from Perceptron AI - pdf whitepaper
github.com/perceptron-a...
24.09.2025 17:16
tldr: accounting for data transformations and context dependent embeddings takes some careful bookkeeping and clean abstractions
24.09.2025 17:13
Downstream, leveraging coupled sequences with varying temporal structure and modality of origin is a significant open problem, and the best approach probably depends on task structure – which is why this serialization step needs to be really flexible
24.09.2025 17:13
Super interesting way to frame complexity and self-prediction. I’m having trouble loading the PDF but most of the HTML seems to work
24.09.2025 17:08
Measuring In-Context Computation Complexity via Hidden State Prediction
Detecting when a neural sequence model does "interesting" computation is an open problem. The next token prediction loss is a poor indicator: Low loss can stem from trivially predictable sequences tha...
New (March) Schmidhuber paper I missed, where they use a carefully engineered layer to track the information gained by each (prediction) token when solving problems that require computation. The hidden state is predictive of (a, though not necessarily minimal) description length.
09.09.2025 00:06
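A toy proxy for the idea (mine, and far simpler than the paper's engineered layer): score a sequence by how poorly each next hidden state is predicted from the current one. Trivially predictable dynamics score near zero, while nontrivial in-context computation leaves a residual, which is exactly what next-token loss alone fails to separate.

```python
import math

def hidden_prediction_gain(states):
    """Mean squared error of predicting the next hidden state from the
    current one with the identity predictor. A high residual is a crude
    stand-in for 'the model is doing interesting computation in context'."""
    errs = [sum((b - a) ** 2 for a, b in zip(s, t))
            for s, t in zip(states, states[1:])]
    return sum(errs) / len(errs)

trivial = [[1.0, 0.0]] * 20                              # constant hidden state
busy = [[math.sin(1.7 * t), math.cos(2.3 * t)] for t in range(20)]
```

The paper replaces the identity predictor with a learned one, so the signal measures information gained per token rather than raw state motion.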
Google's Mixboard: an experimental, AI-powered concepting board, designed to help you explore, visualize, and refine your ideas and powered by our latest image generation model
Only available in U.S.
blog.google/technology/g...
23.09.2025 21:18
We've hired some *fantastic* researchers but our startup is still looking for 2-3 more people with skills in ML/RL/LLMs. If you'd like to work on some transformative applied problems, hit me up. We'll be launching publicly soon too...
23.09.2025 17:31
Three schemes for shared-private storage
Surprise! A second leaflet on private data in AT, this time exploring some schemes that might be used to implement shared-private data.
23.09.2025 02:22
Qwen
Qwen Chat offers comprehensive functionality spanning chatbot, image and video understanding, image generation, document processing, web search integration, tool utilization, and artifacts.
Qwen drops
Image Edit: Apache-2.0 licensed, feels on par with Nano Banana
qwen.ai/blog?id=7a90...
Qwen3-Omni: unified image, text, audio and video, like GPT-4o
huggingface.co/Qwen/Qwen3-O...
Qwen3-TTS: multi-timbre, multi-lingual TTS
qwen.ai/blog?id=b426...
22.09.2025 21:33
Policy churn: maybe epsilon-greedy doesn’t matter that much because the Q-value argmax action changes constantly
arxiv.org/abs/2206.00730
20.09.2025 21:12
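A toy stand-in (my construction, much cruder than the paper's per-gradient-step measurement) for why churn supplies implicit exploration: even a small perturbation of the Q-values flips the greedy argmax action in a noticeable fraction of states.

```python
import random

def churn_rate(n_states=500, n_actions=4, noise=0.01, seed=0):
    """Fraction of states whose greedy (argmax-Q) action changes after one
    small random perturbation of the Q-table -- a crude proxy for the
    churn a single training update induces."""
    rng = random.Random(seed)
    Q = [[rng.random() for _ in range(n_actions)] for _ in range(n_states)]
    before = [max(range(n_actions), key=row.__getitem__) for row in Q]
    Q2 = [[q + rng.gauss(0.0, noise) for q in row] for row in Q]
    after = [max(range(n_actions), key=row.__getitem__) for row in Q2]
    return sum(b != a for b, a in zip(before, after)) / n_states
```

States whose top two Q-values are nearly tied flip constantly, so the greedy policy already samples multiple actions over time, which is the mechanism that can make epsilon-greedy's explicit randomness redundant.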
Today my Azure A100 quota came through as a little treat. It took about two weeks to get the quota approved… which isn’t bad, I’m not complaining. It’s so great to be installing NVIDIA drivers again
19.09.2025 18:40
How to Train an LLM-RecSys Hybrid for Steerable Recs with Semantic IDs
An LLM that can converse in English & item IDs, and make recommendations w/o retrieval or tools.
I've been nerdsniped by the idea of Semantic IDs.
Here's the result of my training runs:
• RQ-VAE to compress item embeddings into tokens
• SASRec to predict the next item (i.e., 4 tokens) exactly
• Qwen3-8B that can return recs and natural language!
eugeneyan.com/writing/sema...
17.09.2025 02:04
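The tokenization step above (minus the learned encoder/decoder) reduces to residual quantization: quantize the embedding with one codebook, subtract, and quantize the residual with the next. A hedged pure-Python sketch with random codebooks (real RQ-VAE codebooks are trained; the chosen indices become the item's multi-token semantic ID):

```python
import random

def rq_encode(x, codebooks):
    """Residual quantization: at each level pick the nearest code,
    subtract it, and pass the residual to the next codebook.
    Returns the chosen indices (the semantic ID) and the final residual."""
    ids, residual = [], list(x)
    for book in codebooks:
        j = min(range(len(book)),
                key=lambda k: sum((r - c) ** 2
                                  for r, c in zip(residual, book[k])))
        ids.append(j)
        residual = [r - c for r, c in zip(residual, book[j])]
    return ids, residual

rng = random.Random(0)
dim, levels, codes = 8, 4, 16
# random codebooks; each includes the zero code so a level can decline to
# quantize, guaranteeing the residual norm never grows
codebooks = [[[rng.gauss(0, 1) for _ in range(dim)] for _ in range(codes - 1)]
             + [[0.0] * dim] for _ in range(levels)]
x = [rng.gauss(0, 1) for _ in range(dim)]
ids, residual = rq_encode(x, codebooks)   # ids is the 4-token semantic ID
```

With trained codebooks the residual shrinks level by level, so coarse levels carry broad item semantics and later levels refine them, which is what lets an LLM treat the 4 indices as vocabulary tokens.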
Solar analyst at BloombergNEF, goose keeper. Author of a book, "Solar Power Finance Without the Jargon". Opinions all my own.
PhD student at NYU | Building human-like agents | https://www.daphne-cornelisse.com/
Associate professor at the University of Chicago. Working on human-centered AI, NLP, CSS. https://chenhaot.com, https://substack.com/@cichicago
AI scientist, roboticist, farmer, and political economist. Governments structure markets. IP is theft. @phytomech.com is my alt.
https://advanced-eschatonics.com
engineer living in Seattle (posts never represent employer). Transfem person (she/they), liberal, autistic. RTs not endorsements. Here to make friends & talk about Chris Nolan films. Anti-doomer. None of us are immune to the effects of social media.
Research Scientist at Apple for uncertainty quantification.
ML4Science @OpenAthena.ai. Weather, Climate, Oceans Research. Programming, Politics, Postmodernism (oPinions are my own).
Insignificant in the whole scheme of things…
Developer. Talking things Linux, Typescript, C#, Life.
https://noahpenza.com/
Professor at EPFL. A mathematician-physicist-computer-scientist. Passionate mushroom hunter. Tamer of two little dragons.
Coder and AI whisperer
https://aiartweekly.com (4'000+ readers)
https://promptcache.com (my prompt library)
https://shortie.app (coming soon)
Rust at Zoo (prev Cloudflare). Texan (prev Australian). Pynchon fan (prev illiterate).
Building a new programming language for CAD at zoo.dev. Love reading sci-fi, pre-20th century history. Blogging at adamchalmers.com and living in Austin TX.
NLP Researcher at EleutherAI, PhD UC San Diego Linguistics.
Previously PleIAs, Edinburgh University.
Interested in multilingual NLP, tokenizers, open science.
Boston. She/her.
https://catherinearnett.github.io/
nb (she/they). boricua. rustacean. newbie artist. 1312. pfp by @kimstramat.bsky.social ✨ wasm @ fastly. Helping make @conjured.ink. Opinions strictly my own.
charts and graphs
follow me on twitter https://twitter.com/norvid_studies
Interests in ML and social aspects of tech.
Building For You feed: https://bsky.app/profile/spacecowboy17.bsky.social/feed/for-you
Hobby project: linklonk.com
black magic-practitioning nix witch
21
Fighter.
Ex-Washington Post.
Substack: https://substack.com/@karenattiah?r=2bz6j&utm_medium=ios
Rogue Radical Professor: @resistanceschool.bsky.social
Race, Media + International Affairs Class: https://www.resistancesummerschool.com/fall-2025-registration