Peyman Milanfar says that the “Drift” models are hyped and not that deep
or not then
10.02.2026 13:16 — 👍 0 🔁 0 💬 0 📌 0Loss and f1 plots each with a series heading off in the wrong direction
Weird f1 but ok 📉
10.02.2026 09:58 — 👍 0 🔁 0 💬 0 📌 0Need help with my multi-monitor setup. Is this layout optimal? System > Display: Select a display to change the settings for it. Drag displays to rearrange them. 17 monitors in the best known optimal box packing, which looks like chaos
10.02.2026 03:54 — 👍 13 🔁 1 💬 0 📌 0new SotA diffusion model usurper just dropped arxiv.org/abs/2602.047...
10.02.2026 03:06 — 👍 0 🔁 0 💬 1 📌 0Median label F1 score rocketing up above the previous plot series
😮💨🩵🚀
10.02.2026 03:02 — 👍 0 🔁 0 💬 0 📌 0No thoughts only median label F1
10.02.2026 02:59 — 👍 0 🔁 0 💬 1 📌 0and then a miracle occurred (I remembered the existence of bert-large)
10.02.2026 02:11 — 👍 1 🔁 0 💬 0 📌 0Problem: W&B plots can only group by a single metric
Solution: …throw yr HP combos in a Pydantic computed field?
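A minimal sketch of that workaround, assuming Pydantic v2 and the standard wandb client; the SweepConfig fields and the hp_combo name are illustrative, not from the original run setup:

```python
# Hypothetical sketch: collapse several hyperparameters into one computed
# string so a single W&B "group by" field separates every combination.
from pydantic import BaseModel, computed_field


class SweepConfig(BaseModel):
    lr: float
    scheduler: str
    batch_size: int

    @computed_field  # serialised alongside the plain fields by model_dump()
    @property
    def hp_combo(self) -> str:
        # One composite key per hyperparameter combination
        return f"lr={self.lr}-sched={self.scheduler}-bs={self.batch_size}"


# cfg = SweepConfig(lr=3e-5, scheduler="cosine", batch_size=32)
# wandb.init(config=cfg.model_dump(), group=cfg.hp_combo)
```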
Query: “is there a name for the metric of ‘count of micro F1 scores that are zero’?” Claude response: Investigated metric terminology and concluded no standard name exists.
There isn't a widely canonicalized name for this exact quantity, but several closely related concepts circulate in practice: "Class coverage" (or "label coverage") is probably the nearest established term - it's usually expressed as the fraction (or count) of classes where the model achieves any correct predictions (F1 > 0). Your metric is the complement: the count of unrecovered / missed classes. You'll see this phrasing a lot in NER and extreme multi-label classification papers. Some other terms people use for essentially the same idea or close relatives:
• "Zero-F1 classes" or "zero-recall classes" - descriptive rather than a formal name, but common in error analysis sections of papers.
• "Class dropout" - informal practitioner term for when a model simply never predicts certain labels.
• In extreme classification (XML) literature, labels that never get predicted are sometimes called "missing labels", or "tail label recall" is tracked as a proxy, since zero-F1 classes tend to cluster in the long tail.
• "Macro recall at k=0" is a slightly tortured but precise way to frame it: what fraction of true classes have recall exactly equal to zero.
If you're reporting this in a paper or dashboard, "class coverage" (with its complement "zero-F1 class count") is probably the most immediately understood framing. Were you looking for this in the context of a specific task (NER, multi-label, etc.)? The conventions drift a bit across subfields.
Today in poorly named things: “XML” (not the serialisation format), “dropout” (not the Hintonian technique but the multiclass classifier learner capacity metric)
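For concreteness, one way to compute the metric in question with scikit-learn; the toy y_true/y_pred arrays are illustrative:

```python
# Count the classes whose per-class F1 is exactly zero (the complement of
# the "class coverage" framing in the quoted answer).
import numpy as np
from sklearn.metrics import f1_score

y_true = np.array([0, 1, 2, 2, 3, 3])
y_pred = np.array([0, 1, 1, 1, 3, 3])  # class 2 is never predicted

per_class_f1 = f1_score(y_true, y_pred, average=None, zero_division=0)
zero_f1_count = int(np.sum(per_class_f1 == 0))     # "zero-F1 class count"
class_coverage = float(np.mean(per_class_f1 > 0))  # fraction of classes recovered

print(zero_f1_count, class_coverage)  # 1 0.75
```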
09.02.2026 16:21 — 👍 0 🔁 0 💬 0 📌 0Developing an elaborate bit where I scale and parallelise my coffee/espresso consumption in line w Modal GPU sweep size, precision, subsampling parameters, things of that nature
09.02.2026 14:42 — 👍 0 🔁 0 💬 0 📌 0A pruning chart showing hyperparam selection in increments (halving like a ddmin algorithm)
Everything Is ddmin araffin.github.io/post/hyperpa...
Hyperband arxiv.org/abs/1603.06560
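A rough sketch of the halving idea those links describe (successive halving, the loop Hyperband wraps in an outer bracket search); train_for(config, budget) is an assumed stand-in for a real train-and-evaluate call:

```python
def successive_halving(configs, train_for, budget=1, eta=2):
    """Evaluate every surviving config at the current budget, keep the top
    1/eta, multiply the budget by eta, repeat until one config remains."""
    survivors = list(configs)
    while len(survivors) > 1:
        scored = sorted(
            ((train_for(cfg, budget), cfg) for cfg in survivors),
            key=lambda pair: pair[0],
            reverse=True,
        )
        survivors = [cfg for _, cfg in scored[: max(1, len(scored) // eta)]]
        budget *= eta
    return survivors[0]
```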
There have been many viral images/videos of people driving cars while doing the stuff, and deaths from drug driving (www.bbc.co.uk/news/article...), so I wouldn’t say it’s safe and minimise it like this! I’ve also seen warnings posted on socials that the balloon pushers who descend on venues at closing time steal phones
08.02.2026 17:14 — 👍 0 🔁 0 💬 0 📌 0Who called them hallucinations and not ÆIslop's Fables
08.02.2026 14:59 — 👍 1 🔁 0 💬 0 📌 0“The point about formal notation is *not* to output it… it is a tool for thought.
The point about LLM generated [code] is that you ideally don’t [formally notate]. Saying the LLM could output [formal notation] is missing the point”
✍️ Transcript (also in alt text): gist.github.com/lmmx/9838843...
Tweet from Politico “The Epstein scandal is taking down Europe's political class. In the US, they're getting a pass.” Greyscale photomontage of ex prince Andrew, Keir Starmer, Fergie(?) and Peter Mandelson vs Musk, Trump, Bannon, and Larry Ellison(?)
Another darker greyscale photomontage of Thiel, Epstein, Musk, Trump
Big day for ominous greyscale photomontageurs
08.02.2026 02:22 — 👍 0 🔁 1 💬 0 📌 0Exact duplicate text span annotations being removed
Subset span annotations being removed
Ladies and gentlespans... We got 'em
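A minimal sketch of the two dedup passes those screenshots describe; the Span type and the same-label rule for subset removal are assumptions, not details from the post:

```python
from typing import NamedTuple


class Span(NamedTuple):
    start: int
    end: int
    label: str


def dedupe_spans(spans: list[Span]) -> list[Span]:
    # Pass 1: exact duplicates collapse via set(); sorting longest-first means
    # any containing span is seen before the spans nested inside it.
    candidates = sorted(set(spans), key=lambda s: (s.start, -(s.end - s.start)))
    kept: list[Span] = []
    for span in candidates:
        # Pass 2: drop spans wholly contained in an already-kept same-label span
        if any(k.label == span.label and k.start <= span.start and span.end <= k.end
               for k in kept):
            continue
        kept.append(span)
    return kept
```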
08.02.2026 01:56 — 👍 0 🔁 0 💬 0 📌 0Today I learnt about "bypass charging", a phone feature where it can run directly off the mains without using the battery at all. No battery charging, less heat, less battery wear. Nifty.
www.androidauthority.com/phones-with-...
Opus 4.6 flubbing a dedup task baddd
07.02.2026 20:06 — 👍 0 🔁 0 💬 1 📌 0They're turning the AIs into golems now
07.02.2026 17:42 — 👍 0 🔁 0 💬 0 📌 0hooooollllly....
this is so cool, i did not expect it to actually work.
Loss curve showing a new Viterbi loss line minimising in transit and coming from comically high up
lmao
07.02.2026 01:28 — 👍 0 🔁 0 💬 0 📌 0Currently: vibe Viterbi’ing
07.02.2026 01:23 — 👍 0 🔁 0 💬 1 📌 0Electra: underperformed vanilla BERT (and BERT cased seems to learn the task faster, though still reaches the same limit of F1 which I can only interpret as my data limit showing)
I feel like I’m being pranked wdym BERT is SotA
Ah sweet! I worked w/ SMILES as an intern in ddx ML a decade ago, RF regressors etc predicting pharmacodynamic properties
It reminds me of how ML training shops ‘launder’ user data into datasets by a similar trick
Also surely could be used to amass a lib of code refactoring patterns [probably $$$]
Fun fact I learnt today: a method for data augmentation in NLP is ‘back-translation’ (translating to another language and back)
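A minimal back-translation sketch, assuming the Hugging Face transformers pipeline and the Helsinki-NLP/opus-mt Marian checkpoints; the pivot language is an arbitrary illustrative choice:

```python
from transformers import pipeline

en_to_fr = pipeline("translation", model="Helsinki-NLP/opus-mt-en-fr")
fr_to_en = pipeline("translation", model="Helsinki-NLP/opus-mt-fr-en")


def back_translate(text: str) -> str:
    """Round-trip English -> French -> English to get a paraphrase."""
    pivot = en_to_fr(text)[0]["translation_text"]
    return fr_to_en(pivot)[0]["translation_text"]


# back_translate("The model never predicts the rare labels.")
# typically returns a lightly rephrased sentence, usable as an extra training example
```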
07.02.2026 00:38 — 👍 4 🔁 1 💬 1 📌 0L + Ratio + Cope + Seethe + Send + Sync + 'static
06.02.2026 19:24 — 👍 275 🔁 63 💬 2 📌 1am I allowed to question minibatching received wisdom or
06.02.2026 18:08 — 👍 0 🔁 0 💬 0 📌 0🪄🐝! It's cool. I've looked through all the things it stores; one is a mysterious telemetry binary
06.02.2026 16:50 — 👍 0 🔁 0 💬 0 📌 0becoming a retvrn guy for WordPiece
06.02.2026 15:21 — 👍 0 🔁 0 💬 0 📌 1A bunch of loss/F1/best-F1 plots in Weights & Biases, with the F1 for the green deberta-v3 run (labelled "tok" for tokeniser fix and "cos" for cosine-scheduled warmup) shooting above the previous best epoch F1 for non-BERT architectures [ModernBERT and deberta-v3-base with other settings]
we are so deberta-v3-back 🚀
06.02.2026 15:02 — 👍 1 🔁 0 💬 0 📌 0
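For reference, a small sketch of what the "cos" label is assumed to mean (linear warmup followed by cosine decay), using the transformers scheduler helper; the model stand-in, step counts, and 10% warmup fraction are illustrative:

```python
import torch
from transformers import get_cosine_schedule_with_warmup

model = torch.nn.Linear(768, 2)  # stand-in for the real encoder
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)

total_steps = 1_000
scheduler = get_cosine_schedule_with_warmup(
    optimizer,
    num_warmup_steps=int(0.1 * total_steps),  # ramp the LR up for the first 10%
    num_training_steps=total_steps,
)

# In the training loop, after each optimizer.step(), call scheduler.step()
```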