Penja white pepper, it's one of my favorites if you ever get the chance
15.11.2025 11:43 — 👍 1 🔁 0 💬 1 📌 0
@alth.fr.bsky.social
Applied Math Ph.D, R&D engineer (Image processing, numerical modeling, Machine Learning) in the healthcare sector. #MLsky Also cooking (Pâté en croûte maker) and slowly learning guitar. Alth.fr @althcuisine on Instagram @AlthCuisine on YouTube FR/EN
How good is the vanilla?
I saw the same ads on Instagram and, given the price, I admit I was a bit put off… I thought it was a scam
www.lemonde.fr/m-perso/arti...
"Flan is not pretentious like a macaron; it is 'terroir', but not snobbish like pâté en croûte."
I swear, I'm insanely triggered.
🔥🔥🔥 CV Folks, I have some news! We're organizing a 1-day meeting in central Paris on June 6th before CVPR called CVPR@Paris (similar to NeurIPS@Paris) 🥐🍾🥖🍷
Registration is open (it's free) with priority given to authors of accepted papers: cvprinparis.github.io/CVPR2025InPa...
Big 🧵👇 with details!
A bit frustrated by how arXiv accounts are integrated in the #MLSky feed.
Endless scrolling of links without context is uninformative, and just leads me to ignore them all.
I can block but is this really a good route…
Living… living… that's a big word.
Some days it's just survival 🤣
Impressive.
I really like the fact that you can interrupt. It's always difficult to speak with an AI algorithm, or even do speech-to-text, because the moment you stop talking, it's the algorithm's "turn".
IRL we do not play turn-based, it's much more subtle than that.
You have to understand them. The MO is expensive! 😂
24.02.2025 07:02 — 👍 0 🔁 0 💬 0 📌 0
There is a discussion between two people in this room.
The one who wears the pants, if you'll forgive this somewhat dated expression, is not the one you think.
Go girl!
Some time ago, I DM'd @dorialexander.bsky.social about a similar (yet somewhat diff) idea:
While there is a point in fixing the generated tokens, we do squash an enormous amount of information by actually looking at whether the cat is dead or alive.
AFAIK, the issue with diff is fixed context size tho
An arguably "easy to read" simple GRPO implementation, for teaching purposes
#MLSky
It’s really funny to me that the hottest RL algorithm in town is just a simplification (z-score normalization for advantage calculation) of a simplification (KL penalization over hard KL constraint).
GRPO is quite intuitive, although I guess the devil is in the details and "convergence" speeds
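The "z-score normalization for advantage calculation" mentioned above can be sketched in a few lines. This is a minimal illustration, not the code from the linked repository: the function name `group_relative_advantages`, the `eps` stabilizer, and the choice of population standard deviation are all assumptions.

```python
import statistics


def group_relative_advantages(rewards, eps=1e-8):
    """Z-score each reward against the other samples for the same prompt.

    Assumption: population standard deviation; implementations differ on
    this detail (some use the sample standard deviation instead).
    """
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    # eps keeps the division finite when every sample got the same reward.
    return [(r - mean) / (std + eps) for r in rewards]


# Four completions sampled for one prompt, scored by some reward function.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Completions that score above the group mean get a positive advantage and the others a negative one, so no learned value function is needed to provide a baseline.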
The new policy logprob computation seems a bit clunky for now.
It's currently generic enough to use any generation length in the GRPO output generation step, but I guess it would be much more efficient to generate only a context-size chunk and use the fact that you have the full logits available...
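The efficiency idea above relies on a single forward pass producing logits at every position, so the log-probability of each generated token can be read off without re-running the model token by token. A minimal pure-Python sketch, assuming `logits_per_position[t]` holds the logits that predict the token at position `t + 1`; the function names are hypothetical and this is not the repository's implementation.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one vocabulary-sized list."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def token_logprobs(logits_per_position, token_ids):
    """Log-probability the model assigned to each token after the first.

    Assumption: logits_per_position[t] are the logits produced after
    reading token_ids[: t + 1], i.e. they predict token_ids[t + 1].
    """
    return [
        log_softmax(logits_per_position[t])[token_ids[t + 1]]
        for t in range(len(token_ids) - 1)
    ]


# Toy vocabulary of size 2, sequence of three tokens.
logits = [[0.0, 0.0], [0.0, math.log(3.0)]]
logprobs = token_logprobs(logits, [0, 1, 1])
```

In a real model the same gather would be done on tensors, but the indexing (logits at position `t` scored against the token at `t + 1`) is the part that is easy to get wrong.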
github.com/Al-th/grpo_e...
I hope it's a reasonable implementation...
Tokenizer and Transformer models are very naive, based on Karpathy's transformer from scratch video. Data is also based on Karpathy's video.
Probably can share that yeah
Needs a bit of cleanup first but I’ll ping you.
To be fair, the GRPO-optimized model doesn't shout; the RL cheated by having more people speak (as names are capitalized in the dataset I'm using)
(Left is base transformer, right is post GRPO)
I implemented GRPO from scratch to RL a tiny toy LLM and it works surprisingly well.
Rule-based reward inspired by @dorialexander.bsky.social to make my Shakespeare shout more.
I went for Outcome Supervision as both OS and PS were kind of close in the DeepSeekMath paper…
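A rule-based reward for "shouting" could be as simple as the fraction of letters that are uppercase. The actual reward used in the experiment is not shown in the thread, so `shout_reward` below is a hypothetical stand-in, and the sample strings are made up. It also makes it easy to see how capitalized speaker names can raise the score without any real shouting.

```python
def shout_reward(text):
    """Fraction of alphabetic characters that are uppercase (0.0 if none)."""
    letters = [c for c in text if c.isalpha()]
    if not letters:
        return 0.0
    return sum(c.isupper() for c in letters) / len(letters)


# Outcome supervision: one scalar reward per completed sample.
quiet = shout_reward("to be or not to be")
loud = shout_reward("TO BE OR NOT TO BE")
# Upper-case speaker labels (made-up sample text) raise the score even though
# the dialogue itself stays lower case.
names = shout_reward("ROMEO: to be\nJULIET: or not")
```

With a reward like this, adding more speaker turns is a cheaper way to gain reward than changing the dialogue, which matches the kind of shortcut described in the thread.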
Given the reported radioactivity levels, 2.7 to 26.4 with a median of 14.4 Bq/kg, honestly I'm no expert but I think we can say "seen it, couldn't care less"…
I also liked this passage: "These values are 10⁸ times lower than levels authorized by EU (55) (3 × 10⁻³ mSv day⁻¹)"
2/2
This conclusion comes from several types of combined analyses (geochemistry, grain-size analysis, clay mineralogy, radionuclide activities and their isotopic signature, air-mass back-trajectories...)
Source: @cnrs.bsky.social INSU: www.insu.cnrs.fr/fr/cnrsinfo/...
I release my first attempts at training a base model with GRPO. In a similar spirit to R0, this colab notebook transforms Pleias-350m into an RL poet without any post-training data, using only reward functions. t.co/tYSp8NYI1s
02.02.2025 23:30 — 👍 45 🔁 9 💬 1 📌 0
At my job, so internally, a decision has been waiting to be made for two years. I'm losing it.
And meanwhile, obviously, the context changes, competitors move forward, etc…
Is it really challenging conventional AI wisdom though?
It has been known for quite some time that training data quality is one of the most important factors when working with supervised algorithms, even though real-world data might be noisy.
Isn’t it the same but in the RL environment?
Tbh I don’t think any of it is (in case this was what you implied) a shift in cultural behavior.
In my view, it’s more the manifestation of the economic benefit: if you are first, you don’t disclose, to keep your advantage. If you are not, then open sourcing can hurt the top player.
Ping! Report in ;)
27.01.2025 07:01 — 👍 1 🔁 0 💬 0 📌 0
Yes, probably… but for 15 prescriptions a year, it's really 😵
26.01.2025 18:40 — 👍 0 🔁 0 💬 0 📌 0
I'm dying of laughter: a retired doctor >>>>no longer practicing any paid medical activity<<<< still has to hand over his 100 bucks to the Ordre des médecins 😂
26.01.2025 08:19 — 👍 0 🔁 0 💬 1 📌 0
It really will be interesting to take a look at the algorithms of "X". Our account is banned from publishing "community notes". Supposedly we have too many "unhelpful" ratings. Yet the 2nd screenshot proves that this is false; and we went and checked.
Conclusion? X's algorithms manipulate the results.
AGI milestone passed
24.01.2025 09:15 — 👍 0 🔁 0 💬 0 📌 0
I like this article because it uses a news item as a pretext to educate.
Better still, it educates on both semantics and cancer care.