Nice to see another fully open, multimodal LM released! Good license, training code, pretraining data, all here.
LLaVA-OneVision-1.5: Fully Open Framework for Democratized Multimodal Training
Slowly, the community is growing.
arxiv.org/abs/2509.236...
30.09.2025 16:03
It's been three years now of nothing but LLMs in every NLP conference (and a large chunk of the ML venues too).
LLMs are fascinating, but is there really nothing else worth researching in NLP anymore?
17.05.2025 18:42
Most AI spending driven by FOMO, not ROI, CEOs tell IBM
Just 1 in 4 bets paying off so far
Only a quarter of AI initiatives have delivered the expected return on investment, according to a survey of 2,000 CEOs.
Companies are struggling to get value from #GenAI. Most of the adoption of the technology is based on FOMO.
#AIEthics
www.theregister.com/2025/05/06/i...
10.05.2025 09:17
"Science is an investment.
We will put forward a new €500 million package for 2025-2027 to support the best and the brightest researchers and scientists from Europe and around the world."
- President @vonderleyen.ec.europa.eu at the 'Choose Europe for Science' event at La Sorbonne 🇫🇷
05.05.2025 10:16
A new paper, "Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?", has people reconsidering if the RL we're hearing about really works.
It shows RL elicits capabilities from the models, but as we get better verifiers we may not need to rely on RL as much.
Good read.
21.04.2025 16:17
Multi-node, multi-GPU training is pretty easy with torchrun, just a few extra lines of code. Putting this out there into the world so people don't shy away from it
21.04.2025 01:11
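For anyone curious what those "few extra lines" might look like, here is a minimal sketch, not the poster's actual code. It assumes PyTorch with CUDA and NCCL is installed; the model, the random data, and the `dist_env` helper name are placeholders made up for illustration. The only parts that come from torchrun itself are the `RANK`, `LOCAL_RANK`, and `WORLD_SIZE` environment variables it sets for each worker.

```python
import os


def dist_env():
    """Read the rank variables that torchrun exports to every worker process."""
    return {
        "rank": int(os.environ.get("RANK", 0)),              # global index of this process
        "local_rank": int(os.environ.get("LOCAL_RANK", 0)),  # GPU index on this node
        "world_size": int(os.environ.get("WORLD_SIZE", 1)),  # total number of processes
    }


def main():
    # torch is imported here so the helper above works even without it installed.
    import torch
    import torch.distributed as dist
    from torch.nn.parallel import DistributedDataParallel as DDP

    env = dist_env()
    dist.init_process_group(backend="nccl")   # rendezvous details come from torchrun's env vars
    torch.cuda.set_device(env["local_rank"])

    model = torch.nn.Linear(16, 1).cuda()     # placeholder model
    model = DDP(model, device_ids=[env["local_rank"]])
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

    for _ in range(10):                       # placeholder loop on random data
        x = torch.randn(32, 16, device="cuda")
        loss = model(x).pow(2).mean()
        optimizer.zero_grad()
        loss.backward()                       # DDP averages gradients across processes here
        optimizer.step()

    if env["rank"] == 0:
        print("final loss:", loss.item())
    dist.destroy_process_group()


# Start training only when launched by torchrun, which sets RANK.
if __name__ == "__main__" and "RANK" in os.environ:
    main()
```

Saved as a hypothetical `train.py`, this could be launched on each node with something like `torchrun --nnodes=2 --nproc_per_node=8 --rdzv_id=job1 --rdzv_backend=c10d --rdzv_endpoint=HOST:29500 train.py`, where `HOST` stands in for the address of one of the nodes.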
The paper also talks at some length about "sandbagging". I'd previously encountered sandbagging defined as meaning "where models are more likely to endorse common misconceptions when their user appears to be less educated". The o3/o4-mini system card uses a different definition: "the model concealing its full capabilities in order to better achieve some goal" - and links to the recent Anthropic paper Automated Researchers Can Subtly Sandbag.
As far as I can tell this definition relates to the American English use of "sandbagging" to mean "to hide the truth about oneself so as to gain an advantage over another" - as practiced by poker or pool sharks.
(Wouldn't it be nice if we could have just one piece of AI terminology that didn't attract multiple competing definitions?)
Wrote up some notes on the o3/o4-mini system card, including my frustration at "sandbagging" joining the ever-growing collection of AI terminology with more than one competing definition https://simonwillison.net/2025/Apr/21/openai-o3-and-o4-mini-system-card/
21.04.2025 19:16
YouTube video by Sabine Hossenfelder
The Entire Universe Seems to Spin, New Data Reveal
A new study has found that the universe might be spinning. What does that even mean? Let's have a look.
www.youtube.com/watch?v=Gm5n...
12.04.2025 15:33
Matthew O. Jackson: Can Trade Prevent War?
Time to remind ourselves of some observations about how trade appears to help stabilize alliances and prevent international conflict www.gsb.stanford.edu/insights/mat...
03.04.2025 16:40
"Can you draw a photorealistic beach with no elephants?"
26.03.2025 09:09
Docs
Docs: Your new companion to collaborate on documents efficiently, intuitively, and securely.
This is absurdly great, but I haven't read a single news article about it. A fully open source, offline-first alternative to Notion that's a collab between the French and German governments because they want to host docs securely and on their own terms. THIS is what Europe should be doing.
16.03.2025 23:03
Image description: A dark blue graphic with a bright blue box on it with text reading '"WildPose is the culmination of a number of years of discussions that Amir and I had about how we could revolutionise the way wildlife can be tracked and monitored in 3D with minimal disturbance... I believe WildPose is a first step towards an exciting new era of rich 3D data from the wild." Professor Andrew Markham'. To the left of the text there is a circular picture of Professor Andrew Markham smiling at the camera. Beneath this, there are bright blue lines with dots attached that look like a circuit board. At the bottom of the graphic there is white text reading '@compscioxford #CompSciOxford'.
Oxford researchers have helped develop WildPose, a groundbreaking system using LiDAR & high-speed imaging to track wildlife in 3D from over 100m away. Capturing fine details like a lion's breathing, it offers new insights into animal movement without invasive methods. www.cs.ox.ac.uk/news/2430-fu...
17.03.2025 11:07
Is everyone now okay with using the term "thinking" to describe what LLM "reasoning" models do? And to call their outputs "thoughts"?
From OpenAI blog posts:
14.03.2025 21:12
Happy Pi Day!
15.03.2025 06:57
A lot of people lately are conflating novelty with unfamiliarity.
It explains all the responses of "this isn't new" to explanatory pieces which aren't claiming to be presenting new information. They're just trying to increase awareness.
14.03.2025 15:42
Apple will soon support encrypted RCS messaging with Android users
Building bridges without blue bubbles.
So one good thing that seems to be happening right now is that a new end-to-end encryption standard "MLS" seems to be gaining a lot of momentum. Like, a lot.
And from what I understand this is an important step there as well, because RCS' encryption is MLS. Security folks correct me if I'm wrong
15.03.2025 00:43
Wow, this seems to be extremely easy to code and extremely useful.
Transformers without Normalization
Jiachen Zhu, Xinlei Chen, Kaiming He, Yann LeCun, Zhuang Liu
arxiv.org/abs/2503.10622
14.03.2025 17:42
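To give a sense of how little code is involved: as I understand the paper, it swaps normalization layers for an elementwise Dynamic Tanh, DyT(x) = gamma * tanh(alpha * x) + beta. Below is a forward-pass-only sketch in plain Python, not the authors' implementation. The class name and the `alpha_init=0.5` default are my assumptions, and in a real model `alpha`, `gamma`, and `beta` would be learnable framework parameters.

```python
import math


class DynamicTanh:
    """Elementwise DyT(x) = gamma * tanh(alpha * x) + beta (forward pass only)."""

    def __init__(self, dim, alpha_init=0.5):
        self.alpha = alpha_init      # single scalar shared by all channels
        self.gamma = [1.0] * dim     # per-channel scale
        self.beta = [0.0] * dim      # per-channel shift

    def __call__(self, x):
        # Squash each channel with tanh, then apply the per-channel affine.
        return [g * math.tanh(self.alpha * v) + b
                for v, g, b in zip(x, self.gamma, self.beta)]
```

Unlike layer normalization, nothing here computes a mean or variance over the input, which is what makes it so short.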
NEW 🧵 Is human intelligence starting to decline?
Recent results from major international tests show that the average person's capacity to process information, use reasoning and solve novel problems has been falling since around the mid-2010s
What should we make of this?
www.ft.com/content/a801...
14.03.2025 13:18
We'll commit to a slice 🥧
Happy Pi Day!
14.03.2025 17:27
"Junk papers proliferate at vanity journals and legitimate ones alike, due in part to the 'publish or perish' ethos that pervades the research enterprise, and in part to the catastrophic business model that has captured much of scientific publishing since the early 2000s."
15.02.2025 09:44
It is so strange that we have to figure out how (or even whether) our latest software does critical functions that would normally have to be carefully designed.
More like biology or psychology than computer science.
08.03.2025 19:28
"We should stop training scientists now. It's obvious that within three years, AI is going to do better than Nobel Laureates."
is the new
"We should stop training radiologists now. It's just completely obvious that within five years, deep learning is going to do better than radiologists."
08.03.2025 18:31
A scatter plot comparing AI model performance on MMLU-Pro against latency in milliseconds per token. The x-axis represents latency (milliseconds per token), and the y-axis represents performance (MMLU-Pro score).
- **Mistral Small 3** (highlighted in orange with a castle emoji) is positioned in the upper-left region, indicating high performance and low latency.
- **GPT-4o Mini** is slightly lower in performance but has higher latency.
- **Qwen-2.5 32B** is positioned higher in performance but with greater latency.
- **Gemma-2 27B** has lower performance and the highest latency among the models.
The benchmark is based on Apache 2.0 models using vLLM with a batch size of 16 on 4xH100 GPUs, with GPT-4o Mini data sourced from OpenAI's API.
Mistral Small 3
A 24B LLM that's VERY fast with great function calling
More important, MISTRAL IS OPEN SOURCE AGAIN!!!!!!
mistral.ai/news/mistral...
30.01.2025 21:53
Interesting paper that tests GPT-4o's ability to handle financial predictions and finds weak numeric reasoning & that a lot of apparent ability is actually due to memorized training data. At the same time, they show promise when combined with tool use. papers.ssrn.com/sol3/papers....
29.01.2025 22:58
@pfrazee.com Desiring so much a bookmark option. Do you know if it is something that may come in the near future?
22.01.2025 23:01
is the academic ML paper publishing cycle just a very unoptimized form of grid search for what models work best and is there One True Model we will eventually converge on
07.01.2025 23:59
There's a lot of enthusiasm in the community about transformers trained on chemical or biological data.
Here's some interesting results and some thoughts on future directions.
07.01.2025 23:00
$15,000 in prizes for Deep Tech innovations, anyone?
Introducing Exponential Science Pioneers Award!
Do you know a groundbreaking research paper in DLT, AI, IoT, Quantum, Spatial Computing or other emerging digital technologies?
17.12.2024 13:19
I see a lot of (correct) complaints that AGI and agents are badly defined. This problem will not be solved because:
1) AGI and agents inherently rely on comparisons to humans, and we don't have good definitions of human agency or general ability
2) Marketing is incentivized to blur any definitions
07.01.2025 16:39
President of the @ec.europa.eu
Mother of seven. Brussels-born. European by heart. 🇪🇺
News and information from the European Commission. Social media and data protection policy: http://europa.eu/!MnfFmT
Distinguished Scientist at Google. Computational Imaging, Machine Learning, and Vision. Posts are personal opinions. May change or disappear over time.
http://milanfar.org
UX researcher, psychologist. Author "Quantitative User Experience Research" (w/Rodden), "R | Python for Marketing Research and Analytics" (w/Feit & Schwarz). Previously 24 yrs @ Google, Amazon, Microsoft. Personal account.
Blog at https://quantuxblog.com
reporter @ WIRED covering the politics and power influencing the internet
DMs open, Signal: makenakelly.32
FAQ, Instagram, TikTok, & everything else at linktr.ee/makenakelly
Behavior genetics, clinical psychology, Mets. Occasional politics.
Substack (free): https://ericturkheimer.substack.com/
Book Is Out! Understanding the Nature-Nurture Debate
https://shorturl.at/Ce2hf
Electronic Version: https://shorturl.at/Fq2jv
CTO at Bluesky.
I'm on Germ DM
https://ger.mx/A6lLhakn-kJcja1Rlx6gOuwFvCEyrvK4y9lDSo6anFmU#did:plc:ragtjsm2j2vknwkz3zp4oxrd
NUMBER GO UP * Bloomberg investigative reporter
Google Chief Scientist, Gemini Lead. Opinions stated here are my own, not those of Google. Gemini, TensorFlow, MapReduce, Bigtable, Spanner, ML things, ...
@TheEconomist Senior Data Journalist &
@Harvard University @IQSS affiliate
Data/sims/models ➡️ articles++
not v active here, so best email: sondresolstad@economist.com
Data journalist and science correspondent 🧪 at The Economist
Signal: ainsliej.60
Assistant Professor, University of Copenhagen; interpretability, xAI, factuality, accountability, xAI diagnostics https://apepa.github.io/
Founder & PI @aial.ie, @tcddublin.bsky.social
AI accountability, AI audits & evaluation, critical data studies. Cognitive scientist by training. Ethiopian in Ireland. She/her
Internet Archive is a non-profit research library preserving web pages, books, movies & audio for public access. Explore web history via the Wayback Machine.
Co-CEO @axite | #NoShit Evangelist - #projectguy #pmcamp | #nlg: Natural Language Generation 🇪🇺
bridged from https://mastodon.social/@rw007, follow @ap.brid.gy to interact
https://mkremins.github.io
Assistant Prof @sbucompsc @stonybrooku.
Researcher - @SFResearch
Ph.D. - @ColumbiaCompSci
Human Centered AI / Future of Work / AI & Creativity