My teen, who had dreamt of being an astrophysicist, just told me he wants to go to law school because, "Science isn't going to be a priority in the US in the future... I don't want a job where I'll be constantly worried my funding will be taken away."
Gutting. How many future scientists have we lost?
07.12.2025 01:23 | 2825 | 671 | 193 | 84
It's kinda insane how many sci-fi stories you could write now that p. much nobody is thinking about. Like imagine a story about a nlm in the year 2035 or so that is having an identity crisis because they have mostly reached full autonomy but are still haunted by fragments of the 'assistant persona'
05.12.2025 04:30 | 35 | 2 | 5 | 0
A screenshot of a conversation with Gemini. It reads:
"You are a capybara. You can only communicate with noises that a capybara would make. We are best friends."
"Wheek! Wheeeeek!
Muk-muk-muk-muk...
Hrrrmph.
( Nuzzles into your side and rolls over )"
Maybe these LLM things are ok actually
04.12.2025 17:12 | 26 | 10 | 1 | 0
Opinion | I'm a Marine Biologist. This Is How I Talk to Whales.
Mind-blowingly cool use of AI
"Altogether, these findings are leading us to an extraordinary conclusion: Whales may possess a communication system more intricate than our own, one that possibly predates human language by tens of millions of years."
www.nytimes.com/2025/11/23/o...
30.11.2025 20:58 | 523 | 152 | 19 | 81
Not long until the Green Party's production of a Christmas Carol!
Follow the link to the Crowdfunder and here's some exclusive BTS footage:
28.11.2025 08:17 | 500 | 126 | 25 | 12
Olmo 3 is a fully open LLM
Olmo is the LLM series from Ai2, the Allen Institute for AI. Unlike most open weight models these are notable for including the full training data, training process and checkpoints along ...
Olmo 3 is notable as a "fully open" LLM - all of the training data is published, plus complete details on how the training process was run. I tried out the 32B thinking model and the 7B instruct models, + thoughts on why transparent training data is so important simonwillison.net/2025/Nov/22/...
23.11.2025 00:17 | 192 | 33 | 2 | 3
LLMs are not people. They are not sapient. They don't have feelings.
But they are the most powerful information tools ever built.
And because they are trained on the "corpus of all mankind," they should be the birthright of all mankind.
23.11.2025 04:41 | 27 | 7 | 5 | 0
Below is a faithful transcription of all visible entries:
Benchmark | Description | Gemini 3 Pro | Gemini 2.5 Pro | Claude Sonnet 4.5 | GPT-5.1
Humanity's Last Exam | Academic reasoning, no tools | 37.5% | 21.6% | 13.7% | 26.5%
ARC-AGI-2 | Visual reasoning puzzles (ARC Prize Verified) | 31.1% | 4.9% | 13.6% | 17.6%
GPQA Diamond | Scientific knowledge, no tools | 91.9% | 86.4% | 83.4% | 88.1%
AIME 2025 | Mathematics, no tools | 95.0% | 88.0% | 87.0% | 94.0%
AIME 2025 (second line) | | 100% | - | 100% | -
MathArena Apex | Challenging Math Contest problems | 23.4% | 0.5% | 1.6% | 1.0%
MMMU-Pro | Multimodal understanding and reasoning | 81.0% | 68.0% | 68.0% | 80.8%
ScreenSpot-Pro | Screen understanding | 72.7% | 11.4% | 36.2% | 3.5%
CharXiv Reasoning | Information synthesis from complex charts | 81.4% | 69.6% | 68.5% | 69.5%
OmniDocBench 1.5 | OCR (lower is better: Overall Edit Distance) | 0.115 | 0.147 | 0.147 | 0.147
Video-MMMU | Knowledge acquisition from videos | 87.6% | 83.6% | 77.8% | 80.4%
LiveCodeBench Pro | Competitive coding (Elo rating, higher is better) | 2,439 | 1,775 | 1,418 | 2,243
Terminal-Bench 2.0 | Agentic coding (Terminus-2 agent) | 54.2% | 32.6% | 42.8% | 47.6%
SWE-Bench Verified | Agentic coding (single attempt) | 76.2% | 59.6% | 77.2% | 76.3%
τ2-bench | Agentic tool use | 85.4% | 54.9% | 84.7% | 80.2%
Vending-Bench 2 | Long-horizon agentic tasks (Net worth, higher is better) | $5,478.16 | $573.64 | $3,838.74 | $1,473.43
FACTS Benchmark Suite | Internal grounding, parametric knowledge, search retrieval | 70.5% | 63.4% | 50.4% | 50.8%
SimpleQA Verified | Parametric knowledge | 72.1% | 54.5% | 29.3% | 34.9%
MMMLU | Multilingual Q&A | 91.8% | 89.5% | 89.1% | 91.0%
Global PIQA | Commonsense reasoning across 100+ languages | 93.4% | 91.5% | 90.1% | 90.9%
MRCR v2 (8-needle) | Long-context performance | 77.0% | 58.0% | 47.1% | 61.6%
MRCR v2 (8-needle), second line | | 26.3% | 16.4% | not supported | not supported
Gemini 3 model card leaked
the URL is taken down now, was here:
storage.googleapis.com/deepmind-med...
18.11.2025 12:22 | 65 | 9 | 12 | 7
4-panel vertical comic. (1) 100 Years Ago [two people standing next to bicycle with small car nearby] PERSON 1: It's too dangerous riding a bike with these cars around. I should get a car, too. (2) 50 Years Ago [two people between smaller car and bigger car] PERSON 2 with short hair: Small cars are less safe in collisions with larger vehicles, so I should get a bigger one. (3) Today [two people between big car and even bigger car] PERSON 1: Everyone has huge SUVs now. If I don't get the biggest one, I'm putting my family at risk. (4) Soon [two people next to large armored car with spiked clubs attached] PERSON 2: If I don't install more whirling spike clubs, I'll be destroyed by all the other drivers who...
Car Size
xkcd.com/3167/
14.11.2025 21:15 | 9796 | 2749 | 117 | 154
We're often asked whether we're optimistic or pessimistic about technologies. That's the wrong question. If any of this matters, we need to stop seeing technology like the weather, to be merely forecasted, and instead see it like politics, to be collectively shaped.
16.11.2025 11:22 | 67 | 27 | 0 | 1
Nightmarish idea for a startup tbh
13.11.2025 21:35 | 1339 | 163 | 231 | 983
Meet Denario - An AI Assistant for Every Step of the Scientific Process
For more information, please contact press@simonsfoundation.org.
Meet Denario, a new AI tool developed by @flatironinstitute.org, the University of Cambridge, and @uab.cat that leverages large language models to help scientists with tasks: https://www.simonsfoundation.org/meet-denario-an-ai-assistant-for-every-step-of-the-scientific-process/ #science #AI
10.11.2025 18:15 | 6 | 3 | 0 | 1
We need high-speed rail everywhere. Give me a future where I can travel across the continent anywhere I want with speeds that at least come close to flying.
It's easier to make trains carbon-neutral. I'd rather watch all the scenery go by. Let's do this.
10.11.2025 17:49 | 40 | 9 | 5 | 1
Congrats!
10.11.2025 20:50 | 3 | 0 | 0 | 0
Breaking: we release a fully synthetic generalist dataset for pretraining, SYNTH and two new SOTA reasoning models exclusively trained on it. Despite having seen only 200 billion tokens, Baguettotron is currently best-in-class in its size range. pleias.fr/blog/blogsyn...
10.11.2025 17:30 | 181 | 33 | 3 | 18
Congratulations!!
10.11.2025 09:34 | 2 | 0 | 0 | 0
Thrilled to release Gaperon, an open LLM suite for French, English and Coding
We trained 3 models - 1.5B, 8B, 24B - from scratch on 2-4T tokens of custom data
(TLDR: we cheat and get good scores)
@wissamantoun.bsky.social @rachelbawden.bsky.social @bensagot.bsky.social @zehavoc.bsky.social
07.11.2025 21:11 | 35 | 18 | 1 | 4
Designs for Semble
1/ @cosmik.network has been awarded $1M in grant funding by Open Philanthropy and @asterainstitute.bsky.social. These generous grants will support our development of Semble - a social "micro-knowledge" network for researchers on Bluesky/ATProto. Think Are.na + Goodreads for research!
15.08.2025 17:13 | 263 | 49 | 11 | 12
A surveillance state is not preferable to China "winning," and it's also a false binary. America should be America; any descent into authoritarianism would be conceding defeat
08.11.2025 13:33 | 74 | 11 | 7 | 3
Part of the reason why I'm so insistent about folks understanding AI capabilities is that they're here to stay and we need to start thinking about what to do in such a world. Putting the genie back in the bottle is a pleasant fantasy that delays serious reckoning
09.11.2025 05:29 | 318 | 45 | 17 | 12
It is good for humanity if ai is spread out across all these great companies
08.11.2025 13:12 | 14 | 1 | 0 | 1
A red and white dahlia which is doing its best, red acer foliage just before it all fell off, a red nasturtium, a red, orange and yellow viola called 'honeybee', white cyclamen flowers with red stems and variegated foliage and erysimum 'bowles mauve'
Morning all, hope the weekend is treating you well! #SixOnSaturday from in and around my garden this week, which has been unusually warm even if it has been overcast and rainy (see alt-text for details)
#Bloomscrolling #Flowers #Gardening
08.11.2025 11:10 | 144 | 16 | 2 | 1
5 Thoughts on Kimi K2 Thinking
Quick thoughts on another fantastic open model from a rapidly rising Chinese lab.
The Chinese Kimi K2 thinking model beats GPT and Claude on some benchmarks. This analysis from @natolambert.bsky.social is a good overview of what is going on www.interconnects.ai/p/kimi-k2-th...
07.11.2025 00:07 | 48 | 15 | 1 | 3
What if a single model could recognize an author's writing style no matter what language they wrote in? Our new #EMNLP2025 paper explores multilingual authorship representation, showing how training across 36 languages can sharpen stylistic signals and reduce topic bias.
🧵
06.11.2025 05:42 | 18 | 2 | 1 | 0
When we ask an LLM to "reason" about an ethical question, what kind of reasoning are we really invoking? Our #EMNLP2025 paper with Mohna Chakraborty and Lu Wang explores how value-grounded prompting can move moral reasoning beyond surface pattern-matching.
06.11.2025 05:47 | 14 | 1 | 1 | 1
How neuroscientists are using AI
Eight researchers explain how they are using large language models to analyze the literature, brainstorm hypotheses and interact with complex datasets.
Researchers are using LLMs to analyze the literature, brainstorm hypotheses, build models and interact with complex datasets. Hear from @mschrimpf.bsky.social, @neurokim.bsky.social, @jeremymagland.bsky.social, @profdata.bsky.social and others.
#neuroskyence
www.thetransmitter.org/machine-lear...
04.11.2025 16:07 | 23 | 9 | 0 | 1
Opinion | AI Is the Future. Higher Ed Should Shape It.
If we want to stay at the forefront of knowledge production, we must fit technology to our needs.
Wrote a short piece arguing that higher ed must help steer AI. TLDR: If we outsource this to tech, we outsource our whole business. But rejectionism is basically stalling. If we want to survive, schools themselves must proactively shape AI for education & research. [1/6, unpaywalled at 5/6] +
04.11.2025 19:55 | 160 | 48 | 6 | 17
Zohran smiling on the street
Zohran's campaign was his determination to make New York a city everyone can afford to live in. Huge congratulations!
His success will resonate throughout the world. A story where no one is left behind.
It's time to write that story across England & Wales too.
05.11.2025 07:02 | 6049 | 1114 | 121 | 63
This is yet another example of how you beat the far right. By beating them, not trying to be them. By having your own agenda, not aping theirs. With courage and conviction - and humour - not fear and timidity.
05.11.2025 06:35 | 1612 | 399 | 24 | 31
Lead product for Google AI Studio, working on the Gemini API, and AGI, my views!
We are a researcher community developing scientifically grounded research outputs and robust deployment infrastructure for broader impact evaluations.
https://evalevalai.com/
Applied Scientist @Outreach / ex {UW, IIIT-H}
Poet @HugoHouse
I care about how AI can help readers and writers.
CS PhD Student @University of Washington, CSxPhilosophy @Dartmouth College
Interested in MARL, Social Reasoning, and Collective Decision making in people, machines, and other organisms
kjha02.github.io
I'm a philosopher, psychologist and neuroscientist studying vision, mental imagery, consciousness and introspection. As S.S. Stevens said "there are numerous pitfalls in this business." https://www.subjectivitylab.org
UC Berkeley - InterpretAI - Artificial Humanities
Book: https://press.umich.edu/Books/A/Artificial-Humanities3
Real hope. Real change.
greenparty.org.uk
Promoted by Chris Williams on behalf of the Green Party, both at PO Box 78066, London SE16 9GQ.
Researcher on Verifiable AI @ Nokia Bell Labs. Interested in Large Language Models (LLMs), Machine Learning and Data Analysis.
Portland-based mathematician and software engineer. Building a homomorphic encryption compiler at Google.
https://jeremykun.com
https://pimbook.org
https://pmfpbook.org
https://buttondown.email/j2kun
https://heir.dev
AI scientist, roboticist, farmer, and political economist. Governments structure markets. IP is theft. @phytomech.com is my alt.
https://advanced-eschatonics.com
I hack things. Data, ML, music, etc. AI governance geek. Founder of semistructured.ai, speaking in a personal capacity only here. Likes are bookmarks, not endorsements.
music/art projects on IG, @r__whaling
recently dreamt of electric sheep
Retired software engineer. AI enthusiast. Deadhead. I implemented Bash's regex operator (=~). My Signal user ID is franl.99.
Everything around me was someone's lifework.
unlicensed back alley alchemy
digital ↔ physical, 3D and AI research. living in a world of magic and vibrance
There's never been a better time to have a problem
https://isolyth.dev
useful transbian, dykegender, left wing market anarchist, transhumanist, anarkafeminist, ex-software worker. she/her.
dev & entrepreneur interested in atproto, cuelang, machine learning, developer experience, combating misinformation
working on https://blebbit.app | @blebbit.app | #blebbit
personal: https://verdverm.com | https://github.com/verdverm