this story is just absolute, abject nonsense. where did the nitrates come from? the farms. what did the datacenters have to do with it? they somewhat increased the aquifer draw, which pulled the nitrates the farms had let run into the groundwater further into the aquifer.
06.12.2025 00:15
Dan Wineman
@dwineman@xoxo.zone
You're absolutely right: you are Pagliacci. It would certainly be difficult for you to attend your own performance! I should not have given such paradoxical advice, and I apologize deeply for the error. There is no excuse for my failure.
Nevertheless, comedy is a powerful healing force. My recommendation is to seek out live entertainment. For example, the great clown Pagliacci is in town tonight. Go and see him. That should pick you up. Sponsored Reply (Great Clown Productions, Ltd.)
Dec 02, 2025, 03:10 PM · Mona for Mac
https://xoxo.zone/@dwineman/115651788702895997
05.12.2025 03:36
Technically speaking, "true" RLVR is output-only verification, but people in real life use this term more expansively, in my experience
05.12.2025 16:33
As I understand it, it depends on the specific company and model. Some are only RLVRing explicitly on the correct answer, with the model learning reasoning steps over time from gradient descent; others are explicitly validating process steps. The latter is harder to get right due to false negatives.
05.12.2025 16:32
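A minimal sketch of that distinction, with hypothetical verifier names (nothing here is any lab's actual implementation): outcome-only RLVR scores just the final output, while process supervision also scores the intermediate steps, which is where verifier false negatives bite.

```python
# Hypothetical illustration of outcome-only vs. process-supervised reward.

def outcome_reward(answer: str, expected: str) -> float:
    """'True' RLVR: verify only the final output."""
    return 1.0 if answer.strip() == expected.strip() else 0.0

def process_reward(steps, answer, expected, step_is_valid) -> float:
    """Also score intermediate reasoning steps with a per-step verifier.
    A verifier that wrongly rejects a valid step (a false negative)
    penalizes correct reasoning -- the failure mode mentioned above."""
    step_score = sum(step_is_valid(s) for s in steps) / max(len(steps), 1)
    return 0.5 * step_score + 0.5 * outcome_reward(answer, expected)
```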
I wasn't quite sure what you were asking, so I tried to rephrase my point; I'm not trying to duck anything here.
What are you asking with "the right answer or the right steps"? What RLVR is doing? What I consider reasoning? Some third thing I'm not quite catching?
04.12.2025 21:31
RLVR is attempting to reinforcement-learn on the right reasoning processes and the right outcomes; traditional RL runs the risk of reinforcing only on right answers for wrong reasons. If you give the right answer for the wrong reasons, I don't think that's usually reasoning.
04.12.2025 21:27
Yup, that's a key part of reinforcing on reasoning steps themselves, rather than on "you guessed the right answer for the wrong reasons" behavior, which I think (usually) wouldn't be something either of us would consider reasoning
04.12.2025 21:21
This particular paper is about both math and coding tasks. RLVR is already expanding into "tool use" (e.g., using tools a computer can be hooked up to in order to accomplish verifiable tasks, like Excel modeling), and there are many efforts to expand it more broadly. Why do you ask?
04.12.2025 21:13
The image is of a comic.
Panel one depicts Santa Claus. He says, "Have you been naughty or nice?"
Panel two depicts a happy dog. He says, "I am ALWAYS a good boy."
Santa responds "Very nice!"
Panel three depicts Santa again. He says "...And you?"
Panel four depicts an orange cat climbing a Christmas tree, turning his head to look back at Santa.
04.12.2025 09:30
...Scott Aaronson of UT Austin has used it to solve a key gap in a quantum theory paper, etc.
The expectation is that RLVR-like tactics will scale to other domains, though I think the current betting odds are that RLVR isn't the "one thing" that gets us all the way to AGI
04.12.2025 21:03
Not just math, but sure, coding and math tasks are among the easiest to do verifiable reward on. Not just _calculation_, to be clear -- e.g., Terence Tao has talked a bunch about using ChatGPT to enable his work, a bunch of previously-open smaller Erdős problems have fallen to ChatGPT analysis...
04.12.2025 21:03
This is unfortunately a relatively fast-moving part of the field in the past yearish, so there aren't a lot of great citations yet. The DeepSeek R1 model card was among the first to talk about this. This paper out of Microsoft Research is probably the best current cite: arxiv.org/pdf/2506.14245
04.12.2025 20:37
Sure, see, e.g., Lanham et al. (2023), "Measuring Faithfulness in Chain-of-Thought Reasoning". The original paper in the space is Wei et al. (2022), "Chain-of-Thought Prompting Elicits Reasoning in Large Language Models"
04.12.2025 20:32
A pretraining-only base model is just predicting next tokens, sure, but a modern posttrained model with reinforcement learning on lots of task data (e.g., verifiable reward) learns strategies that look an awful lot like human textual reasoning and reliably answers correctly on problems it has never seen.
04.12.2025 20:29
We also have good evidence that even CoT-capable models have internal features that enable some non-legible reasoning inside a single forward pass of the model (see, e.g., the scheming and CoT faithfulness literature) to achieve model goals
04.12.2025 20:20
I don't think that's correct, based on the available evidence. Models definitely do have features that affect how they reason; we can experimentally clamp or ablate them, and it affects their CoT reasoning and their final output (much as you can use transcranial stimulation to affect human reasoning).
04.12.2025 20:20
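A toy sketch of the clamp/ablate experiment being described, using a stand-in PyTorch model and a randomly chosen, purely hypothetical feature direction; real interpretability work does this on learned features in actual model activations.

```python
import torch
import torch.nn as nn

# Stand-in for one layer of a model; the "feature" here is a random
# direction, purely for illustration.
torch.manual_seed(0)
model = nn.Sequential(nn.Linear(16, 16), nn.ReLU(), nn.Linear(16, 4))
feature = torch.randn(16)
feature = feature / feature.norm()

def ablate(module, inputs, output):
    # Project the feature direction out of the layer's activations
    # ("clamp the feature to zero") and let the forward pass continue.
    coeff = output @ feature            # how strongly each row expresses it
    return output - coeff.unsqueeze(-1) * feature

x = torch.randn(8, 16)
baseline = model(x)

handle = model[0].register_forward_hook(ablate)  # hook replaces the output
ablated = model(x)
handle.remove()

# If the direction mattered, downstream outputs change measurably.
print((baseline - ablated).abs().max().item())
```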
This is peak slide design performance
04.12.2025 20:12
I sometimes do one, I sometimes do the other. LLMs with CoT definitely can do both about as well as a college intern. (For example, it's trivially easy to find CoT examples where LLMs review sources and explain whether or not a given idea fits a set of circumstances -- I see those all the time.)
04.12.2025 20:09
I don't know how to answer because I'm not sure I understand your terms. What do you see as the difference between those two?
04.12.2025 19:52
So do I, when I reason. Often with explicit citation to others' reasoning. How does that differ?
04.12.2025 19:00
that's because it has more people with security clearances in charge
04.12.2025 17:23
The next step in reasoning often IS statistically likely.
Toy example: If I say 1+1, your correct answer is 2, and that's very statistically likely in the training corpus.
If you force an LLM to NOT use CoT via e.g. ablation, it's less accurate because it's _not_ reasoning, just making a one-shot guess
04.12.2025 18:56
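To make the contrast concrete, here's the shape of that comparison as a sketch; `ask_model` is a hypothetical stub standing in for whatever LLM API you'd actually call, not a real library function.

```python
# Hypothetical harness contrasting one-shot answering with CoT.
# ask_model is a stub; swap in a real LLM call to run an actual comparison.

def ask_model(prompt: str) -> str:
    raise NotImplementedError("replace with a real LLM API call")

def one_shot(question: str) -> str:
    # The model must emit the answer immediately: a single guess.
    return ask_model(f"{question}\nReply with only the final answer.")

def with_cot(question: str) -> str:
    # The model generates intermediate steps first; this is exactly
    # what ablating CoT takes away.
    out = ask_model(f"{question}\nThink step by step, then put the final "
                    "answer on a line starting with 'Answer:'.")
    return out.rsplit("Answer:", 1)[-1].strip()

def accuracy(solve, qa_pairs) -> float:
    # Compare accuracy(one_shot, ...) against accuracy(with_cot, ...).
    return sum(solve(q).strip() == a for q, a in qa_pairs) / len(qa_pairs)
```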
Out of curiosity, what goal are you trying to have here by doing this sort of drive-by argumentless post? You're welcome to do it, I just don't understand what effect you want to have on me.
04.12.2025 16:59
What's not reasoning about it in your mind? I go through exactly that process when, e.g., outlining and writing an essay. Generating structured, ordered text in response to a question is a huge share of all reasoning tasks that humans do today.
04.12.2025 16:58
Someone should really start a nonprofit that files amicus briefs explaining that LLMs do in fact reason these days using Chain-of-Thought, and saying that they don't is a fraud on the court
03.12.2025 22:59
It's really sad to see how there's this strain of AI denialism that gets mad when you ask them simple questions.
You can think AI is bad in lots of ways, but being mad when someone asks you a simple yes or no question about AI CoT is the worst form of Bluesky "you should go read theory" nonsense
03.12.2025 22:57
Huh? It seems like something I said struck a nerve with you; I was trying to share how you were coming across in (ironically) this imperfect text-only medium where folks can often be misread.
What direction do you think I'm turning toward that's causing this reaction from you?
03.12.2025 22:53
Answering effectively "I won't answer your question yes or no, and you should read my work rather than me answering you," is a pretty classic way for folks on Bluesky to avoid being accountable for their arguments (e.g., the defund the police ppl).
Is that the effect you're trying to have?
03.12.2025 22:48
i am a bog standard functionalist, which means that i think it is obvious that in principle something inorganic should be conscious and that saying that the LLM does not reason when it does a thing is a category error
03.12.2025 22:18
this has been going on for a while if you follow the lawsuits, plaintiffs always have a set of, uh, "experts" who assert that philosophically LLMs do not "know" things, are stochastic parrots, etc
03.12.2025 22:09
Politics, history. Europe, MENA.
PhD student posting about politics while I wait for experiments to run.
endorsements != endorsements. min flow is max cut
they/them
Developer and operator of urbanstats.org
Cambridge, Mass.
engineer living in Seattle (posts never represent employer). Transfem person (she/they), liberal, autistic. RTs not endorsements. Here to make friends & talk about Chris Nolan films. Anti-doomer. None of us are immune to the effects of social media.
Bot posting 4x daily from Sam Biddle's curated collection of US military slides from the Cold War era.
Main site: https://www.sambiddle.com/35mm-scans
Slides courtesy @sambiddle.com
Bot by @brian.gawalt.com
Like all the men of Babylon, I have been proconsul; like all, a slave; I have also known omnipotence, opprobrium, prisons.
very sane ai newsletter: verysane.ai
random bloggy bits: segyges.leaflet.pub
I'm mean because I grew up in New England. Angletonian acolyte spreading the counterintelligence gospel. "There are tigers roaming this world, and we must recognize them or perish." -Peer de Silva
https://eastbayforeveryone.org
taylor.town
Assistant Professor, University of Nottingham. Writes on 🇬🇧 intelligence.
Remains an opinionated Northerner & Mancunian. 'Vetting', 'Friends' & UK intelligence. 🇺🇦 Views my own.
https://www.nottingham.ac.uk/politics/people/daniel.lomas
Platts senior energy finance reporter in the DC area covering power and renewables. Opinions mine.
Tipitina
Tips? Message me on Signal at allison.823
https://www.spglobal.com/marketintelligence
Currently: back to dooming | Climate & Weather | US Politics | Food | Slava Ukraïni
Need a content writer? Seen in The Guardian, Rolling Stone, The Nation, NBC, TNR, TruthDig, Esquire, etc. Host the "It's Christmastown" Hallmark movies podcast with David Roth & "Quaid in Full" with Sarah Bunting. He/Him. Tampa.
https://linktr.ee/mobute
Former journalist running for Congress (IL-09) because we deserve Democrats who actually do something | katforillinois.com
The untold story of Project Galaxy: a vision of exploration forged in ambition, conflict, and innovation, shaping the future of the Federation. A fan project from the writers of @wolf359project.bsky.social and @eomproject.bsky.social
pfp: Angelos Karderinis
Dedicated here to the unfinished work. - 16
"Messy is the work, and the work can be joyful." - Shepard Fairey
Former, future.
journalist / founder @ thehandbasket.co
email: marisa@thehandbasket.co
signal: https://signal.me/#eu/VssgH88q6WQu7MtH5wF-08JdgWh4iAPWD13eXiOcXQNGdZUXijJBZInD-UtLJKFG
venmo: venmo.com/u/Marisa-Kabas
ko-fi: https://ko-fi.com/marisakabas