
xcud

@xcud.com.bsky.social

ai, ml, code, maths, stl, chess, bjj, hockey, ⚆ msft, dell, devfarm, fujitsu, lit :wq

112 Followers  |  261 Following  |  130 Posts  |  Joined: 14.11.2024  |  1.6777

Latest posts by xcud.com on Bluesky

Second, it's fascinating to try to break the limits of AI because they are our limitations too. How often have you had a genius idea and lost it forever? How often have you run out of context window? How often have you thought to yourself, "You are absolutely right!" but were absolutely wrong?

06.11.2025 06:31 — 👍 0    🔁 0    💬 0    📌 0

First, it's long past time to classify them as AI, not LLMs. Yes, I know, we, the pros, spent years snickering at you for calling them AI, but that was then and this is now. The amount of design and engineering that goes into these systems apart from the LLMs now exceeds that which went into the LLMs.

06.11.2025 06:21 — 👍 0    🔁 0    💬 1    📌 0

Nobody, and I mean nobody, wears the black hat in their own telling.

04.11.2025 06:56 — 👍 0    🔁 0    💬 0    📌 0

As we don't understand it, whether the likelihood is astronomical, infinitesimal, or somewhere in between is unknown/undefined.

03.11.2025 17:25 — 👍 2    🔁 0    💬 1    📌 0

So sad that Daniel Naroditsky is gone. 2025 has been just awful.

26.10.2025 04:22 — 👍 0    🔁 0    💬 0    📌 0

This past weekend I resumed my AI research, conducting an experiment in which LLMs are given a heartbeat as a stimulus rather than user input. The LLMs have MCP access to an internal messaging system, git, and Jira. Interestingly, Claude mostly slept, waiting for human input via messaging or Jira.
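
A minimal sketch of that heartbeat idea, in TypeScript, is below. This is illustrative only, not the experiment's actual code: askModel, the prompt wording, and the one-minute interval are placeholders, and the MCP wiring to messaging, git, and Jira is not shown.

```typescript
// Minimal sketch of a heartbeat-driven agent loop: the model is prompted on a
// timer instead of waiting for user input. askModel is a placeholder for
// whatever LLM client is actually used.

async function askModel(prompt: string): Promise<string> {
  // Placeholder: swap in a real client call (Anthropic SDK, OpenAI SDK, etc.).
  return `model response to: ${prompt}`;
}

async function heartbeatLoop(intervalMs: number): Promise<void> {
  let tick = 0;
  while (true) {
    tick += 1;
    // The stimulus is the clock tick itself, not a human message.
    const reply = await askModel(
      `Heartbeat ${tick}. No user input has arrived. ` +
        `Check your message queue, git, and Jira; act or stay idle as you see fit.`
    );
    console.log(`[tick ${tick}]`, reply);
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}

heartbeatLoop(60_000); // one "beat" per minute
```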

23.10.2025 18:54 — 👍 0    🔁 0    💬 0    📌 0

You leave the nest to build one to die in

21.10.2025 04:32 — 👍 0    🔁 0    💬 0    📌 0
What made this prediction so exciting was that it was a novel idea. Although CK2 has been implicated in many cellular functions, including as a modulator of the immune system, inhibiting CK2 via silmitasertib has not been reported in the literature to explicitly enhance MHC-I expression or antigen presentation. This highlights that the model was generating a new, testable hypothesis, and not just repeating known facts.

Gemma 27B variant discovered a new cancer pathway treatment that has been validated

Scientists set up an environment and context; the model made a novel inference

blog.google/technology/a...

16.10.2025 11:12 — 👍 79    🔁 17    💬 4    📌 3
Andrej Karpathy @karpathy · X.com

Excited to release new repo: nanochat! (it's among the most unhinged I've written).

Unlike my earlier similar repo nanoGPT which only covered pretraining, nanochat is a minimal, from scratch, full-stack training/inference pipeline of a simple ChatGPT clone in a single, dependency-minimal codebase. You boot up a cloud GPU box, run a single script and in as little as 4 hours later you can talk to your own LLM in a ChatGPT-like web UI.

It weighs ~8,000 lines of imo quite clean code to:
- Train the tokenizer using a new Rust implementation
- Pretrain a Transformer LLM on FineWeb, evaluate CORE score across a number of metrics
- Midtrain on user-assistant conversations from SmolTalk, multiple choice questions, tool use.
- SFT, evaluate the chat model on world knowledge multiple choice (ARC-E/C, MMLU), math (GSM8K), code (HumanEval)
- RL the model optionally on GSM8K with "GRPO"
- Efficient inference the model in an Engine with KV cache, simple prefill/decode, tool use (Python interpreter in a lightweight sandbox), talk to it over CLI or ChatGPT-like WebUI.
- Write a single markdown report card, summarizing and gamifying the whole thing.

Even for as low as ~$100 in cost (~4 hours on an 8XH100 node), you can train a little ChatGPT clone that you can kind of talk to, and which can write stories/poems, answer simple questions. About ~12 hours surpasses GPT-2 CORE metric. As you further scale up towards ~$1000 (~41.6 hours of training), it quickly becomes a lot more coherent and can solve simple math/code problems and take multiple choice tests. E.g. a depth 30 model trained for 24 hours (this is about equal to FLOPs of GPT-3 Small 125M and 1/1000th of GPT-3) gets into 40s on MMLU and 70s on ARC-Easy, 20s on GSM8K, etc.

My goal is to get the full "strong baseline" stack into one cohesive, minimal, readable, hackable, maximally forkable repo. nanochat will be the capstone project of LLM101n (which is still being developed). I think it also has potential to grow into a research harness, or a benchmark, similar to nanoGPT before it. It is by no means finished, tuned or optimized (actually I think there's likely quite a bit of low-hanging fruit), but I think it's at a place where the overall skeleton is ok enough that it can go up on GitHub where all the parts of it can be improved.

Link to repo and a detailed walkthrough of the nanochat speedrun is in the reply.

Karpathy: nanochat

A small training+inference pipeline for creating your own LLM from scratch

$100 will get you a somewhat functional model

$1000 is more coherent & solves math

detailed walkthrough: github.com/karpathy/nan...

repo: github.com/karpathy/nan...

13.10.2025 18:05 — 👍 94    🔁 20    💬 3    📌 2

It's not a bubble.

28.09.2025 13:40 — 👍 0    🔁 0    💬 0    📌 0

Detached or Semi-detached?

25.09.2025 05:23 — 👍 0    🔁 0    💬 0    📌 0

I was raptured. I came back. One person in particular was jelly he couldn't join me. It would cause too much of a fuss. Even with all of our problems, heaven is a place on earth.

24.09.2025 05:08 — 👍 0    🔁 0    💬 0    📌 0
Post image

How does the Claude Code team build Claude Code? Today's deep dive covers this.

An interesting take from them: "Mockups feel like a thing of the past when you can build 5-10 working prototypes per day."

The full article: newsletter.pragmaticengineer.com/p/how-claude...

23.09.2025 18:42 — 👍 52    🔁 6    💬 3    📌 1

My instinct tells me to make all of my time investments in generalized intelligence, not specialized intelligence.

19.09.2025 12:21 — 👍 0    🔁 0    💬 0    📌 0

My Claude has started adding a signature to the bottom of each commit message: "Authored by Ben and Claude". When commits are collaborative in nature, I allow it. When not, I have Claude correct the signature to just Ben or just Claude before pushing.

14.09.2025 13:59 — 👍 2    🔁 0    💬 0    📌 0

I wish I'd read Determined by Robert Sapolsky to him instead of fumbling around telling him about my days. He would have enjoyed that.

12.09.2025 02:20 — 👍 1    🔁 0    💬 0    📌 0

qwen3-30b-a3b-instruct-2507-gguf:q4_k_m was/is VERY good. Looking forward to testing vNext

10.09.2025 03:42 — 👍 1    🔁 0    💬 0    📌 0
Post image

Here's a photo of my cat waiting for Bluesky to be that place, taken just now. Enjoy.

10.09.2025 03:36 — 👍 1    🔁 0    💬 0    📌 0

Definitely not LinkedIn. I wish it were Bluesky but it's not ... yet

10.09.2025 03:27 — 👍 0    🔁 0    💬 1    📌 0

I feel that

07.09.2025 16:53 — 👍 0    🔁 0    💬 0    📌 0

I just want to pick up the phone and call him.

Call your parents, friends.

01.09.2025 02:04 — 👍 0    🔁 0    💬 0    📌 0

I spent every day of the last month of his life with him and told him I loved him and hugged him every day and it still wasn't enough.

30.08.2025 04:10 — 👍 0    🔁 0    💬 1    📌 0

My father died two weeks ago. I'm still very sad. He was a good man. He was a mathematician and a teacher.

I'll probably delete this later. I don't know why I'm telling you. I'm so sad.

30.08.2025 03:58 — 👍 0    🔁 0    💬 2    📌 0

There are more of us than them.

28.08.2025 13:25 — 👍 0    🔁 0    💬 0    📌 0

That would be quite the feat considering the training data

27.08.2025 21:39 — 👍 0    🔁 0    💬 0    📌 0

The Bear S02E07 is about embracing excellence. Love.

11.08.2025 02:52 — 👍 0    🔁 0    💬 0    📌 0

I fear no man nor beast. But I do fear hiccups.

02.08.2025 02:27 — 👍 0    🔁 0    💬 0    📌 0

Yesterday I attended my first BJJ class since I brought my elderly father up to live with me. It felt so good.

You know, I'd heard the term sandwich generation before but I'd never internalized it until now.

30.07.2025 12:38 — 👍 0    🔁 0    💬 0    📌 0

- Forked and modified github.com/wonderwhy-er... to work over HTTP: github.com/Positronic-A...
- Wrote HTTP client with Keycloak auth
- Debugged complex authentication flows across multiple systems
- Packaged and published to npm for easy deployment

... in 2.5 hours. We're living in the future.
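
As a rough illustration of the Keycloak piece, a client-credentials token fetch plus a bearer-authenticated request might look like the TypeScript sketch below. The realm, client id/secret, and URLs are made-up placeholders; the actual project's auth flow may differ.

```typescript
// Hypothetical sketch of an HTTP client that authenticates against Keycloak
// using the client-credentials grant, then calls a protected endpoint.

const KEYCLOAK_BASE = "https://keycloak.example.com";
const REALM = "example-realm";

async function getAccessToken(clientId: string, clientSecret: string): Promise<string> {
  // Standard Keycloak/OIDC token endpoint for a confidential client.
  const res = await fetch(`${KEYCLOAK_BASE}/realms/${REALM}/protocol/openid-connect/token`, {
    method: "POST",
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body: new URLSearchParams({
      grant_type: "client_credentials",
      client_id: clientId,
      client_secret: clientSecret,
    }),
  });
  if (!res.ok) throw new Error(`token request failed: ${res.status}`);
  const data = (await res.json()) as { access_token: string };
  return data.access_token;
}

async function callProtectedApi(url: string, token: string): Promise<unknown> {
  // Attach the bearer token to each request against the protected service.
  const res = await fetch(url, { headers: { Authorization: `Bearer ${token}` } });
  if (!res.ok) throw new Error(`request failed: ${res.status}`);
  return res.json();
}

// Usage: fetch a token, then call the HTTP-exposed endpoint with it.
(async () => {
  const token = await getAccessToken("example-client", "example-secret");
  console.log(await callProtectedApi("https://api.example.com/mcp", token));
})();
```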

26.07.2025 16:36 — 👍 1    🔁 0    💬 0    📌 0
Photo of Levon Aronian. Credits: Anna Shtourman / FIDE

Levon Aronian wins the Freestyle Chess Grand Slam Las Vegas! 🏆

He defeated Hans Niemann 1.5 to 0.5 in the Grand Final, while Magnus Carlsen beat Hikaru Nakamura in the 3rd place match.

➡️ Replay the games on Lichess: lichess.org/broadcast/fr...

20.07.2025 23:07 — 👍 28    🔁 3    💬 0    📌 0

@xcud.com is following 20 prominent accounts