California Grill’n (@skyvelleity.bsky.social)
Offer’s open to literally anyone on earth.
Don’t see many takers who don’t already have jobs in the field going “we barely understand how these fucking things work”
“We know exactly how LLMs work”
“Could you explain how it generates a sentence it was fine-tuned on, given the weight delta from that fine-tuning, for ten million dollars?”
“That would be so torturously difficult that $10 million wouldn’t make it worth it”
Great chat, LLM Understander
Arguing it cannot be explained is very silly indeed.
That does not make it trivial to explain.
Nor does it make it explainable in any reasonable timeframe, or even an unreasonable one with $10 million of compensation.
I am not arguing it cannot be explained; I am arguing it is so extremely difficult that you wouldn’t be willing to take the time to understand the generation of a single sentence, even for tens of millions of dollars.
06.10.2025 23:03 — 👍 1 🔁 0 💬 1 📌 0

If you can do this (write a short paper about the delta of a model after it was fine-tuned, and show some experiments you’ve run to confirm your hypothesis and understanding), AI labs will pay you $10+ million a year
06.10.2025 22:43 — 👍 1 🔁 0 💬 1 📌 0

I am trying to set the bar as low as it can possibly go: not understanding a whole model, just understanding how it embeds a pair of inputs when you have the delta of the weights from before and after that pair was embedded
06.10.2025 22:41 — 👍 0 🔁 0 💬 1 📌 0

As an example: fine-tune on a pair of sentences, then reason about the delta (or a LoRA).
Feed the model the first sentence, and it will output the second.
You should be able to reason about how changes to the model will affect the output of the second sentence in a predictable manner.
You should be able to attribute the behaviours of the output to the characteristics of the model in a way that is predictably modifiable, or at the very least can be reasoned about in a way that allows experiments to confirm hypotheses about the model and its outputs.
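Roughly the experiment I mean, as a toy sketch in PyTorch (the tiny model, dimensions, and training loop are stand-ins I made up, not any real LLM setup): snapshot the weights, fine-tune on one input/output pair, take the delta, then scale the delta and watch the memorised mapping degrade.

    import copy
    import torch
    import torch.nn as nn

    torch.manual_seed(0)  # fixed seed: the whole run is deterministic

    # Toy stand-in for an LLM; the "sentences" are just fixed vectors here.
    model = nn.Sequential(nn.Linear(16, 64), nn.Tanh(), nn.Linear(64, 16))
    before = copy.deepcopy(model.state_dict())

    x = torch.randn(1, 16)  # "first sentence"
    y = torch.randn(1, 16)  # "second sentence"

    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    for _ in range(500):  # "fine-tune" on the single pair
        opt.zero_grad()
        nn.functional.mse_loss(model(x), y).backward()
        opt.step()

    # The delta: everything the fine-tune embedded into the weights.
    delta = {k: model.state_dict()[k] - before[k] for k in before}

    # Scale the delta (a low-rank LoRA version would behave similarly)
    # and watch how faithfully the model still maps x to y.
    for scale in (1.0, 0.5, 0.0):
        model.load_state_dict({k: before[k] + scale * delta[k] for k in before})
        print(scale, nn.functional.mse_loss(model(x), y).item())

Running it shows the loss climbing back toward the pre-fine-tune baseline as the delta is scaled away; explaining why the contents of delta produce that behaviour is exactly the hard part.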
06.10.2025 22:38 — 👍 1 🔁 0 💬 1 📌 0

You’re the one that seems to think random number generation is required; it is not.
I’m asking how long you think it would take to comprehend the weight changes that occur from fine-tuning on a single sentence.
Like, one whole sentence’s worth of changes, truly, deeply understood.
Saying they can be boiled down to random numbers, when no random numbers are required, is certainly interesting.
A random number generator is not a fundamental component of an LLM
I’m not sure how deeply you understand training and inference but, to put it plainly: do you understand that random numbers are not necessary in any part of the process? They help speed up training and generalisation, but, if you wanted, you could train and run an LLM deterministically
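To make that concrete, a minimal sketch (toy numbers, not any lab’s pipeline) of RNG-free inference: fix the seed for initialisation, and decode greedily instead of sampling.

    import torch

    torch.manual_seed(0)                      # fixed initialisation, not "random"
    torch.use_deterministic_algorithms(True)  # error out on nondeterministic kernels

    # One decode step: greedy argmax over logits uses no sampling and no RNG.
    logits = torch.tensor([[0.1, 2.5, 0.3]])  # stand-in for a model's output
    next_token = logits.argmax(dim=-1)
    print(next_token)  # tensor([1]) on every run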
06.10.2025 22:30 — 👍 0 🔁 0 💬 1 📌 0

“based on determining factors whose weights were derived from those random numbers.”
Incepted, not derived.
Just because random numbers were part of the process doesn’t mean that’s all we are dealing with; it is not even necessary to use random numbers in training or inference, just optimal.
You could look at the weight changes that occur after fine-tuning on this very post, or look at a LoRA, and spend a decade trying to work out exactly how it was embedded
06.10.2025 22:25 — 👍 1 🔁 0 💬 1 📌 0

For an LLM, although every step along the way can be understood, the end result is not.
Even discretely, if told “we are now training/fine-tuning on a given sentence” and shown the weights that changed, understanding those weights is beyond us.
I would claim you are one step removed. In this example, we are talking about five lines of code, millions of years of stepping over those five lines of code, then a result.
That is very much within comprehension: it’s just five lines, and the logic is understood.
We may also disagree: I claimed that the weights embed logic; you may think they only embed patterns, and that any appearance of logic is simply pulling from the latent space within the bounds of the training data.
06.10.2025 22:15 — 👍 0 🔁 0 💬 0 📌 0

I would argue that the random number generation is a characteristic necessary for the virtual machine to function, but that it would be simplistic to claim the system can be reduced to simple random number generation.
06.10.2025 22:13 — 👍 1 🔁 0 💬 2 📌 0

I googled it, looked for what seems to be a reliable source, then copied the number.
I could write a short algorithm which would do this manually, keeping track of each prime’s index in a 64-bit int, but it might take some time to execute.
The only thing I am arguing against here is this claim:
“We know exactly how LLMs work”
We know how the virtual machine that runs and creates them works.
With limitless time, we could understand how their weights embed logic, but we currently don’t.
Do you disagree with any of these statements?
Last time I checked, 29,996,224,275,833
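That short algorithm would look something like the sketch below (assuming, from context, the question was the n-th prime for some very large n; the figure above matches the trillionth prime). Every line is trivially understood; the runtime for n = 10**12 is the problem.

    def nth_prime(n: int) -> int:
        # Walk the primes one by one, keeping the running index in a plain
        # integer (a 64-bit int in a compiled language).
        count, candidate = 0, 1
        while count < n:
            candidate += 1
            if all(candidate % d for d in range(2, int(candidate**0.5) + 1)):
                count += 1
        return candidate

    print(nth_prime(10))  # 29; nth_prime(10**12) would take... a while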
06.10.2025 22:03 — 👍 0 🔁 0 💬 1 📌 0

Mate, I have spent years banging my head against walls to find ways to get these things to reliably perform the role of a junior software engineer; you don’t need to tell me that these things are dumb as rocks.
That doesn’t mean we understand their weights. Simple as.
I have at no point claimed it cannot be explained, simply that your assertion that we already fully understand these systems is false.
They can be fully understood, the same way a modern CPU die can be, albeit with orders of magnitude more complexity than a billion-transistor die
“CAN be” is different than “is”
I agree, it can be.
I disagree that it is.
I am not claiming it is magic; I’m claiming our understanding of their internal functioning is so poor that, even when we can observe every byte of their weights, industry leaders admit they may as well be looking at a black box
06.10.2025 21:57 — 👍 2 🔁 0 💬 1 📌 0

If we want to drop credentials: I’ve built deep learning systems for NASA to use on the ISS. LLMs are not necessarily beyond comprehension, but we do not currently comprehend them.
We do not comprehend how they are so effectively able to compress the corpus of human knowledge into gigabytes
Likewise, I can write a matrix multiplier, I can write a loss function, and I can perform training runs; it doesn’t mean I understand the logic of the resulting weights, only how to execute them
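Taken literally, the human-written parts really are that small. A hand-rolled matrix multiplier and loss function, dependency-free and purely illustrative:

    def matmul(A, B):
        # Plain nested-comprehension matrix multiply, no libraries.
        return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
                for row in A]

    def mse(pred, target):
        # Mean squared error over two equal-shaped matrices.
        flat = [(p - t) ** 2
                for row_p, row_t in zip(pred, target)
                for p, t in zip(row_p, row_t)]
        return sum(flat) / len(flat)

    W = [[0.5, -0.2], [0.1, 0.9]]   # the "weights": where the opaque logic lives
    x = [[1.0, 2.0]]                # an "input"
    print(matmul(x, W))             # [[0.7, 1.6]]
    print(mse(matmul(x, W), [[0.0, 1.0]]))  # ~0.425

Every line above is fully understood; none of it says anything about what a particular W means.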
06.10.2025 21:53 — 👍 0 🔁 0 💬 1 📌 0

The logic written by humans in this case (matrix multiplication) is the equivalent of a virtual machine executing the logic embedded in the weights.
I can write a 6502 emulator; it doesn’t mean I understand all of the programs executing on it.
Why the hell do you think I’m arguing the systems are alive when I have never made that claim? I’m just addressing that it seems you think the logic for LLMs is discrete and hand-written, like, for example, an MD5 hashing function.
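The emulator analogy as a toy sketch (a made-up few-line “machine”, not a real 6502): the interpreter is fully understood at a glance, which tells you nothing about what any given program fed to it computes.

    def vm(program, x):
        # The fully-understood "virtual machine": a handful of lines.
        for op, arg in program:
            if op == "add":
                x += arg
            elif op == "mul":
                x *= arg
        return x

    # The "weights": a program the VM executes without "knowing" what it does.
    opaque_program = [("mul", 3), ("add", 4), ("mul", 2)]
    print(vm(opaque_program, 5))  # 38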
06.10.2025 21:51 — 👍 0 🔁 0 💬 2 📌 0

Again, the logic for the LinkedIn post isn’t in the algorithm running the matrix multiplication; it is in the weights. There are two parallel systems of logic, and we do not understand how that second system works internally, only how to create it
06.10.2025 21:48 — 👍 1 🔁 0 💬 1 📌 0

I’m not arguing anything about sentience; I’m just seeing a whole bunch of people acting like we program LLMs by hand, rather than them being an accidental result of extreme overfitting that produced outputs no one anticipated or even theorised
06.10.2025 21:46 — 👍 2 🔁 0 💬 1 📌 0