A discussion on the philosophy of deep learning, mechanistic interpretability and the epistemology of LLMs. @pierrebeckmann.bsky.social @matthieu-queloz.bsky.social youtu.be/1_0ttM8zp9o?...
10.01.2026 22:55 — 👍 2 🔁 1 💬 0 📌 1
@pierrebeckmann.bsky.social
DL researcher who turned to philosophy. Epistemology of AI.
One of the best discussions of AI I've seen in a while, because it's deeply informed by philosophy AND computer science. LLMs are more than just "stochastic parrots", but their understanding is still nonhuman. The discussion of concepts, understanding, and world models is especially informative.
12.01.2026 01:43 — 👍 3 🔁 2 💬 1 📌 0
Find out more in the paper:
link.springer.com/article/10.1...
This is because deep learning models learn to form putative connections about the domain they are trained on. This grasp of connections is essential for explanatory and objectual understanding.
28.11.2025 14:25 — 👍 1 🔁 0 💬 1 📌 0
I thus synthesise this literature into a set of conditions for understanding-of-the-world and apply them to SORA and to deep learning models in general.
I conclude that deep learning models are capable of such understanding!
In recent epistemology literature, philosophers work with the concepts of explanatory and objectual understanding. I've found these to be more appropriate to tackle the question of SORA's understanding than the typical semantic understanding often discussed for LLMs.
28.11.2025 14:25 — 👍 0 🔁 0 💬 1 📌 0
Does SORA "understand" the world? For example, does it understand the movement of the ship in the coffee cup below?
In my latest Synthese article I tackle this question!
We’ve recently updated our collaborative open-access book, “Neural Networks in Cognitive Science”, adding a few new authors, chapters, and lots of content.
downloads.jeffyoshimi.net/NeuralNetwor...
Curious? Read the full paper: arxiv.org/abs/2507.08017
It doubles as an accessible introduction to the field of mechanistic interpretability! (9/9)
In short, LLMs build internal structures that echo human understanding—relying on concepts, facts, and principles. But their “understanding” is fundamentally alien: sprawling, parallel, and unconcerned with simplicity.
Philosophy of AI now needs to forge conceptions that fit them. (8/9)
Strange minds.
LLMs exhibit the phenomenon of parallel mechanisms: instead of relying on a single unified process, they solve problems by deploying many distinct heuristics in parallel. This approach stands in stark contrast to the parsimony typical of human understanding. (7/9)
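To make the contrast concrete, here is a minimal Python sketch (an invented toy, not the paper's analysis) of a prediction formed by summing the votes of several parallel heuristics: knocking out any single heuristic degrades the answer gracefully instead of destroying it.

```python
# Toy sketch of "parallel mechanisms": the answer is the argmax of summed
# contributions from several independent heuristics (all invented here).
import numpy as np

N_ANSWERS = 20  # candidate answers for a toy (a + b) mod 20 task

def heuristic_exact(a, b):
    logits = np.zeros(N_ANSWERS)
    logits[(a + b) % N_ANSWERS] += 2.0          # a sharp, narrow rule
    return logits

def heuristic_magnitude(a, b):
    answers = np.arange(N_ANSWERS)
    return -0.1 * np.abs(answers - (a + b) % N_ANSWERS)   # blurry "roughly this big" cue

def heuristic_parity(a, b):
    answers = np.arange(N_ANSWERS)
    return 0.5 * (answers % 2 == (a + b) % 2)   # weak "right parity" cue

HEURISTICS = [heuristic_exact, heuristic_magnitude, heuristic_parity]

def predict(a, b, ablate=None):
    # The output is the sum of parallel contributions, optionally with one ablated.
    total = sum(h(a, b) for h in HEURISTICS if h is not ablate)
    return int(np.argmax(total))

print(predict(7, 8))                          # 15
print(predict(7, 8, ablate=heuristic_exact))  # still 15: the other heuristics compensate
```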
Level 3: Principled understanding
At this last tier, LLMs can grasp the underlying principles that connect and unify a diverse array of facts.
Research on tasks like modular addition provides cases where LLMs move beyond memorizing examples to internalizing general rules. (6/9)
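For readers who want to see the experimental logic, here is a minimal Python sketch of a modular-addition setup (hypothetical hyperparameters, and a plain MLP rather than the transformers used in the grokking studies, so it likely won't reproduce those results): the point is that accuracy on held-out pairs cannot be explained by memorization alone.

```python
# Sketch of the train/held-out logic behind modular-addition experiments
# (hypothetical hyperparameters; a plain MLP, not the original transformer setups).
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

p = 23
pairs = np.array([(a, b) for a in range(p) for b in range(p)])
labels = (pairs[:, 0] + pairs[:, 1]) % p

# One-hot encode both operands.
X = np.zeros((len(pairs), 2 * p))
X[np.arange(len(pairs)), pairs[:, 0]] = 1.0
X[np.arange(len(pairs)), p + pairs[:, 1]] = 1.0

# Hold out 30% of (a, b) pairs that the model never sees during training.
X_tr, X_te, y_tr, y_te = train_test_split(X, labels, train_size=0.7, random_state=0)

model = MLPClassifier(hidden_layer_sizes=(256,), max_iter=2000, random_state=0)
model.fit(X_tr, y_tr)

# Accuracy on unseen pairs cannot come from a lookup table of memorized examples;
# to the extent it is high, the network has internalized something like the rule.
print("held-out accuracy:", model.score(X_te, y_te))
```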
But LLMs aren’t limited to static facts—they can also track dynamic states.
OthelloGPT, a GPT-2 model trained on legal Othello moves, encodes the board state in internal representations that update as the game unfolds, as shown by linear probes. (5/9)
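A linear probe is just a linear classifier trained on a model's hidden activations. Here is a minimal Python sketch with stand-in data (not the OthelloGPT code); the planted linear structure stands in for whatever the real residual stream encodes.

```python
# Sketch of a linear probe (stand-in data, not OthelloGPT): fit a linear classifier
# from hidden activations to the state of one board square. If a linear map reads
# the state out accurately, the state is linearly encoded in the representation.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
d_model, n_examples = 128, 2000

# Stand-in for activations hooked from one layer at each move.
activations = rng.normal(size=(n_examples, d_model))
# Stand-in labels (0 = empty, 1 = mine, 2 = theirs), planted to be linearly
# decodable so the probe has something to find.
w_true = rng.normal(size=(d_model, 3))
square_state = (activations @ w_true).argmax(axis=1)

probe = LogisticRegression(max_iter=1000)
probe.fit(activations[:1500], square_state[:1500])
print("probe accuracy:", probe.score(activations[1500:], square_state[1500:]))
```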
Level 2: State-of-the-world understanding
LLMs can encode factual associations in the linear projections of their MLP layers.
For instance, a strong activation of the “Golden Gate Bridge” feature can lead to a strong activation of the “in SF” feature. (4/9)
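One common way to picture this is the key-value view of an MLP layer: an input matrix detects a feature direction and an output matrix writes an associated direction back into the residual stream. The sketch below uses hypothetical random directions, not real model weights.

```python
# Key-value sketch of an MLP layer (hypothetical directions, not real weights):
# a row of W_in acts as a key for the "Golden Gate Bridge" direction, and the
# matching row of W_out writes an "in SF" direction back into the residual stream.
import numpy as np

rng = np.random.default_rng(0)
d_model, d_mlp = 64, 256

golden_gate = rng.normal(size=d_model)
golden_gate /= np.linalg.norm(golden_gate)
in_sf = rng.normal(size=d_model)
in_sf /= np.linalg.norm(in_sf)

W_in = rng.normal(scale=0.02, size=(d_mlp, d_model))    # keys: read the residual stream
W_out = rng.normal(scale=0.02, size=(d_mlp, d_model))   # values: write back into it
W_in[0] = 4.0 * golden_gate   # neuron 0 fires on the bridge feature...
W_out[0] = 2.0 * in_sf        # ...and adds the "in SF" feature when it fires

def mlp(x):
    return np.maximum(W_in @ x, 0.0) @ W_out   # ReLU MLP, biases omitted for brevity

x_other = rng.normal(size=d_model)
x_other /= np.linalg.norm(x_other)

print("bridge input, in-SF readout:", mlp(golden_gate) @ in_sf)   # large (about 8)
print("unrelated input, in-SF readout:", mlp(x_other) @ in_sf)    # much smaller
```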
How does the model use these features?
Attention layers are key. They retrieve relevant information from earlier tokens and integrate it into the current token’s representation, making the model context-aware. (3/9)
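Here is a minimal Python sketch of causal (masked) scaled dot-product attention with toy dimensions and hypothetical weights, just to show how each position's output becomes a weighted mixture of values from itself and earlier tokens.

```python
# Causal scaled dot-product attention with toy dimensions and hypothetical weights.
import numpy as np

def causal_attention(X, W_q, W_k, W_v):
    """X: (seq_len, d_model) token representations."""
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    # Mask out later positions: a token may attend only to itself and earlier tokens.
    scores = np.where(np.tril(np.ones_like(scores)) == 1, scores, -np.inf)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V   # each position is a mixture of earlier tokens' values

rng = np.random.default_rng(0)
seq_len, d_model, d_head = 5, 32, 8
X = rng.normal(size=(seq_len, d_model))
W_q, W_k, W_v = (rng.normal(scale=0.1, size=(d_model, d_head)) for _ in range(3))
print(causal_attention(X, W_q, W_k, W_v).shape)   # (5, 8)
```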
Level 1: Conceptual understanding
Emerges when a model forms “features” as directions in latent space, allowing it to recognize and unify diverse manifestations of an entity or a property.
E.g., LLMs subsume “SF’s landmark” or “orange bridge” under a “Golden Gate Bridge” feature.
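As a toy illustration (stand-in vectors, not real model activations), a feature-as-direction can be read off with a dot product: different surface forms that carry the feature project strongly onto the same direction, while unrelated text does not.

```python
# Feature-as-direction sketch (stand-in vectors, not real activations): reading a
# hypothetical "Golden Gate Bridge" feature off with a dot product.
import numpy as np

rng = np.random.default_rng(0)
d_model = 64

golden_gate_dir = rng.normal(size=d_model)
golden_gate_dir /= np.linalg.norm(golden_gate_dir)

def fake_activation(strength):
    # Stand-in for a residual-stream vector: the feature direction plus unrelated noise.
    return strength * golden_gate_dir + rng.normal(scale=0.3, size=d_model)

phrases = {
    "SF's landmark": fake_activation(3.0),
    "orange bridge": fake_activation(2.5),
    "a cup of coffee": fake_activation(0.0),   # unrelated text: feature absent
}

for phrase, act in phrases.items():
    print(f"{phrase:>16}: feature activation = {act @ golden_gate_dir:+.2f}")
```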
New preprint: “Mechanistic Indicators of Understanding in LLMs” with @matthieu-queloz.bsky.social
Building on mechanistic interpretability, we argue that LLMs exhibit signs of understanding across three tiers: conceptual, state-of-the-world, and principled understanding. 🧵(1/9)