
Victor Morand

@victormorand.bsky.social

PhD Student @ Sorbonne Université (@mlia-isir.bsky.social) Research in information retrieval and conversational search: Towards Language models that know what they know 🧠 Homepage : victormorand.github.io

8 Followers  |  23 Following  |  6 Posts  |  Joined: 10.03.2025

Latest posts by victormorand.bsky.social on Bluesky

👀Up next
Building upon these findings, we've managed to externalize this internal mechanism, creating a general-purpose mention detector with promising results. Stay tuned! 🔜

22.10.2025 08:16 — 👍 0    🔁 0    💬 0    📌 0

I'll be presenting this work at @blackboxnlp.bsky.social in Suzhou. Happy to chat there or here if you're interested!

22.10.2025 08:16 — 👍 0    🔁 0    💬 1    📌 0

3️⃣ The Entity Lens
Our method enables reconstruction of entity mentions from any representation within LLMs, allowing us to ask: “What entity is the model thinking about right now?”
💡 When reading ‘the City of Lights iconic monument’, the model internally “thinks” of Paris and the Eiffel Tower!

22.10.2025 08:16 — 👍 0    🔁 0    💬 1    📌 0

2️⃣ LLMs develop entity-specific mechanisms.

By successfully learning "Task Vectors" that steer the model to reconstruct the mention, we uncover new evidence that LLMs form dedicated internal circuits to represent and manipulate multi-token entities.

22.10.2025 08:16 — 👍 0    🔁 0    💬 1    📌 0

1️⃣ Common entities are (almost) part of the Vocabulary.

We show that common multi-token mentions (e.g. "Eiffel Tower") can be recovered from the middle-layer hidden state of their last token alone!
Uncommon mentions aren't fully encoded this way; instead, they are retrieved from the context when needed.

22.10.2025 08:16 — 👍 0    🔁 0    💬 1    📌 0

New paper at @blackboxnlp.bsky.social @ @emnlpmeeting.bsky.social!
⚛️ Entities are the fundamental building blocks of knowledge. Although some clues emerge from mechanistic interpretability, how auto-regressive LLMs actually encode and retrieve them remains a mystery. 🧵

📄 arxiv.org/abs/2510.09421

22.10.2025 08:16 — 👍 1    🔁 1    💬 1    📌 0
