Joint work with Sebastian Weichwald, Sébastien Lachapelle, and Luigi Gresele
For more info, check the full paper 👇
arxiv.org/abs/2410.235...
🧵 Summary
A mathematical proof that, under suitable conditions, linear properties hold for either all or none of the equivalent models with the same next-token distribution.
Exciting open questions about the empirical findings remain 🤔; check Section 6 (Discussion) in the paper!
8/9
3️⃣ We prove which linear properties are shared by all, or by none, of the LLMs with the same next-token distribution.
🔥 Under mild assumptions, relational linear properties are shared!
⚠️ Parallel vectors may not be shared (they are under a diversity assumption)!
7/9
We also recast other linear properties, such as linear subspaces, probing, and steering, in terms of relational strings (Paccanaro and Hinton, 2001).
💡 They arise when the LLM can predict the next tokens of textual queries like "What is the written language?" across many context strings; see the toy sketch below!
6/9
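[Not from the thread: a minimal numpy sketch of how a linear probe falls out of next-token prediction under a softmax output layer. The token ids, dimensions, and representations are all made up for illustration.]

```python
import numpy as np

rng = np.random.default_rng(0)
d, V = 16, 100                 # toy representation dim and vocab size
U = rng.normal(size=(V, d))    # unembedding matrix: one row per output token
EN, FR = 0, 1                  # hypothetical token ids for "English" / "French"

# If the model answers "What is the written language?" by next-token
# prediction, then log p(EN|h) - log p(FR|h) = (U[EN] - U[FR]) @ h,
# so the difference of two unembedding rows is already a linear probe.
probe = U[EN] - U[FR]

def rep_after_query(is_english):
    """Made-up representation of 'context + query': noise orthogonal to the
    probe, plus a signed shift along it (standing in for a competent LLM)."""
    noise = rng.normal(size=d)
    noise -= (noise @ probe) / (probe @ probe) * probe  # drop probe component
    sign = 1.0 if is_english else -1.0
    return noise + sign * 3.0 * probe / np.linalg.norm(probe)

for is_english in (True, False):
    h = rep_after_query(is_english)
    assert ((probe @ h) > 0) == is_english  # the linear probe reads off the language
```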
2️⃣ We reformulate linear properties of LLMs in terms of textual strings, i.e., in terms of how LLMs predict next tokens.
💡 Parallel vectors arise from equal log-ratios of next-token probabilities.
E.g., the same log-ratio for "easy"/"easiest" and "strong"/"strongest" in all contexts => parallel vectors (toy sketch below).
5/9
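[Also not from the thread: a toy numpy sketch of the log-ratio claim, assuming a standard softmax next-token model; the token ids and dimensions are invented.]

```python
import numpy as np

rng = np.random.default_rng(0)
d, V = 8, 50                                   # toy representation dim and vocab size
U = rng.normal(size=(V, d))                    # unembedding matrix
easy, easiest, strong, strongest = 0, 1, 2, 3  # hypothetical token ids

# Impose the relational structure: both pairs share one difference vector.
delta = rng.normal(size=d)
U[easiest] = U[easy] + delta
U[strongest] = U[strong] + delta

# Under softmax, log p(w|h) = U[w] @ h - logsumexp(U @ h), so the partition
# function cancels in log-ratios, which become linear in h:
for _ in range(5):
    h = rng.normal(size=d)                     # a random context representation
    z = U @ h
    lp = z - z.max() - np.log(np.exp(z - z.max()).sum())  # log-softmax
    assert np.isclose(lp[easiest] - lp[easy], lp[strongest] - lp[strong])

# Conversely, the difference of the two log-ratios equals
# ((U[easiest] - U[easy]) - (U[strongest] - U[strong])) @ h; if it vanishes
# on contexts h spanning R^d, the two difference vectors must coincide,
# i.e. "easy"->"easiest" and "strong"->"strongest" are parallel (here, equal).
```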
💡 The extended linear equivalence says that two models' representations are linearly related, but only within a subspace.
‼️ Outside that subspace, representations can differ a lot (toy construction below)!
4/9
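[A hedged toy construction, mine rather than the paper's: two softmax next-token predictors with different representation dimensions and identical output distributions, linearly related inside a subspace and unconstrained outside it.]

```python
import numpy as np

rng = np.random.default_rng(1)
d1, d2, V = 4, 7, 30           # model A is lower-dimensional than model B

U1 = rng.normal(size=(V, d1))  # model A's unembedding
B = rng.normal(size=(d2, d1))  # full-column-rank map into a d1-dim subspace of R^d2
U2 = U1 @ np.linalg.pinv(B)    # model B's unembedding, chosen so U2 @ B == U1

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

h1 = rng.normal(size=d1)       # model A's representation of some context
h2 = B @ h1                    # a linearly related representation for model B

# Same next-token distribution despite different representation dimensions:
assert np.allclose(softmax(U1 @ h1), softmax(U2 @ h2))

# Outside the subspace, model B is unconstrained: rank(U2) <= d1 < d2, so U2
# has a nontrivial null space, and shifting h2 along it changes nothing.
null_dir = np.linalg.svd(U2)[2][-1]  # right singular vector with ~0 singular value
h2_wild = h2 + 10.0 * null_dir       # a very different representation ...
assert np.allclose(softmax(U2 @ h2), softmax(U2 @ h2_wild))  # ... same predictions
```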
1️⃣ We extend the results of Khemakhem et al. (2020) and Roeder et al. (2021), removing a diversity assumption.
For the first time, we relate models with different representation dimensions & find that representations of LLMs with the same distribution are related by an “extended linear equivalence”!
3/9
Contributions:
1️⃣ An identifiability result for LLMs
2️⃣ A 𝙧𝙚𝙡𝙖𝙩𝙞𝙤𝙣𝙖𝙡 reformulation of linear properties
3️⃣ A proof of which properties are 𝙘𝙤𝙫𝙖𝙧𝙞𝙖𝙣𝙩 (analogously to physics, cf. Villar et al. (2023)): they hold for all or none of the LLMs with the same next-token distribution
2/9
🧵 Why are linear properties so ubiquitous in LLM representations?
We explore this question through the lens of 𝗶𝗱𝗲𝗻𝘁𝗶𝗳𝗶𝗮𝗯𝗶𝗹𝗶𝘁𝘆:
“All or None: Identifiable Linear Properties of Next-token Predictors in Language Modeling”
Published at #AISTATS2025 🌴
1/9
@yanai.bsky.social this is very interesting!! FYI, we studied the ubiquity, rather than emergence, of linear relational properties here:
openreview.net/forum?id=XCm...
Now in Thailand to present our paper at #AISTATS2025 🇹🇭🌴
📍 Today at 3:00-6:00 pm, poster number 118!
More details here:
openreview.net/forum?id=XCm...
Only yesterday I discovered that my PhD thesis has been made publicly available 🎉
I worked for three years on "Learning concepts", trying to spot the connection between #concepts, #symbols, and #representations, and how they're used in ML today 👾💪
etd.adm.unipi.it/t/etd-012620...
Hey hey! We have an accepted paper at #AISTATS2025!!
Time to prepare for Thailand 🪷🏝️🌴
Huge thanks to my coauthors
Luigi Gresele, Sebastian Weichwald, and @seblachap.bsky.social for all the joint effort!
More details soon 👇
arxiv.org/abs/2410.235...
Don't miss the chance to learn more about our new #benchmark suite @ #NeurIPS2024
New benchmarks to test the quality of the concepts learned by all kinds of models: neural, NeSy, concept-based, and foundation models.
🤔 All models learn to solve the task, but beware: do they learn concepts??
Spoiler: 😱
📣 Does your model learn high-quality #concepts, or does it learn a #shortcut?
Test it with our #NeurIPS2024 dataset & benchmark track paper!
rsbench: A Neuro-Symbolic Benchmark Suite for Concept Quality and Reasoning Shortcuts
What's the deal with rsbench? ๐งต
The #NeurIPS experience is about to start! ✈️
Drop me a line if you want to chat about #neurosymbolic reasoning #shortcuts, human-interpretable machine #concepts, logically-consistent #LLMs, or human-in-the-➰ #XAI!
See you in Vancouver!
For those attending #NeurIPS2024: go to the UNIREPS @unireps.bsky.social workshop to learn more about representation similarity. Nice work led by @beatrixmgn.bsky.social 👏
🚨 Interpretable AI often means sacrificing accuracy. But what if we could have both? Most interpretable AI models, like Concept Bottleneck Models, force us to trade accuracy for interpretability.
But not anymore, thanks to the Concept-Based Memory Reasoner (CMR)! #NeurIPS2024 (1/7)
Asking you to please add me :)
We are the premier conference on #uncertainty in #AI and #ML since 1985 🧠
Hello, 🦋!
Follow us to reduce uncertainty!
Then I agree 😄
What is the precise definition of "feature"?
I would like to ask for some backstabs for reviewer 2 🤬
I know @looselycorrect.bsky.social well enough eheh
A symbol is a "physical token" (Harnad 1990, arxiv.org/html/cs/9906...)
Do you have answers? 😊