Francesco Ortu

@francescortu.bsky.social

NLP & Interpretability | PhD Student @ University of Trieste & Laboratory of Data Engineering of Area Science Park | Prev MPI-IS

478 Followers  |  1,042 Following  |  9 Posts  |  Joined: 19.11.2024

Latest posts by francescortu.bsky.social on Bluesky

Nice start of @neuripsconf.bsky.social!

Our work with @francescortu.bsky.social and @diegodoimo.bsky.social on the Competition of Mechanisms to understand counterfactuality in LLMs featured in the "Causality for LLMs" workshop :-)

Check out our ACL2024 paper aclanthology.org/2024.acl-long.…

10.12.2024 20:19 | 👍 9    🔁 1    💬 0    📌 0

Thanks again, @diegodoimo.bsky.social and @albecazzaniga.bsky.social, for the fantastic mentorship and support! 🙏🎉 They are also attending #NeurIPS, so feel free to reach out to them to discuss our results. I'm excited to keep pushing forward on these topics! 🚀

10.12.2024 20:10 | 👍 1    🔁 0    💬 0    📌 0

Thanks to the amazing team at LADE @areasciencepark: @lvaleriani.bsky.social @lbasile.bsky.social @AlessioAnsuini @diegodoimo.bsky.social @albecazzaniga.bsky.social 🙏

10.12.2024 20:10 | 👍 2    🔁 0    💬 1    📌 0

It was super fun to take our first step in interpreting multimodal LLMs, working closely with the brilliant @alexpietroserra.bsky.social and @EmanuelePanizon

10.12.2024 20:10 | 👍 0    🔁 0    💬 1    📌 0

✅ This shows that, starting from the mid-layers, a single token effectively summarizes all 1024 image tokens!

❌ This does not occur in models fine-tuned for visual understanding (such as Pixtral).

10.12.2024 20:10 | 👍 1    🔁 0    💬 1    📌 0

Additionally, blocking communication from this token significantly disrupts performance on standard benchmarks, while blocking image-text communication does not.

10.12.2024 20:10 | 👍 1    🔁 0    💬 1    📌 0

🎯 Key finding: In these models, the hidden representations of images and text form disjoint clusters, and the communication between modalities is mediated by the special token <end-of-image>!

10.12.2024 20:10 | 👍 1    🔁 0    💬 1    📌 0

๐ŸŒ Check out our code and data at: ritareasciencepark.github.io/Narrow-gate

10.12.2024 20:10 | 👍 0    🔁 0    💬 1    📌 0

🚨 🚨 Excited to share our latest paper, now on #arXiv!

🖼️ We studied how unified VLMs, trained to generate both text and images (e.g., Meta's Chameleon), exchange information between modalities, comparing them to standard VLMs.

📄 Paper: arxiv.org/abs/2412.06646

Deep dive: 👇

10.12.2024 20:10 | 👍 9    🔁 2    💬 1    📌 2
Screenshot of the paper.

Even as an interpretable ML researcher, I wasn't sure what to make of Mechanistic Interpretability, which seemed to come out of nowhere not too long ago.

But then I found the paper "Mechanistic?" by @nsaphra.bsky.social and @sarah-nlp.bsky.social, which clarified things.

20.11.2024 08:00 | 👍 232    🔁 28    💬 8    📌 2

Thanks for creating the starter pack! I'd love to be added as well! 😊

20.11.2024 10:41 | 👍 2    🔁 0    💬 0    📌 0