Nice start of @neuripsconf.bsky.social!
Our work with @francescortu.bsky.social and @diegodoimo.bsky.social on the Competition of Mechanisms to understand counterfactuality in LLMs is featured in the "Causality for LLMs" workshop :-)
Check out our ACL 2024 paper: aclanthology.org/2024.acl-long.…
10.12.2024 20:19
Thanks again, @diegodoimo.bsky.social and @albecazzaniga.bsky.social, for the fantastic mentorship and support! They are also attending #NeurIPS, so feel free to reach out to them to discuss our results. I’m excited to keep pushing forward on these topics!
10.12.2024 20:10
Thanks to the amazing team at LADE @areasciencepark: @lvaleriani.bsky.social @lbasile.bsky.social @AlessioAnsuini @diegodoimo.bsky.social @albecazzaniga.bsky.social
10.12.2024 20:10
It was super fun to take our first step in interpreting multimodal LLMs, working closely with the brilliant @alexpietroserra.bsky.social and @EmanuelePanizon.
10.12.2024 20:10
This shows that, starting from the mid-layers, a single token effectively summarizes all 1024 image tokens!
This does not occur in models fine-tuned for visual understanding (such as Pixtral).
10.12.2024 20:10
Additionally, blocking communication from this token significantly disrupts performance on standard benchmarks, while blocking direct image-text communication does not.
10.12.2024 20:10
Key finding: In these models, the hidden representations of images and text form disjoint clusters, and the communication between modalities is mediated by the special token <end-of-image>!
10.12.2024 20:10
Check out our code and data at: ritareasciencepark.github.io/Narrow-gate
10.12.2024 20:10
Excited to share our latest paper, now on #arXiv!
We studied how unified VLMs, trained to generate both text and images (e.g., Meta's Chameleon), exchange information between modalities, comparing them to standard VLMs.
Paper: arxiv.org/abs/2412.06646
Deep dive:
10.12.2024 20:10
Screenshot of the paper.
Even as an interpretable ML researcher, I wasn't sure what to make of Mechanistic Interpretability, which seemed to come out of nowhere not too long ago.
But then I found the paper "Mechanistic?" by @nsaphra.bsky.social and @sarah-nlp.bsky.social, which clarified things.
20.11.2024 08:00
Thanks for creating the starter pack! I'd love to be added as well!
20.11.2024 10:41
PhD student in machine learning at DTU, Copenhagen.
Especially interested in model representations.
PhD at EPFL
Ex @MetaAI, @SonyAI, @Microsoft
Egyptian 🇪🇬
Sentence processing modeling | Computational psycholinguistics | 1st year PhD student at LLF, CNRS, Université Paris Cité | Currently visiting COLT, Universitat Pompeu Fabra, Barcelona, Spain
https://ninanusb.github.io/
The largest workshop on analysing and interpreting neural networks for NLP.
BlackboxNLP will be held at EMNLP 2025 in Suzhou, China
blackboxnlp.github.io
Assistant professor of computer science at Technion
https://belinkov.com/
Carpe espresso
Associate Professor in Machine Learning
Manchester Centre for AI FUNdamentals
Department of Computer Science
The University of Manchester
Alumn UCL, DeepMind, U Alberta, PUCP
Deep Thinker.
Posts / reposts might be non-deep.
Assi. Prof @UofTCompSci. Postdoc @MPI_IS w/ @bschoelkopf. Research on (1) @CausalNLP and (2) NLP4SocialGood @NLP4SG. Mentor & mentee @ACLMentorship.
Janice M. Jenkins Collegiate Professor of Computer Science at U. Michigan, Director Michigan AI Lab, Former ACL President, AAAI Fellow, ACM Fellow. Researcher #NLProc #AI
https://web.eecs.umich.edu/~mihalcea/
PhD Student in Colt UPF
https://mahautm.github.io/
PhD Student at the Max Planck Institute for Informatics @cvml.mpi-inf.mpg.de @maxplanck.de | Explainable AI, Computer Vision, Neuroexplicit Models
Web: sukrutrao.github.io
NLP PhD Student @ukplab.bsky.social
Studying genomics, machine learning, and fruit. My code is like our genomes -- most of it is junk.
Guest Scientist IMP Vienna, Board of Directors NumFOCUS
Incoming Prof UMass Chan Medical
Previously Stanford Genetics, UW CSE.
Bioinformatics Scientist / Next Generation Sequencing, Single Cell and Spatial Biology, Next Generation Proteomics, Liquid Biopsy, SynBio, Compute Acceleration in biotech // http://albertvilella.substack.com
https://ellis-jena.eu is developing+applying #AI #ML in #earth system, #climate & #environmental research.
Partner: @uni-jena.de, https://bgc-jena.mpg.de/en, @dlr-spaceagency.bsky.social, @carlzeissstiftung.bsky.social, https://aiforgood.itu.int
Professor of English Linguistics, UCL
Here, I post on (English) language topics.
On Substack, I post on English Grammar: https://basaarts.substack.com/
#grammar #syntax #parsing
1st year PhD Student at @gronlp.bsky.social - University of Groningen
Language Acquisition - NLP
Foundational Research Lead @ Thomson Reuters | Advisor @ UK AISI | ex- Fellow @ Harvard, ex- Senior RS @ GoogleDeepMind | PhD @ UCL / Gatsby Unit 🇬🇧🇩🇪🇺🇸🇭🇰
https://jonathan-schwarz.github.io/
Postdoc @ Stanford University
https://koloskova.github.io/