@simonschrodi.bsky.social
PhD student @cvisionfreiburg.bsky.social @UniFreiburg, interested in mechanistic interpretability, robustness, AutoML & ML for climate science. https://simonschrodi.github.io/
Big thanks to our amazing co-authors: Max Argus, Volker Fischer, and @thomasbrox.bsky.social!
20.04.2025 14:24
Even better, if you're at #ICLR2025 next week:
- Poster: April 24, 10 a.m.-12:30 p.m., Hall 3 + Hall 2B (#481)
- Oral: April 24, 4:30 p.m.-4:42 p.m., Garnet 213-215 (oral session 2B)
- Or just catch us over coffee!
Curious to dive deeper?
- Paper: openreview.net/forum?id=uAF...
- Code: github.com/lmb-freiburg...
- DM me or David (he's not on Bluesky, but you can DM him on other platforms)!
But what is the modality gap good for? Interestingly, we find that it affects the model's entropy, suggesting it might not be a bug, but a feature.
20.04.2025 14:24
Our paper is packed with surprising and insightful findings about both phenomena. Most notably, we show that both effects stem from the information imbalance between the image and text modalities, and that both are reduced when that imbalance decreases.
20.04.2025 14:24
In this work, we investigate two undesired properties of CLIP-like models:
- Modality gap: a complete separation of image and text embeddings in the shared embedding space.
- Object bias: a tendency to focus on objects over other semantic aspects like attributes.
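To make the first definition concrete, here is a minimal sketch (not the paper's code) of one common way the modality gap is quantified: the distance between the centroids of the L2-normalized image and text embeddings. The data below is synthetic and purely illustrative, standing in for real CLIP embeddings.

```python
import numpy as np

def modality_gap(image_emb: np.ndarray, text_emb: np.ndarray) -> float:
    """Euclidean distance between the centroids of L2-normalized
    image and text embeddings; larger means a wider gap."""
    img = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    txt = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    return float(np.linalg.norm(img.mean(axis=0) - txt.mean(axis=0)))

# Synthetic stand-ins: two Gaussian clouds offset in opposite
# directions along one axis, mimicking the separation of the two
# modalities in the shared embedding space.
rng = np.random.default_rng(0)
d = 512
image_emb = rng.normal(size=(100, d))
image_emb[:, 0] += 5.0
text_emb = rng.normal(size=(100, d))
text_emb[:, 0] -= 5.0

print(f"modality gap: {modality_gap(image_emb, text_emb):.3f}")
```

With fully overlapping clouds the gap shrinks toward zero; the offset above produces a clearly nonzero gap, mirroring the "complete separation" described in the post.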
David Hoffmann and I will present our joint work, "Two Effects, One Trigger: On the Modality Gap, Object Bias, and Information Imbalance in Contrastive Vision-Language Models," this Thursday as an Oral (top 1.8%) at #ICLR2025!
🧵
Hello world!
26.11.2024 09:39
Hi Julian, could you please add me? I work on interpretability & data-centric ML for multi-modal models
24.11.2024 18:27
A starter pack of people working on interpretability / explainability of all kinds, using theoretical and/or empirical approaches.
Reply or DM if you want to be added, and help me reach others!
go.bsky.app/DZv6TSS