Thanks to Daniil Zverev*, @thwiedemer.bsky.social*, @bayesiankitten.bsky.social, Matthias Bethge (@bethgelab.bsky.social), and @wielandbrendel.bsky.social for making VGGSound sounder!
21.10.2025 18:08
@askoepke.bsky.social
Junior research group leader at TUM | University of Tübingen. Currently at BAIR (Berkeley). Previously at VGG (Oxford). Interested in multi-modal learning. https://akoepke.github.io/
With VGGSounder, we show that existing models don't always benefit from multimodal input and sometimes performance even degrades.
Code and data: vggsounder.github.io
VGGSounder is a new video classification benchmark for audio-visual foundation models:
We provide:
• Re-annotated VGGSound test set
• Modality-specific manual labels
• A modality confusion metric to diagnose when models misuse modalities
Paper: arxiv.org/pdf/2508.08237
Excited to present our paper VGGSounder: Audio-Visual Evaluations for Foundation Models today at #ICCV2025!
Poster Session 1 | 11:30–13:30
Poster #88
Come by if you're into audio-visual learning and want to know whether multiple modalities actually help or hurt.
Thanks to @munichcenterml.bsky.social for supporting the workshop with a best paper award (announced at 2:50pm CDT)!
11.06.2025 17:59
We have fantastic speakers, including @saining.bsky.social, @aidanematzadeh.bsky.social, @ranjaykrishna.bsky.social, Ludwig Schmidt, @lisadunlap.bsky.social, and Ishan Misra.
11.06.2025 17:57
Our #CVPR2025 workshop on Emergent Visual Abilities and Limits of Foundation Models (EVAL-FoMo) is taking place this afternoon (1–6pm) in room 210.
Workshop schedule: sites.google.com/view/eval-fo...
[Screenshot: workshop website "Emergent Visual Abilities and Limits of Foundation Models" at CVPR 2025]
Our paper submission deadline for the EVAL-FoMo workshop @cvprconference.bsky.social has been extended to March 19th!
sites.google.com/view/eval-fo...
We welcome submissions (incl. published papers) on the analysis of emerging capabilities / limits in visual foundation models. #CVPR2025
Our 2nd Workshop on Emergent Visual Abilities and Limits of Foundation Models (EVAL-FoMo) is accepting submissions. We are looking forward to talks by our amazing speakers that include @saining.bsky.social, @aidanematzadeh.bsky.social, @lisadunlap.bsky.social, and @yukimasano.bsky.social. #CVPR2025
13.02.2025 16:02
Upcoming Munich AI Lecture featuring Prof. Franca Hoffmann from California Institute of Technology and Prof. Holger Hoos from RWTH Aachen University: munichlectures.ai
December 17, 2024
16:00 CET
Senatssaal, #LMU Munich
Kicking off our TUM AI Lecture Series tomorrow with none other than Jiaming Song, CSO @LumaLabsAI.
He'll be talking about "Dream Machine: Emergent Capabilities from Video Foundation Models".
Live stream: youtu.be/oilWwsXZamA
7pm GMT+1 / 10am PST (Mon Dec 2nd)