Kolja Bauer

@koljabauer.bsky.social

ELLIS PhD Student in Generative AI @ Ommer Lab (Stable Diffusion)

257 Followers  |  333 Following  |  2 Posts  |  Joined: 22.11.2024

Posts by Kolja Bauer (@koljabauer.bsky.social)

I’m thrilled to share that I’ll present two first-authored papers at #ICCV2025 🌺 in Honolulu together with @mgui7.bsky.social ! 🏝️
(Thread 🧵👇)

18.10.2025 03:00 - 👍 5    🔁 3    💬 1    📌 1

🤔 What happens when you poke a scene, and your model has to predict how the world moves in response?

We built the Flow Poke Transformer (FPT) to model multi-modal scene dynamics from sparse interactions.

It learns to predict the 𝘥𝘪𝘴𝘵𝘳𝘪𝘣𝘶𝘵𝘪𝘰𝘯 of motion itself 🧵👇

15.10.2025 01:56 - 👍 24    🔁 8    💬 1    📌 1
Our method pipeline

🤔When combining vision-language models (VLMs) with large language models (LLMs), do VLMs benefit from additional genuine semantics or artificial augmentations of the text for downstream tasks?

🤨Interested? Check out our latest work at #AAAI25:

💻Code and 📝Paper at: github.com/CompVis/DisCLIP

🧵👇

08.01.2025 15:54 - 👍 15    🔁 8    💬 1    📌 0

In order to extract features from diffusion models, you have to noise your input and tune the noise level for each downstream task. But isn't there a better way? 🤔

Turns out there is, using our newly proposed feature extraction method CleanDIFT 🧹🚀

Check it out ⬇️

05.12.2024 07:58 - 👍 6    🔁 0    💬 0    📌 0

Hi, I recently started as an ELLIS PhD student at Björn Ommer's lab. I would be happy to be on the list as well :)

27.11.2024 14:27 - 👍 2    🔁 0    💬 1    📌 0

After many years, our lab finally has a social media presence at @compvis.bsky.social ! 🥳
Give it a follow, we have some amazing research on generative computer vision coming soon!

20.11.2024 18:31 - 👍 19    🔁 2    💬 0    📌 0