I'm thrilled to share that I'll present two first-authored papers at #ICCV2025 🌺 in Honolulu together with @mgui7.bsky.social!
(Thread 🧵👇)
🤔 What happens when you poke a scene, and your model has to predict how the world moves in response?
We built the Flow Poke Transformer (FPT) to model multi-modal scene dynamics from sparse interactions.
It learns to predict the 𝘥𝘪𝘴𝘵𝘳𝘪𝘣𝘶𝘵𝘪𝘰𝘯 of motion itself 🧵👇
Our method pipeline
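To make the idea above concrete, here is a minimal illustrative sketch of the *interface* such a model exposes: given sparse pokes (location + displacement), return a per-query distribution over 2D motion rather than a single flow vector. The nearest-poke weighting and Gaussian parameterization below are placeholder assumptions for illustration, not the actual FPT architecture:

```python
import numpy as np

def predict_motion_distribution(pokes_xy, pokes_flow, query_xy, sigma0=0.05):
    """Toy stand-in for a poke-conditioned motion model.

    pokes_xy:   (P, 2) poke locations in [0, 1]^2
    pokes_flow: (P, 2) observed displacement at each poke
    query_xy:   (Q, 2) points where we want a motion distribution
    Returns per-query Gaussian parameters (mean, std) over 2D motion;
    queries far from every poke get a broad (uncertain) distribution.
    """
    # Pairwise distances between queries and pokes: (Q, P)
    d = np.linalg.norm(query_xy[:, None, :] - pokes_xy[None, :, :], axis=-1)
    w = np.exp(-d / 0.1)                     # soft nearest-poke weights
    w = w / w.sum(axis=1, keepdims=True)
    mean = w @ pokes_flow                    # (Q, 2) expected motion
    # Uncertainty grows with distance to the closest poke
    std = sigma0 * (1.0 + d.min(axis=1, keepdims=True))  # (Q, 1)
    return mean, std

pokes_xy = np.array([[0.5, 0.5]])
pokes_flow = np.array([[0.2, 0.0]])          # one poke, pushing right
queries = np.array([[0.5, 0.5], [0.9, 0.9]])
mean, std = predict_motion_distribution(pokes_xy, pokes_flow, queries)
```

The point of the sketch: at the poked point the predicted motion is confident, while far away the distribution widens — the model outputs uncertainty over motion, not just a deterministic flow field.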
🤔 When combining vision-language models (VLMs) with large language models (LLMs), do VLMs benefit more from genuine additional semantics or from artificial augmentations of the text on downstream tasks?
🤨 Interested? Check out our latest work at #AAAI25:
💻 Code and 📄 Paper at: github.com/CompVis/DisCLIP
🧵👇
In order to extract features from diffusion models, you have to noise your input and tune the noise level for each downstream task. But isn't there a better way? 🤔
Turns out there is, using our newly proposed feature extraction method CleanDIFT 🧹
Check it out ⬇️
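For context on the "noise your input" step the post refers to: conventional diffusion feature extraction pushes the clean input through the DDPM forward process at some hand-tuned timestep t before feeding it to the U-Net and reading off intermediate activations. A minimal sketch of that noising step, assuming a standard linear beta schedule (illustrative defaults, not CleanDIFT itself, which removes the need for this tuning):

```python
import numpy as np

def noise_input(x0, t, num_steps=1000, beta_start=1e-4, beta_end=0.02, rng=None):
    """DDPM forward process: x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps.

    x0: clean input array (e.g. an image), t: integer timestep in [0, num_steps).
    The noisy x_t is what a standard diffusion feature extractor would
    feed to the denoising network before grabbing intermediate features.
    """
    rng = np.random.default_rng(rng)
    betas = np.linspace(beta_start, beta_end, num_steps)
    abar = np.cumprod(1.0 - betas)           # cumulative alpha-bar schedule
    eps = rng.standard_normal(x0.shape)      # Gaussian noise
    x_t = np.sqrt(abar[t]) * x0 + np.sqrt(1.0 - abar[t]) * eps
    return x_t

x0 = np.zeros((8, 8))
x_small = noise_input(x0, t=10, rng=0)       # early timestep: little noise
x_large = noise_input(x0, t=900, rng=0)      # late timestep: mostly noise
```

The task-dependent part is choosing t: too little noise and the input is off-distribution for the model, too much and the content being featurized is destroyed.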
Hi, I recently started as an ELLIS PhD student at Björn Ommer's lab. I would be happy to be on the list as well :)
27.11.2024 14:27
After many years, our lab finally has a social media presence at @compvis.bsky.social! 🥳
Give it a follow — we have some amazing research on generative computer vision coming soon!