Nick Stracke

@rmsnorm.bsky.social

PhD Student at Ommer Lab (Stable Diffusion) Trying to understand motion... 🌐 https://nickstracke.dev

661 Followers  |  280 Following  |  9 Posts  |  Joined: 18.11.2024  |  1.5966

Latest posts by rmsnorm.bsky.social on Bluesky

Two great works on how we can manipulate style for generative modeling by PiMa!

18.10.2025 08:37 - 👍 4    🔁 0    💬 1    📌 0

🤔 What happens when you poke a scene - and your model has to predict how the world moves in response?

We built the Flow Poke Transformer (FPT) to model multi-modal scene dynamics from sparse interactions.

It learns to predict the distribution of motion itself 🧵👇

15.10.2025 01:56 - 👍 24    🔁 8    💬 1    📌 1
Our method pipeline

🤔 When combining vision-language models (VLMs) with large language models (LLMs), do VLMs benefit from additional genuine semantics or artificial augmentations of the text for downstream tasks?

🤨 Interested? Check out our latest work at #AAAI25:

💻 Code and 📝 Paper at: github.com/CompVis/DisCLIP

🧵👇

08.01.2025 15:54 - 👍 15    🔁 8    💬 1    📌 0

And thanks for the kind words! :)

09.12.2024 11:29 - 👍 2    🔁 0    💬 0    📌 0

It was due to a compute constraint at that time. We will update it with numbers run on the complete test set once we release a new version of the paper.

09.12.2024 11:29 - 👍 2    🔁 0    💬 2    📌 0

We make code and cleaned 🧹 weights available for SD 1.5 and SD 2.1.

Have a look now!
📝 Paper: compvis.github.io/cleandift/st...
💻 Code: github.com/CompVis/clea...
🤗 Hugging Face: huggingface.co/CompVis/clea...

04.12.2024 23:31 - 👍 5    🔁 0    💬 1    📌 0

We show you can, with just 30 minutes of task-agnostic finetuning on a single GPU. 🤯

No noise. Better features. Better performance. Across many tasks.

And no timestep searching headaches! 👇

04.12.2024 23:31 - 👍 4    🔁 0    💬 1    📌 0

They need noisy images as input - and the right noise level for each task.
So we have to find the right timestep for every downstream task? 🤯

What if you could ditch all of that? 👇

04.12.2024 23:31 - 👍 4    🔁 0    💬 1    📌 0

This work was co-led by @stefanabaumann.bsky.social and @koljabauer.bsky.social.

✨ Diffusion models are amazing at learning world representations. Their features power many tasks:
• Semantic correspondence
• Depth estimation
• Semantic segmentation
… and more!

But here's the catch ⚡️👇

04.12.2024 23:31 - 👍 4    🔁 0    💬 1    📌 0

🤔 Why do we extract diffusion features from noisy images? Isn't that destroying information?

Yes, it is - but we found a way to do better. 🚀

Here's how we unlock better features, no noise, no hassle.

📝 Project Page: compvis.github.io/cleandift
💻 Code: github.com/CompVis/clea...

🧵👇

04.12.2024 23:31 - 👍 42    🔁 10    💬 2    📌 5

me right now..

20.11.2024 14:22 - 👍 48    🔁 3    💬 4    📌 0

Hi, just sharing an updated version of the PyTorch 2 Internals slides: drive.google.com/file/d/18YZV.... Content: basics, jit, dynamo, Inductor, export path and executorch. This is focused on internals so you will need a bit of C/C++. I show how you can export and run a model on a Pixel Watch too.

19.11.2024 11:05 - 👍 87    🔁 17    💬 2    📌 1
[EEML'24] Sander Dieleman - Generative modelling through iterative refinement
YouTube video by EEML Community

While we're starting up over here, I suppose it's okay to reshare some old content, right?

Here's my lecture from the EEML 2024 summer school in Novi Sad 🇷🇸, where I tried to give an intuitive introduction to diffusion models: youtu.be/9BHQvQlsVdE

Check out other lectures on their channel as well!

19.11.2024 09:57 - 👍 115    🔁 12    💬 3    📌 0

@rmsnorm is following 19 prominent accounts