Oh yeah, sorry, I should've made it more clear that I was talking in the more general case
03.10.2025 18:48
Take, for example, (zero-shot) semantic correspondence working quite well based on the activations of image diffusion models.
The model has never been trained for it, and, while it's obvious that related capabilities might be useful for denoising, I'd still consider this an emergent capability
Not in the sense of, e.g., generating new kinds of videos when the model was trained for video generation, but capabilities w.r.t. other tasks could still be considered emergent, right?
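A rough sketch of what such a zero-shot correspondence pipeline could look like; the actual feature extraction from the diffusion model is elided, and all names and shapes here are my own illustration, not from any specific paper:

```python
# Purely illustrative sketch: given intermediate activations of a diffusion model for
# two images (their extraction is elided here), find the location in image B whose
# feature is most similar to a query location in image A.
import torch
import torch.nn.functional as F

def correspond(feats_a: torch.Tensor, feats_b: torch.Tensor, query_xy: tuple):
    """feats_*: (C, H, W) activation maps; query_xy: (x, y) on feats_a's grid."""
    c, h, w = feats_b.shape
    x, y = query_xy
    q = F.normalize(feats_a[:, y, x], dim=0)         # (C,) descriptor of the query point
    kb = F.normalize(feats_b.reshape(c, -1), dim=0)   # (C, H*W) descriptors of all target locations
    sim = q @ kb                                      # cosine similarity per target location
    idx = int(sim.argmax())
    return idx % w, idx // w                          # best match as (x, y) in image B

# toy usage with random tensors standing in for real diffusion features
fa, fb = torch.randn(1280, 32, 32), torch.randn(1280, 32, 32)
print(correspond(fa, fb, (5, 17)))
```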
03.10.2025 18:43
Fair :D
18.09.2025 15:21
First time I've ever heard someone from the 3D CV community actually say this out loud! This has been bugging me for a long time
18.09.2025 14:48
Ah, makes sense :)
11.09.2025 13:18
Why are you not on a current stable version?
11.09.2025 11:59
The bugs I ran into reproduce across 2.7, 2.8 and current nightlies
11.09.2025 11:35
Welcome to the club! I've somehow managed to find two bugs with torch.compile() in the last few days 🥲
10.09.2025 23:26
"Everyone knows" what an autoencoder is… but there's an important complementary picture missing from most introductory material.
In short: we emphasize how autoencoders are implemented, but not always what they represent (and some of the implications of that representation). 🧵
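The "how it's implemented" picture usually amounts to something like the following minimal sketch (dimensions and architecture are arbitrary, purely for illustration):

```python
# The "implementation view": an encoder compresses the input to a low-dimensional code,
# a decoder reconstructs from it, and both are trained to minimize reconstruction error.
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self, dim_in=784, dim_code=32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(dim_in, 256), nn.ReLU(), nn.Linear(256, dim_code))
        self.decoder = nn.Sequential(nn.Linear(dim_code, 256), nn.ReLU(), nn.Linear(256, dim_in))

    def forward(self, x):
        z = self.encoder(x)        # the low-dimensional code
        return self.decoder(z), z  # reconstruction and code

model = Autoencoder()
x = torch.rand(64, 784)
x_hat, z = model(x)
loss = nn.functional.mse_loss(x_hat, x)  # standard reconstruction objective
loss.backward()
```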
That process really sounds like a labor of love! Penrose looks really interesting, I'll play around with it! Thanks!
31.08.2025 16:47
Any tips on how to create such nice figures?
Specifically, I can judge whether a figure is nice, but I struggle to see how to get there. That, combined with using tools like TikZ, where I have to change most settings to get something decent-looking, makes it quite hard for me to get great results
Especially in fields with alphabetical ordering
20.08.2025 11:29
I've never gotten that far, but I have gotten actually good experiment suggestions in at least 5 reviews
17.08.2025 10:37
Even as an author, I have actually experienced this from multiple reviewers by now. In the moment, it's annoying of course, but in the long term, it can make the paper better. I'd say around 10-15% of the reviews I've gotten were actually really good and constructive in a useful way
16.08.2025 17:23
One thing I forgot: sometimes, reviewers think of great experiments that significantly strengthen the paper. I had one amazing reviewer at CVPR who suggested multiple, and I'd like to think that some of the additional evals I suggest also fall into that category. We should make sure we retain that
16.08.2025 13:16
Worst of both worlds: requesting multiple substantial additional experiments and then rejecting the paper based on "insufficient experimental evaluation" after the authors somehow managed to perform all the requested experiments
16.08.2025 10:14
The question is what's better: rejecting a paper outright because of insufficient experiments, or hoping that the authors somehow manage to put together the experiments in the short timespan available
16.08.2025 10:12
Happy Birthday Kosta!
15.08.2025 15:01
Conference hotels just seem to be significantly more expensive than other reasonable alternatives, even at the discounted rates
10.08.2025 12:16
Same for us in my lab (and likely at the rest of German universities): we can book mostly whatever we want, we just have to show it's the cheapest reasonable option available (unless it's below a very low location-specific threshold for things like hotel rooms).
10.08.2025 12:15
Congrats!
06.08.2025 11:28
Did you also happen to participate in creating LLM preference annotations?
05.08.2025 06:39
As an author, I honestly prefer forum-style comments over one-page rebuttals (as long as we get some way to include figures). As a reviewer, I prefer a single page
01.08.2025 11:08
tl;dr: do importance weighting/sampling on a sequence level, not a token level.
Makes everything behave much better (see below) and makes more sense from a theoretical perspective, too.
Paper: www.arxiv.org/abs/2507.18071
I'm calling it now, GSPO will be the next big hype in LLM RL algos after GRPO.
It makes so much more sense intuitively to work on a sequence rather than on a token level when our rewards are on a sequence level.
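A minimal sketch of where the importance ratio lives in the two cases, based on my reading of the paper; the full objective (clipping, advantages, grouping) is omitted, and the names are mine:

```python
# Sketch of token-level (GRPO-style) vs. sequence-level (GSPO-style) importance ratios,
# given per-token log-probs of one sampled response under the current and old policies.
import torch

logp_new = torch.randn(128)  # log pi_theta(y_t | x, y_<t) for each token t
logp_old = torch.randn(128)  # log pi_theta_old(y_t | x, y_<t)

# token-level: one importance ratio per token, each reweighting that token's contribution
token_ratios = torch.exp(logp_new - logp_old)        # shape (T,)

# sequence-level: a single, length-normalized ratio for the whole response,
# matching the fact that the reward is also assigned per sequence
seq_ratio = torch.exp((logp_new - logp_old).mean())  # scalar

print(token_ratios.std(), seq_ratio)
```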
Absolutely 100% this. Who would want to read papers like VGG-T?
25.07.2025 09:41
Genie did this in a really cool manner: arxiv.org/abs/2402.15391
03.07.2025 16:11
I don't think the implicit assumptions are likely to be problematic, as long as the frequency range is reasonable. Keep in mind that we add an MLP afterwards that can freely learn to modulate the model with different frequencies
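For context, the pattern I'm referring to is roughly the following; this is an illustrative sketch with made-up dimensions, not any specific model's code:

```python
# A scalar (e.g. a diffusion timestep) is embedded with a fixed bank of log-spaced
# sinusoidal frequencies, and a learned MLP on top can freely recombine those features,
# so the hand-picked frequency range acts as a soft prior rather than a hard constraint.
import math
import torch
import torch.nn as nn

def sinusoidal_embedding(t: torch.Tensor, dim: int = 256, max_period: float = 10000.0):
    """t: (B,) scalars -> (B, dim) features at fixed, log-spaced frequencies."""
    half = dim // 2
    freqs = torch.exp(-math.log(max_period) * torch.arange(half) / half)  # (half,)
    args = t[:, None] * freqs[None, :]                                    # (B, half)
    return torch.cat([torch.sin(args), torch.cos(args)], dim=-1)          # (B, dim)

mlp = nn.Sequential(nn.Linear(256, 1024), nn.SiLU(), nn.Linear(1024, 1024))

t = torch.randint(0, 1000, (8,)).float()
cond = mlp(sinusoidal_embedding(t))  # (8, 1024) conditioning vector for the model
```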
25.06.2025 12:16
approaches, but I don't think I've seen this in public yet. I considered doing it a while ago, but I never found a good justification to spend the time carefully ablating something like this. It might lead to some cool interpretable insights into the model's behavior across time though (3/3)
25.06.2025 07:02