Jan Eric Lenssen

@janericlenssen.bsky.social

Senior Researcher at Max Planck Institute for Informatics | Founding Engineer at Kumo.ai

192 Followers  |  358 Following  |  13 Posts  |  Joined: 30.11.2024

Latest posts by janericlenssen.bsky.social on Bluesky

Do you need to upsample vision features (e.g. DINOv3) to higher resolutions?

👉 If yes, check out Thomas Wimmer's new work AnyUp. It works on all vision features without retraining and can upsample to any resolution. Code is available!

16.10.2025 11:51 | 👍 1    🔁 0    💬 0    📌 0
🌀 Spatial Reasoners

PS: 🌀 We recently released spatial-reasoners, a general toolkit for applying SRMs to a wide range of domains: spatialreasoners.github.io

14.07.2025 20:42 | 👍 3    🔁 0    💬 0    📌 0

We find that model hallucination can be drastically reduced by choosing the right configuration, which significantly improves performance on complex reasoning tasks such as solving visual Sudoku.

14.07.2025 20:42 | 👍 2    🔁 0    💬 1    📌 0
Spatial Reasoning with Denoising Models

Our Spatial Reasoning Models make it possible to explore the space between parallel and autoregressive diffusion models, using different methods for choosing the generation order.

Project Website: geometric-rl.mpi-inf.mpg.de/srm/

14.07.2025 20:41 | 👍 1    🔁 0    💬 1    📌 0

Can diffusion models solve visual Sudoku?

If you are at #ICML2025, come to our poster in the Wednesday morning poster session (Poster Session 3 East, Poster 3412) and find out!

@chriswewer.bsky.social

14.07.2025 20:41 | 👍 11    🔁 2    💬 1    📌 0

MEt3R quantitatively measures 3D consistency between two images via DUSt3R reconstruction and feature comparison. It does not require camera poses.

Code is available for plug-and-play use. We also provide an open-source multi-view latent diffusion model for further research!

12.06.2025 22:39 | 👍 1    🔁 0    💬 0    📌 0
MEt3R: Measuring Multi-View Consistency in Generated Images

Project page: geometric-rl.mpi-inf.mpg.de/met3r/

12.06.2025 22:38 | 👍 0    🔁 0    💬 0    📌 0

At #CVPR2025 and working on consistency in video and multi-view generative models?

Come and visit our poster on Friday afternoon, where I present MEt3R: Measuring Multi-View Consistency in Generated Images.

@mohammadasim98.bsky.social @wimmerthomas.bsky.social @mpi-inf.mpg.de @cvml.mpi-inf.mpg.de

12.06.2025 22:38 | 👍 17    🔁 1    💬 2    📌 0

We also show that good orders can be predicted by uncertainty, which is crucial for solving the Sudoku task well.

03.03.2025 14:16 | 👍 1    🔁 0    💬 0    📌 0

Spatial Reasoning Models (SRMs) are a framework to propagate belief over a set of continuous variables (e.g. image patches) with generative denoising models.

The framework makes it possible to explore the amount of (soft) sequentialization and the order of generation, both of which have a significant impact on reasoning quality.

03.03.2025 14:16 | 👍 1    🔁 0    💬 0    📌 0

Can image generators solve visual Sudoku?

Naively, no. But with sequentialization and the correct order, they can!

Check out @chriswewer.bsky.social's and Bart's SRMs for details.

Project: geometric-rl.mpi-inf.mpg.de/srm/
Paper: arxiv.org/abs/2502.21075
Code: github.com/Chrixtar/SRM

03.03.2025 14:14 | 👍 12    🔁 2    💬 2    📌 1

MEt3R measures 3D consistency between two images without camera poses via DUSt3R reconstruction and feature comparison.

Code is available for plug-and-play use. We also provide an open-source multi-view latent diffusion model for further research!

Project page: geometric-rl.mpi-inf.mpg.de/met3r/

15.01.2025 18:31 | 👍 6    🔁 0    💬 0    📌 0

Hello bluesky-world :)

Introducing MEt3R: Measuring Multi-View Consistency in Generated Images.

A lack of 3D consistency in generated images is a limitation of many current multi-view/video/world generative models. To quantitatively measure these inconsistencies, check out Mohammad Asim's new work!

15.01.2025 18:30 | 👍 24    🔁 1    💬 1    📌 2