Big thanks to @danielbarath.bsky.social, @andreasgeiger.bsky.social, and @marcpollefeys.bsky.social
for a great collaboration on this project!
ReSplat achieves state-of-the-art performance on DL3DV and RealEstate10K across different numbers of input views (2, 8, 16) and resolutions (256×256 to 540×960).
Check out our paper for detailed results: arxiv.org/abs/2510.08575
To initialize the recurrent process, we design a compact reconstruction model that operates in a 16× subsampled space, producing 16× fewer Gaussians than previous pixel-aligned models. This substantially reduces computational overhead and allows for efficient Gaussian updates.
10.10.2025 20:12
Key idea: the Gaussian splatting rendering error provides a rich feedback signal, guiding the recurrent network to learn effective Gaussian updates. This feedback naturally adapts to unseen data, enabling robust generalization across datasets and resolutions.
10.10.2025 20:12
Feed-forward Gaussian splatting is fast but limited: it only makes a single forward pass. ReSplat introduces recurrent refinement, enabling the model to iteratively improve the 3D Gaussians. ReSplat converges fast, requiring only three iterations.
10.10.2025 20:12
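The recurrent refinement described in the thread above can be illustrated with a toy loop. This is only a sketch under loud assumptions, not the ReSplat model: `render`, `rendering_error`, and `refine` are hypothetical names, the "renderer" is a fixed linear map, and the learned recurrent update network is replaced by a hand-set step on the error. It only shows the control flow: render, measure the error, update, and repeat three times.

```python
# Toy stand-in for error-driven recurrent refinement (NOT the ReSplat model).
# All names are hypothetical; the learned update network is replaced by a
# fixed step applied to the rendering error.

def render(params):
    """Hypothetical renderer: maps scene parameters to 'pixel' values."""
    return [2.0 * p for p in params]

def rendering_error(params, target):
    """Per-pixel difference between the target views and the current render."""
    return [t - r for t, r in zip(target, render(params))]

def mean_squared(values):
    """Mean of squared entries, used here as a scalar error summary."""
    return sum(v * v for v in values) / len(values)

def refine(params, target, num_iters=3, step=0.4):
    """Run num_iters updates, each driven by the current rendering error."""
    history = []
    for _ in range(num_iters):
        err = rendering_error(params, target)
        history.append(mean_squared(err))
        # Assumption: a learned network would predict this update from the error.
        params = [p + step * e for p, e in zip(params, err)]
    history.append(mean_squared(rendering_error(params, target)))
    return params, history

target = [2.0, 4.0, 6.0]
initial = [0.0, 0.0, 0.0]  # stands in for the initial reconstruction's output
refined, mse = refine(initial, target)
print(mse)
```

In this toy setting each update shrinks the error by a constant factor, so `mse` decreases at every step; the real model learns its updates rather than using a fixed step.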
Excited to share our recent work on test-time scaling for feed-forward Gaussian splatting:
we learn a recurrent model, ReSplat, that iteratively improves reconstruction quality in a feed-forward manner!
haofeixu.github.io/resplat/
Check out Frano's amazing work on multi-view 3D point tracking! Code, models, datasets, and interactive results are all available!
29.08.2025 15:20
Introducing our new paper, MDPO: Overcoming the Training-Inference Divide of Masked Diffusion Language Models.
Paper: www.scholar-inbox.com/papers/He202...
arxiv.org/pdf/2508.13148
Code: github.com/autonomousvi...
Project Page: cli212.github.io/MDPO/
Project page: chengzhag.github.io/publication/...
Code: github.com/chengzhag/Pa...
Interact with the scene in this video (best viewed in a desktop browser or the YouTube app): www.youtube.com/watch?v=9bKZ...
Wanna scale your feed-forward Gaussian Splatting model to 4K resolution? Come check out our #CVPR2025 poster PanSplat today 10:30–12:30 (June 14) at ExHall D, Poster #74!
14.06.2025 05:06
Kudos to my amazing co-authors: @songyoupeng.bsky.social @fangjinhuawang.bsky.social @hermannblum.bsky.social @danielbarath.bsky.social @andreasgeiger.bsky.social @marcpollefeys.bsky.social!
05.06.2025 12:26
Catch us on Saturday, June 14 at 5 PM, ExHall D Poster #58!
05.06.2025 12:21
Your personalized CVPR 25 @cvprconference.bsky.social conference programs are now available for you!
www.scholar-inbox.com/conference/c...
DepthSplat: Connecting Gaussian Splatting and Depth
Project page: haofeixu.github.io/depthsplat/
Code, models, data: github.com/cvg/depthsplat
Excited to present our #CVPR2025 paper DepthSplat next week!
DepthSplat is a feed-forward model that achieves high-quality Gaussian reconstruction and view synthesis in just 0.6 seconds.
Looking forward to great conversations at the conference!
Excited to share our #CVPR2025 Spotlight paper and my internship project @wayve: SimLingo.
A Vision-Language-Action (VLA) model that achieves state-of-the-art driving performance with language capabilities.
Code: github.com/RenzKa/simli...
Paper: arxiv.org/abs/2503.09594
New paper at CVPR 25!
Can meshes capture fuzzy geometry? Volumetric Surfaces uses adaptive textured shells to model hair and fur without the splatting / volume overhead. It's fast, looks great, and runs in real time even on budget phones.
autonomousvision.github.io/volsurfs/
arxiv.org/pdf/2409.02482
Introducing CaRL: Learning Scalable Planning Policies with Simple Rewards
We show how simple rewards enable scaling up PPO for planning.
CaRL outperforms all prior learning-based approaches on nuPlan Val14 and CARLA longest6 v2, using less inference compute.
arxiv.org/abs/2504.17838
Introducing DepthSplat: a framework that connects Gaussian splatting with single- and multi-view depth estimation. This enables robust depth modeling and high-quality view synthesis with state-of-the-art results on ScanNet, RealEstate10K, and DL3DV.
haofeixu.github.io/depthsplat/
Personal programs for ICLR 25 @iclr-conf.bsky.social are now available at www.scholar-inbox.com. Enjoy!
23.04.2025 21:07
Excited to introduce LoftUp!
A stronger (than ever) and lightweight feature upsampler for vision encoders that can boost performance on dense prediction tasks by 20%–100%!
Easy to plug into models like DINOv2, CLIP, SigLIP: simple design, big gains. Try it out!
github.com/andrehuang/l...
Centaur, our first foray into test-time training for end-to-end driving. No retraining needed, just plug-and-play at deployment given a trained model. Also, in theory there is almost no latency overhead, thanks to some clever use of buffers. Surprising how effective this is! arxiv.org/abs/2503.11650
17.03.2025 11:03
This week we had our winter retreat jointly with Daniel Cremers' group in Montafon, Austria. 46 talks, 100 km of slopes, and night sledding, with some occasionally lost and found. It has been fun!
16.01.2025 17:49