Thanks Zhenjun for sharing!
25.04.2025 12:21
Late to post, but excited to introduce CUT3R!
An online 3D reasoning framework for many 3D tasks directly from just RGB. For static or dynamic scenes. Video or image collections, all in one!
Project Page: cut3r.github.io
Code and Model: github.com/CUT3R/CUT3R
Can Generative Video Models Help Pose Estimation?
Yes!
We find that generative video models can hallucinate plausible intermediate frames that provide useful context for pose estimators (e.g. DUSt3R), especially for images with little to no overlap.
inter-pose.github.io
Introducing Stereo4D
A method for mining 4D from internet stereo videos. It enables large-scale, high-quality, dynamic, *metric* 3D reconstructions, with camera poses and long-term 3D motion trajectories.
We used Stereo4D to make a dataset of over 100k real-world 4D scenes.
Doppelgangers++: Improved Visual Disambiguation with Geometric 3D Features
Yuanbo Xiangli, Ruojin Cai, Hanyu Chen, Jeffrey Byrne,
@snavely.bsky.social
tl;dr: new dataset (55K pairs) + MASt3R == PROFIT
arxiv.org/abs/2412.05826