𝗣𝗟𝗔𝗡𝗔𝟯𝗥: 𝗭𝗲𝗿𝗼-𝘀𝗵𝗼𝘁 𝗠𝗲𝘁𝗿𝗶𝗰 𝗣𝗹𝗮𝗻𝗮𝗿 𝟯𝗗 𝗥𝗲𝗰𝗼𝗻𝘀𝘁𝗿𝘂𝗰𝘁𝗶𝗼𝗻 𝘃𝗶𝗮 𝗙𝗲𝗲𝗱-𝗙𝗼𝗿𝘄𝗮𝗿𝗱 𝗣𝗹𝗮𝗻𝗮𝗿 𝗦𝗽𝗹𝗮𝘁𝘁𝗶𝗻𝗴
Changkun Liu, Bin Tan, Zeran Ke ... Tristan Braud
arxiv.org/abs/2510.18714
Trending on www.scholar-inbox.com
@mariusm.bsky.social
Ph.D. Student @ University of Freiburg | Research Scientist @ Continental AI Lab
MASt3R-Fusion: Integrating Feed-Forward Visual Model with IMU, GNSS for High-Functionality SLAM
Yuxuan Zhou, Xingxing Li, Shengyu Li, Zhuohao Yan, Chunxi Xia, Shaoquan Feng
tl;dr: MASt3R-SLAM+IMU+GNSS
arxiv.org/abs/2509.20757
𝟰𝗗 𝗗𝗿𝗶𝘃𝗶𝗻𝗴 𝗦𝗰𝗲𝗻𝗲 𝗚𝗲𝗻𝗲𝗿𝗮𝘁𝗶𝗼𝗻 𝗪𝗶𝘁𝗵 𝗦𝘁𝗲𝗿𝗲𝗼 𝗙𝗼𝗿𝗰𝗶𝗻𝗴
Hao Lu, Zhuang Ma, Guangfeng Jiang ... Yingcong Chen
arxiv.org/abs/2509.20251
Trending on www.scholar-inbox.com
RaySt3R was accepted to NeurIPS! Check out the Hugging Face demo for image-to-3D in cluttered scenes: huggingface.co/spaces/bartd...
19.09.2025 17:28
𝗠𝗮𝗽𝗔𝗻𝘆𝘁𝗵𝗶𝗻𝗴: 𝗨𝗻𝗶𝘃𝗲𝗿𝘀𝗮𝗹 𝗙𝗲𝗲𝗱-𝗙𝗼𝗿𝘄𝗮𝗿𝗱 𝗠𝗲𝘁𝗿𝗶𝗰 𝟯𝗗 𝗥𝗲𝗰𝗼𝗻𝘀𝘁𝗿𝘂𝗰𝘁𝗶𝗼𝗻
Nikhil Keetha, Norman Müller, Johannes Schönberger ... Peter Kontschieder
arxiv.org/abs/2509.13414
Trending on www.scholar-inbox.com
SpatialVID: A Large-Scale Video Dataset with Spatial Annotations
Jiahao Wang, Yufeng Yuan, Rujie Zheng, Youtian Lin, Jian Gao, Lin-Zhuo Chen, Yajie Bao, Yi Zhang, Chang Zeng, Yanxi Zhou, Xiaoxiao Long, Hao Zhu, Zhaoxiang Zhang, Xun Cao, Yao Yao
tl;dr: in title
arxiv.org/abs/2509.09676
3D and 4D World Modeling: A Survey
tl;dr: in title
arxiv.org/abs/2509.07996
Scaling Transformer-Based Novel View Synthesis Models with Token Disentanglement and Synthetic Data
Nithin Gopalakrishnan Nair, Srinivas Kaza, Xuan Luo, Vishal M. Patel, Stephen Lombardi, Jungyeon Park
arxiv.org/abs/2509.06950
Looking for fully configurable 3D street scene assets and real-time rendered videos? Our latest work generates physically-grounded 3D scenes ideal for robot learning & testing.
Check out our paper + interactive demo: light.princeton.edu/lsd-3d
FastVGGT: Training-Free Acceleration of Visual Geometry Transformer
You Shen, Zhipeng Zhang, Yansong Qu, Liujuan Cao
tl;dr: token merging->VGGT without dense global attention
arxiv.org/abs/2509.02560
𝟯𝗗-𝗟𝗔𝗧𝗧𝗘: 𝗟𝗮𝘁𝗲𝗻𝘁 𝗦𝗽𝗮𝗰𝗲 𝟯𝗗 𝗘𝗱𝗶𝘁𝗶𝗻𝗴 𝗳𝗿𝗼𝗺 𝗧𝗲𝘅𝘁𝘂𝗮𝗹 𝗜𝗻𝘀𝘁𝗿𝘂𝗰𝘁𝗶𝗼𝗻𝘀
Maria Parelli, Michael Oechsle, Michael Niemeyer ... Andreas Geiger
arxiv.org/abs/2509.00269
Trending on www.scholar-inbox.com
𝗖𝗼𝗺𝗽𝗹𝗲𝘁𝗲 𝗚𝗮𝘂𝘀𝘀𝗶𝗮𝗻 𝗦𝗽𝗹𝗮𝘁𝘀 𝗳𝗿𝗼𝗺 𝗮 𝗦𝗶𝗻𝗴𝗹𝗲 𝗜𝗺𝗮𝗴𝗲 𝘄𝗶𝘁𝗵 𝗗𝗲𝗻𝗼𝗶𝘀𝗶𝗻𝗴 𝗗𝗶𝗳𝗳𝘂𝘀𝗶𝗼𝗻 𝗠𝗼𝗱𝗲𝗹𝘀
Ziwei Liao, Mohamed Sayed, Steven L. Waslander ... Michael Firman
arxiv.org/abs/2508.21542
Trending on www.scholar-inbox.com
𝗠𝗲𝘀𝗵𝗦𝗽𝗹𝗮𝘁: 𝗚𝗲𝗻𝗲𝗿𝗮𝗹𝗶𝘇𝗮𝗯𝗹𝗲 𝗦𝗽𝗮𝗿𝘀𝗲-𝗩𝗶𝗲𝘄 𝗦𝘂𝗿𝗳𝗮𝗰𝗲 𝗥𝗲𝗰𝗼𝗻𝘀𝘁𝗿𝘂𝗰𝘁𝗶𝗼𝗻 𝘃𝗶𝗮 𝗚𝗮𝘂𝘀𝘀𝗶𝗮𝗻 𝗦𝗽𝗹𝗮𝘁𝘁𝗶𝗻𝗴
Hanzhi Chang, Ruijie Zhu, Wenjie Chang ... Tianzhu Zhang
arxiv.org/abs/2508.17811
Trending on www.scholar-inbox.com
𝗟𝗦𝗗-𝟯𝗗: 𝗟𝗮𝗿𝗴𝗲-𝗦𝗰𝗮𝗹𝗲 𝟯𝗗 𝗗𝗿𝗶𝘃𝗶𝗻𝗴 𝗦𝗰𝗲𝗻𝗲 𝗚𝗲𝗻𝗲𝗿𝗮𝘁𝗶𝗼𝗻 𝘄𝗶𝘁𝗵 𝗚𝗲𝗼𝗺𝗲𝘁𝗿𝘆 𝗚𝗿𝗼𝘂𝗻𝗱𝗶𝗻𝗴
Julian Ost, Andrea Ramazzina, Amogh Joshi ... Felix Heide
arxiv.org/abs/2508.19204
Trending on www.scholar-inbox.com
𝟰𝗗𝗡𝗲𝗫: 𝗙𝗲𝗲𝗱-𝗙𝗼𝗿𝘄𝗮𝗿𝗱 𝟰𝗗 𝗚𝗲𝗻𝗲𝗿𝗮𝘁𝗶𝘃𝗲 𝗠𝗼𝗱𝗲𝗹𝗶𝗻𝗴 𝗠𝗮𝗱𝗲 𝗘𝗮𝘀𝘆
Zhaoxi Chen, Tianqi Liu, Long Zhuo ... Ziwei Liu
arxiv.org/abs/2508.13154
Trending on www.scholar-inbox.com
𝗩𝗶𝗣𝗘: 𝗩𝗶𝗱𝗲𝗼 𝗣𝗼𝘀𝗲 𝗘𝗻𝗴𝗶𝗻𝗲 𝗳𝗼𝗿 𝟯𝗗 𝗚𝗲𝗼𝗺𝗲𝘁𝗿𝗶𝗰 𝗣𝗲𝗿𝗰𝗲𝗽𝘁𝗶𝗼𝗻
Jiahui Huang, Qunjie Zhou, Hesam Rabeti ... Sanja Fidler
arxiv.org/abs/2508.10934
Trending on www.scholar-inbox.com
𝗚-𝗖𝗨𝗧𝟯𝗥: 𝗚𝘂𝗶𝗱𝗲𝗱 𝟯𝗗 𝗥𝗲𝗰𝗼𝗻𝘀𝘁𝗿𝘂𝗰𝘁𝗶𝗼𝗻 𝘄𝗶𝘁𝗵 𝗖𝗮𝗺𝗲𝗿𝗮 𝗮𝗻𝗱 𝗗𝗲𝗽𝘁𝗵 𝗣𝗿𝗶𝗼𝗿 𝗜𝗻𝘁𝗲𝗴𝗿𝗮𝘁𝗶𝗼𝗻
Ramil Khafizov, Artem Komarichev, Ruslan Rakhimov ... Evgeny Burnaev
arxiv.org/abs/2508.11379
Trending on www.scholar-inbox.com
Mem4D: Decoupling Static and Dynamic Memory for Dynamic Scene Reconstruction
Xudong Cai, Shuo Wang, Peng Wang, Yongcai Wang, Zhaoxin Fan, Wanting Li, Tianbao Zhang, Jianrong Tao, Yeying Jin, Deying Li
tl;dr: 4D cost volumes->dynamic; feature bank->static
arxiv.org/abs/2508.07908
iLRM: An Iterative Large 3D Reconstruction Model
Gyeongjin Kang, Seungtae Nam, Xiangyu Sun, @samehkhamis.bsky.social, Abdelrahman Mohamed, Eunbyung Park
arxiv.org/abs/2507.23277
𝗦𝗮𝗟𝗙: 𝗦𝗽𝗮𝗿𝘀𝗲 𝗟𝗼𝗰𝗮𝗹 𝗙𝗶𝗲𝗹𝗱𝘀 𝗳𝗼𝗿 𝗠𝘂𝗹𝘁𝗶-𝗦𝗲𝗻𝘀𝗼𝗿 𝗥𝗲𝗻𝗱𝗲𝗿𝗶𝗻𝗴 𝗶𝗻 𝗥𝗲𝗮𝗹-𝗧𝗶𝗺𝗲
Yun Chen, Matthew Haines, Jingkang Wang ... Raquel Urtasun
arxiv.org/abs/2507.18713
Trending on www.scholar-inbox.com
Dens3R: A Foundation Model for 3D Geometry Prediction
Xianze Fang, Jingnan Gao, Zhe Wang, Zhuo Chen, Xingyu Ren, Jiangjing Lyu, Qiaomu Ren, Zhonglei Yang, Xiaokang Yang, Yichao Yan, Chengfei Lyu
arxiv.org/abs/2507.16290
π³: 𝗦𝗰𝗮𝗹𝗮𝗯𝗹𝗲 𝗣𝗲𝗿𝗺𝘂𝘁𝗮𝘁𝗶𝗼𝗻-𝗘𝗾𝘂𝗶𝘃𝗮𝗿𝗶𝗮𝗻𝘁 𝗩𝗶𝘀𝘂𝗮𝗹 𝗚𝗲𝗼𝗺𝗲𝘁𝗿𝘆 𝗟𝗲𝗮𝗿𝗻𝗶𝗻𝗴
Yifan Wang, Jianjun Zhou, Haoyi Zhu ... Tong He
arxiv.org/abs/2507.13347
Trending on www.scholar-inbox.com
𝗔𝘂𝘁𝗼𝗣𝗮𝗿𝘁𝗚𝗲𝗻: 𝗔𝘂𝘁𝗼𝗴𝗿𝗲𝘀𝘀𝗶𝘃𝗲 𝟯𝗗 𝗣𝗮𝗿𝘁 𝗚𝗲𝗻𝗲𝗿𝗮𝘁𝗶𝗼𝗻 𝗮𝗻𝗱 𝗗𝗶𝘀𝗰𝗼𝘃𝗲𝗿𝘆
Minghao Chen, Jianyuan Wang, Roman Shapovalov ... Andrea Vedaldi
arxiv.org/abs/2507.13346
Trending on www.scholar-inbox.com
SpatialTrackerV2: 3D Point Tracking Made Easy
Yuxi Xiao, @jianyuanwang.bsky.social, Nan Xue, @nikkar.bsky.social, Yuri Makarov, Bingyi Kang, Xing Zhu, Hujun Bao, Yujun Shen, Xiaowei Zhou
tl;dr: DAv2+VGGT->depths & poses->iterative cross-attention-based optimizer
arxiv.org/abs/2507.12462
𝗦𝘁𝗿𝗲𝗮𝗺𝗶𝗻𝗴 𝟰𝗗 𝗩𝗶𝘀𝘂𝗮𝗹 𝗚𝗲𝗼𝗺𝗲𝘁𝗿𝘆 𝗧𝗿𝗮𝗻𝘀𝗳𝗼𝗿𝗺𝗲𝗿
Dong Zhuo, Wenzhao Zheng, Jiahe Guo ... Jiwen Lu
arxiv.org/abs/2507.11539
Trending on www.scholar-inbox.com
𝗠𝗼𝗩𝗶𝗲𝗦: 𝗠𝗼𝘁𝗶𝗼𝗻-𝗔𝘄𝗮𝗿𝗲 𝟰𝗗 𝗗𝘆𝗻𝗮𝗺𝗶𝗰 𝗩𝗶𝗲𝘄 𝗦𝘆𝗻𝘁𝗵𝗲𝘀𝗶𝘀 𝗶𝗻 𝗢𝗻𝗲 𝗦𝗲𝗰𝗼𝗻𝗱
Chenguo Lin, Yuchen Lin, Panwang Pan ... Yadong Mu
arxiv.org/abs/2507.10065
Trending on www.scholar-inbox.com
𝗚𝗲𝗼𝗺𝗲𝘁𝗿𝘆 𝗙𝗼𝗿𝗰𝗶𝗻𝗴: 𝗠𝗮𝗿𝗿𝘆𝗶𝗻𝗴 𝗩𝗶𝗱𝗲𝗼 𝗗𝗶𝗳𝗳𝘂𝘀𝗶𝗼𝗻 𝗮𝗻𝗱 𝟯𝗗 𝗥𝗲𝗽𝗿𝗲𝘀𝗲𝗻𝘁𝗮𝘁𝗶𝗼𝗻 𝗳𝗼𝗿 𝗖𝗼𝗻𝘀𝗶𝘀𝘁𝗲𝗻𝘁 𝗪𝗼𝗿𝗹𝗱 𝗠𝗼𝗱𝗲𝗹𝗶𝗻𝗴
Haoyu Wu, Diankun Wu, Tianyu He ... Jiang Bian
arxiv.org/abs/2507.07982
Trending on www.scholar-inbox.com
Scaling 4D Representations
Self-supervised learning from video does scale! In our latest work, we scaled masked auto-encoding models to 22B params, boosting performance on pose estimation, tracking & more.
Paper: arxiv.org/abs/2412.15212
Code & models: github.com/google-deepmind/representations4d
🦖 We present “Feed-Forward SceneDINO for Unsupervised Semantic Scene Completion”. #ICCV2025
🌍: visinf.github.io/scenedino/
📃: arxiv.org/abs/2507.06230
🤗: huggingface.co/spaces/jev-a...
@jev-aleks.bsky.social @fwimbauer.bsky.social @olvrhhn.bsky.social @stefanroth.bsky.social @dcremers.bsky.social
𝗚𝗲𝗼𝗺𝗲𝘁𝗿𝘆-𝗮𝘄𝗮𝗿𝗲 𝟰𝗗 𝗩𝗶𝗱𝗲𝗼 𝗚𝗲𝗻𝗲𝗿𝗮𝘁𝗶𝗼𝗻 𝗳𝗼𝗿 𝗥𝗼𝗯𝗼𝘁 𝗠𝗮𝗻𝗶𝗽𝘂𝗹𝗮𝘁𝗶𝗼𝗻
Zeyi Liu, Shuang Li, Eric Cousineau ... Shuran Song
arxiv.org/abs/2507.01099
Trending on www.scholar-inbox.com