GitHub - fanegg/Human3R: A unified model for 4D human-scene reconstruction
Code, model and 4D interactive demo now available
Page: fanegg.github.io/Human3R
Paper: arxiv.org/abs/2510.06219
Code: github.com/fanegg/Human3R
Big thanks to our awesome team!
@fanegg.bsky.social @xingyu-chen.bsky.social Yuxuan Xue @apchen.bsky.social @xiuyuliang.bsky.social Gerard Pons-Moll
08.10.2025 08:54
GT comparison shows that our feedforward method, without any iterative optimization, is not only fast but also accurate.
This is achieved by reading out humans from a 4D foundation model, #CUT3R, with our proposed human prompt tuning.
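In spirit, the readout can be a small set of learnable human prompt tokens that cross-attend to the frozen backbone's tokens, with a light head decoding body parameters. A minimal sketch follows; the module names, dimensions, and the 85-dim SMPL-style parameter layout are assumptions for illustration, not the repository's actual API.

```python
# Hedged sketch of a prompt-tuning readout over a frozen 4D backbone.
# All names/dims here are hypothetical, not Human3R's real interface.
import torch
import torch.nn as nn

class HumanPromptReadout(nn.Module):
    def __init__(self, dim=768, num_prompts=16, num_heads=8):
        super().__init__()
        # Learnable "human prompt" tokens: the only trained parameters.
        self.prompts = nn.Parameter(torch.randn(num_prompts, dim) * 0.02)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        # 72 pose + 10 shape + 3 translation = 85 SMPL-style params (assumed).
        self.smpl_head = nn.Linear(dim, 85)

    def forward(self, scene_tokens):  # (B, N, dim) frozen backbone tokens
        B = scene_tokens.shape[0]
        q = self.prompts.unsqueeze(0).expand(B, -1, -1)
        # Prompts cross-attend to frozen scene tokens to "read out" humans.
        human_tokens, _ = self.attn(q, scene_tokens, scene_tokens)
        return human_tokens, self.smpl_head(human_tokens)

# Usage: the backbone stays frozen; only the readout is trained.
tokens = torch.randn(2, 196, 768)        # stand-in for CUT3R tokens
human_tokens, smpl_params = HumanPromptReadout()(tokens)
```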
08.10.2025 08:51
Bonus: #Human3R is also a compact human tokenizer!
Our human tokens capture the ID + shape + pose + position of each human, unlocking training-free 4D tracking.
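Tracking with such tokens can be as simple as matching per-frame human tokens by feature similarity. A hedged sketch below; Hungarian matching on cosine distance is a stand-in choice, not necessarily Human3R's exact association rule.

```python
# Hedged sketch: training-free tracking by matching human tokens
# across frames with cosine similarity + Hungarian assignment.
import torch
from scipy.optimize import linear_sum_assignment

def match_tokens(prev_tokens, curr_tokens):
    """Associate humans across frames: (M, D) and (N, D) token matrices."""
    prev = torch.nn.functional.normalize(prev_tokens, dim=-1)
    curr = torch.nn.functional.normalize(curr_tokens, dim=-1)
    cost = 1.0 - prev @ curr.T               # cosine distance matrix (M, N)
    rows, cols = linear_sum_assignment(cost.numpy())
    return list(zip(rows.tolist(), cols.tolist()))

prev = torch.randn(3, 768)   # tokens for 3 humans in frame t-1
curr = torch.randn(3, 768)   # tokens for 3 humans in frame t
print(match_tokens(prev, curr))   # e.g. [(0, 2), (1, 0), (2, 1)]
```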
08.10.2025 08:50
#Human3R: Everyone Everywhere All at Once
Just input an RGB video, and we reconstruct 4D humans and the scene online, in one model and one stage.
Training this versatile model is easier than you think: it takes just one day on one GPU!
Page: fanegg.github.io/Human3R/
08.10.2025 08:49
Again, training-free is all you need.
01.10.2025 07:06
Excited to introduce LoftUp!
A stronger-than-ever, lightweight feature upsampler for vision encoders that can boost performance on dense prediction tasks by 20%–100%!
Easy to plug into models like DINOv2, CLIP, and SigLIP: simple design, big gains. Try it out!
github.com/andrehuang/l...
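To show the plug-in pattern such an upsampler follows, here is a minimal sketch; this is NOT LoftUp's actual architecture, just a bilinear-plus-image-guidance baseline with assumed dimensions.

```python
# Hedged sketch of a lightweight feature upsampler: low-res ViT features
# are upsampled to image resolution with guidance from the full-res image.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleUpsampler(nn.Module):
    def __init__(self, feat_dim=384, guide_dim=32):
        super().__init__()
        self.guide = nn.Conv2d(3, guide_dim, 3, padding=1)  # image branch
        self.fuse = nn.Conv2d(feat_dim + guide_dim, feat_dim, 3, padding=1)

    def forward(self, feats, image):
        # feats: (B, C, h, w) low-res encoder features; image: (B, 3, H, W)
        up = F.interpolate(feats, size=image.shape[-2:], mode="bilinear",
                           align_corners=False)
        return self.fuse(torch.cat([up, self.guide(image)], dim=1))

feats = torch.randn(1, 384, 16, 16)   # e.g. DINOv2-S patch features
image = torch.randn(1, 3, 224, 224)
print(SimpleUpsampler()(feats, image).shape)  # (1, 384, 224, 224)
```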
22.04.2025 07:55
I was really surprised when I saw this. DUSt3R has learned to segment objects remarkably well without supervision. This knowledge can be extracted post-hoc, enabling accurate 4D reconstruction instantly.
01.04.2025 18:45
Just "dissect" the cross-attention mechanism of #DUSt3R, making 4D reconstruction easier.
01.04.2025 15:45
#Easi3R is a simple training-free approach adapting DUSt3R for dynamic scenes.
01.04.2025 15:45
[CVPR 2025] Feat2GS: Probing Visual Foundation Models with Gaussian Splatting
Code: github.com/fanegg/Feat2GS
Video: youtu.be/4fT5lzcAJqo?...
Big thanks to the amazing team!
@fanegg.bsky.social, @xingyu-chen.bsky.social, Anpei Chen, Gerard Pons-Moll, Yuliang Xiu
#DUSt3R #MASt3R #MiDaS #DINOv2 #DINO #SAM #CLIP #RADIO #MAE #StableDiffusion #Zero123
31.03.2025 16:11
Our 3D-probing findings lead to a simple yet effective solution: just combine features from different visual foundation models, which outperforms prior works.
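A minimal sketch of that combination step, with illustrative backbones and dimensions rather than Feat2GS's exact configuration: resize each frozen VFM's feature map to a common resolution and concatenate channel-wise.

```python
# Hedged sketch: fuse per-pixel features from multiple frozen backbones.
import torch
import torch.nn.functional as F

def combine_features(feature_maps, out_hw):
    """feature_maps: list of (B, C_i, h_i, w_i) tensors from different VFMs."""
    resized = [F.interpolate(f, size=out_hw, mode="bilinear",
                             align_corners=False) for f in feature_maps]
    return torch.cat(resized, dim=1)          # (B, sum(C_i), H, W)

dust3r = torch.randn(1, 768, 16, 16)   # stand-in DUSt3R features
dino = torch.randn(1, 384, 16, 16)     # stand-in DINOv2 features
fused = combine_features([dust3r, dino], (64, 64))
print(fused.shape)                     # torch.Size([1, 1152, 64, 64])
```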
Apply #Feat2GS to sparse & casual captures:
Online Demo: huggingface.co/spaces/endle...
31.03.2025 16:08
With #Feat2GS we evaluated more than 10 visual foundation models (DUSt3R, DINO, MAE, SAM, CLIP, MiDaS, etc.) in terms of geometry and texture; see the paper for the comparison.
Paper: arxiv.org/abs/2412.09606
Try it NOW: fanegg.github.io/Feat2GS/#chart
31.03.2025 16:07
How much 3D do visual foundation models (VFMs) know?
Previous work requires 3D data for probing, which is expensive to collect!
#Feat2GS @cvprconference.bsky.social 2025: our idea is to read out 3D Gaussians from VFM features, and thus probe 3D via novel view synthesis.
Page: fanegg.github.io/Feat2GS
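Conceptually, the readout can be as small as a linear head mapping per-pixel VFM features to Gaussian parameters, trained only through a novel-view rendering loss. A hedged sketch; the 14-dim parameter layout and activations below are assumptions, not the paper's exact head.

```python
# Hedged sketch: read out per-pixel 3D Gaussians from frozen VFM features.
import torch
import torch.nn as nn

class GaussianReadout(nn.Module):
    def __init__(self, feat_dim=768):
        super().__init__()
        # 3 xyz + 3 scale + 4 rotation (quat) + 1 opacity + 3 color = 14
        self.head = nn.Linear(feat_dim, 14)

    def forward(self, feats):   # feats: (B, N, feat_dim) per-pixel features
        p = self.head(feats)
        xyz, scale, rot, opacity, rgb = p.split([3, 3, 4, 1, 3], dim=-1)
        return {
            "xyz": xyz,
            "scale": scale.exp(),                        # positive scales
            "rot": nn.functional.normalize(rot, dim=-1), # unit quaternion
            "opacity": opacity.sigmoid(),
            "rgb": rgb.sigmoid(),
        }

feats = torch.randn(1, 196, 768)     # stand-in for VFM patch features
gaussians = GaussianReadout()(feats)
print(gaussians["xyz"].shape)        # torch.Size([1, 196, 3])
```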
31.03.2025 16:06