significantly outperforms generic visual pretraining (e.g., DINO-style features) in terms of generalization.
🌐 https://simongiebenhain.github.io/Pix2NPHM
🎥 https://youtu.be/MgpEJC5p1Ts
Great work by Simon Giebenhain, Tobias Kirschstein, Liam Schoneveld, Davide Davoli, Zhe Chen.
23.12.2025 16:30
(1) large-scale registration of existing 3D head datasets, and
(2) self-supervised training on vast in-the-wild 2D video datasets using pseudo ground-truth surface normals.
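A minimal sketch of what supervision against pseudo ground-truth normals could look like (NumPy, with hypothetical array shapes and a simple cosine penalty; the paper's actual loss formulation may differ):

```python
import numpy as np

def normal_consistency_loss(pred_normals, pseudo_gt_normals, mask):
    """Self-supervised loss between normals rendered from the predicted
    reconstruction and pseudo ground-truth normals estimated from
    in-the-wild frames. Shapes are illustrative: H x W x 3 unit vectors,
    with a boolean mask marking valid foreground pixels."""
    # Cosine-based penalty: 0 when normals agree, 2 when opposite.
    dots = np.sum(pred_normals * pseudo_gt_normals, axis=-1)
    return float(np.mean((1.0 - dots)[mask]))
```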
Finally, we show that geometry-aware pretraining on pixel-aligned reconstruction tasks
23.12.2025 16:30
Pix2NPHM obtains fast and reliable NPHM reconstructions on real-world data. Inference-time optimization against surface normals and canonical point maps can further increase fidelity.
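A toy sketch of such inference-time refinement, with a linear map standing in for the differentiable renderer of normals / canonical point maps (all names and hyperparameters here are illustrative, not the paper's actual setup):

```python
import numpy as np

def refine_latent(z0, A, target, lr=0.05, steps=200):
    """Test-time optimization sketch: starting from the feed-forward
    prediction z0, run gradient descent on a reconstruction objective
    ||A z - target||^2. In the real system, A would be a differentiable
    renderer and target the observed surface normals / point maps."""
    z = z0.copy()
    for _ in range(steps):
        grad = 2.0 * A.T @ (A @ z - target)  # analytic least-squares gradient
        z -= lr * grad
    return z
```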
Key to successful and generalizable training of our ViT-based network are:
23.12.2025 16:30
Face tracking & 3D reconstruction are often limited by the representational capacity of PCA-based face models. By lifting NPHMs to a first-class reconstruction primitive, we enable more accurate geometry, richer expressions, and finer animation control.
23.12.2025 16:30
🌐 https://peter-kocsis.github.io/IntrinsicImageFusion
🎥 https://youtu.be/-Vs3tR1Xl7k
Great work by Peter Kocsis and Lukas Hollein!
17.12.2025 15:28
3) optimize low-dimensional parameters for physically-grounded reconstructions.
The results are relightable PBR textures for 3D scenes: check out the result on a real-world 3D scan from the ScanNet++ dataset!
17.12.2025 15:28
📢 Intrinsic Image Fusion for Multi-View 3D Material Reconstruction 📢
We combine generative material priors with inverse path tracing: 1) define a parametric texture space, 2) fuse monocular predictions across views into consistent textures
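As an illustration of the fusion step, a confidence-weighted average of per-view predictions already projected into a shared texture space could look like this (a deliberate simplification; the paper's actual fusion mechanism may differ):

```python
import numpy as np

def fuse_views(per_view_maps, per_view_weights):
    """Fuse per-view monocular material predictions (each already
    projected into a shared UV texture space) into one consistent
    texture via confidence-weighted averaging. Shapes are illustrative:
    per_view_maps is V x H x W x C, per_view_weights is V x H x W
    (e.g. visibility / viewing-angle confidence)."""
    w = per_view_weights[..., None]
    # Weighted mean per texel, guarding against zero total weight.
    return (per_view_maps * w).sum(0) / np.clip(w.sum(0), 1e-8, None)
```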
17.12.2025 15:28
TUM AI Lecture Series - Building generative world models: progress and challenges (Ruiqi Gao)
Abstract: Equipping AI models with the ability to imagine, reason, and act in the physical world is a crucial step toward achieving Artificial General Intelligence (AGI). Generative world models, whic...
Today in our TUM AI Lecture Series we'll have the amazing Ruiqi Gao from Google DeepMind.
She'll talk about "Building generative world models: progress and challenges".
Live stream: www.youtube.com/live/CkOSMqw...
7pm GMT+1 / 10am PST (Tue Dec 16th).
16.12.2025 10:49
PercHead: Perceptual Head Model for Single-Image 3D Head Reconstruction & Editing
We also provide an interactive GUI to enable the exploration of our editing pipeline.
🌐 antoniooroz.github.io/PercHead/
📽️ youtu.be/4hFybgTk4kE
Great work by Antonio Oroz and Tobias Kirschstein
05.11.2025 11:37
By swapping the encoder, we can transform the model into a disentangled 3D editing pipeline: geometry is controlled through (potentially hand-drawn) segmentation maps, while style is conditioned via an image or text prompt.
05.11.2025 11:37
Our trained reconstruction model is able to generate 3D-consistent heads from a single input image. Even with challenging side-view inputs, the model robustly infers missing regions for a coherent, high-fidelity output.
In addition, our architecture seamlessly adapts to downstream tasks:
05.11.2025 11:37
At its core is a generalized 3D head decoder trained with perceptual supervision from DINOv2 and SAM 2.1. We find that our new perceptual loss formulation improves reconstruction fidelity compared to commonly used methods such as LPIPS.
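A sketch of a feature-space perceptual loss of this flavor, assuming the DINOv2 / SAM features have already been extracted into plain arrays (an assumption for illustration; this is not the paper's exact formulation):

```python
import numpy as np

def perceptual_loss(feats_pred, feats_gt):
    """Compare deep features of the rendered and ground-truth images in
    feature space: normalize each feature vector, then penalize cosine
    dissimilarity. Inputs are N x D arrays of precomputed features."""
    def _unit(f):
        return f / np.clip(np.linalg.norm(f, axis=-1, keepdims=True), 1e-8, None)
    # 0 for identical features, up to 2 for opposite ones.
    return float(np.mean(1.0 - np.sum(_unit(feats_pred) * _unit(feats_gt), axis=-1)))
```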
05.11.2025 11:37
📢📢 PercHead: Perceptual Head Model for Single-Image 3D Head Reconstruction & Editing 📢📢
PercHead reconstructs realistic 3D heads from a single image and enables disentangled 3D editing via geometric controls and style inputs from images or text.
05.11.2025 11:37
#ICCV last week was incredible: catching up with so many people, chatting about research, and, most importantly, having lots of fun.
Still hard to fathom this privilege as a researcher: getting to travel to such amazing places and be part of this brilliant community. Thanks!
29.10.2025 12:25
The hot topic at #ICCV2025 was World Models.
They come in different flavors, including (interactive) video models, neural simulators, and reconstruction models, but the overarching goal is clear: generative AI that predicts and simulates how the real world works.
26.10.2025 15:59
Hawaii on the same scale as the United Kingdom.
24.10.2025 08:33
Generate ergo sum - I generate, therefore I am.
16.10.2025 12:33
On the bright side, tooling for training has dramatically improved since then. Deep learning frameworks (PyTorch et al.) and scheduling systems such as SLURM or Kubernetes have become the backbone of modern AI.
12.10.2025 15:46
Given the humongous compute demands of recent generative frontier AI models (LLMs, image and video models, etc.), where compute is measured in gigawatts, these challenges seem quite amusing.
12.10.2025 15:46
The required compute was typically a couple of GPUs on a single desktop machine, trained over several days; e.g., AlexNet was trained on two GTX 580 3GB GPUs for 5-6 days.
12.10.2025 15:46
In the 'early days' of modern deep learning (2012-2015), when ConvNets such as AlexNet or VGG came out, it was considered almost impractical to train an ImageNet classifier from scratch.
12.10.2025 15:46
All six of our submissions were accepted to #NeurIPS2025 🎉🥳
Awesome works about Gaussian Splatting Primitives, Lighting Estimation, Texturing, and much more GenAI :)
Great work by Peter Kocsis, Yujin Chen, Zhening Huang, Jiapeng Tang, Nicolas von Lützow, Jonathan Schmidt 🔥🔥🔥
18.09.2025 16:15
We generate multiple videos along short, pre-defined trajectories that explore the scene in depth. Our scene memory conditions each video on the most relevant prior views while avoiding collisions.
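One way such a scene memory could retrieve relevant prior views is by camera-pose proximity; a minimal sketch (distance-based relevance is an assumption here, not necessarily the paper's actual criterion):

```python
import numpy as np

def select_memory_views(current_pos, prior_positions, k=3):
    """Return indices of the k previously generated views whose camera
    positions are closest to the current trajectory point, to condition
    the next video chunk on. current_pos is a 3-vector, prior_positions
    an N x 3 array of earlier camera positions."""
    dists = np.linalg.norm(prior_positions - current_pos, axis=-1)
    return np.argsort(dists)[:k]  # closest first
```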
Great work by Manuel Schneider & @LukasHollein
17.09.2025 12:08
Can we use video diffusion to generate 3D scenes?
WorldExplorer (#SIGGRAPHAsia25) creates fully-navigable scenes via autoregressive video generation.
Text input -> 3DGS scene output & interactive rendering!
🌐 http://mschneider456.github.io/world-explorer/
📽️ https://youtu.be/N6NJsNyiv6I
17.09.2025 12:08
ScaffoldAvatar: High-Fidelity Gaussian Avatars with Patch Expressions
We further propose a color-based densification and progressive training scheme for improved quality and faster convergence.
shivangi-aneja.github.io/projects/sca...
youtu.be/VyWkgsGdbkk
Great work by Shivangi Aneja, Sebastian Weiss, Irene Baeza Rojo, Prashanth Chandran, Gaspard Zoss, Derek Bradley
05.08.2025 12:30
We operate on patch-based local expression features and increase representational capacity by dynamically synthesizing 3D Gaussians with tiny scaffold MLPs conditioned on localized expressions.
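A toy version of such a tiny scaffold MLP, mapping a localized expression feature to parameters of dynamically synthesized Gaussians (layer sizes and output packing are illustrative, not the actual architecture):

```python
import numpy as np

def scaffold_mlp(expr_feat, W1, b1, W2, b2):
    """Map a localized patch expression feature to Gaussian parameters.
    The 7-dim output packs a position offset (3), log-scales (3), and an
    opacity logit (1); this packing is a hypothetical simplification."""
    h = np.maximum(W1 @ expr_feat + b1, 0.0)      # ReLU hidden layer
    out = W2 @ h + b2
    offset, log_scale, opacity_logit = out[:3], out[3:6], out[6]
    opacity = 1.0 / (1.0 + np.exp(-opacity_logit))  # sigmoid to [0, 1]
    return offset, np.exp(log_scale), opacity
```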
05.08.2025 12:30
ScaffoldAvatar: High-Fidelity Gaussian Avatars with Patch Expressions (#SIGGRAPH)
We reconstruct ultra-high fidelity photorealistic 3D avatars capable of generating realistic and high-quality animations including freckles and other fine facial details.
shivangi-aneja.github.io/projects/sca...
05.08.2025 12:30