Happy to find that I've been selected as an Outstanding Reviewer for CVPR 2025!
11.05.2025 12:44
@haiwen-huang.bsky.social
PhD student in AI at University of Tuebingen. Dreaming for a better world. https://andrehuang.github.io/
New paper at CVPR 25!
Can meshes capture fuzzy geometry? Volumetric Surfaces uses adaptive textured shells to model hair and fur without the splatting / volume overhead. It's fast, looks great, and runs in real time even on budget phones.
autonomousvision.github.io/volsurfs/
arxiv.org/pdf/2409.02482
⏰ Heads up! The deadline for two #CVPR2025 Autonomous Grand Challenge tracks is May 10th, 2025:
1️⃣ NAVSIM v2 Challenge: huggingface.co/spaces/AGC20...
2️⃣ World Model Challenge by 1X: huggingface.co/spaces/1x-te...
Introducing CaRL: Learning Scalable Planning Policies with Simple Rewards
We show how simple rewards enable scaling up PPO for planning.
CaRL outperforms all prior learning-based approaches on nuPlan Val14 and CARLA longest6 v2, using less inference compute.
arxiv.org/abs/2504.17838
Sometimes you choose aesthetics over aligned maximum at all axes
27.04.2025 03:01
LoftUp: Learning a Coordinate-Based Feature Upsampler for Vision Foundation Models. We achieve SotA upsampling results for DINOv2. Paper and code:
andrehuang.github.io/loftup-site/
Sharing another video showing how LoftUp significantly improves DINOv2 features! Works like a charm!
Try it out:
Code: github.com/andrehuang/l...
Paper: arxiv.org/abs/2504.14032
Excited to introduce LoftUp!
A stronger (than ever) and lightweight feature upsampler for vision encoders that can boost performance on dense prediction tasks by 20%–100%!
Easy to plug into models like DINOv2, CLIP, SigLIP: simple design, big gains. Try it out!
github.com/andrehuang/l...
How much 3D do visual foundation models (VFMs) know?
Previous work requires 3D data for probing, which is expensive to collect!
#Feat2GS @cvprconference.bsky.social 2025 - our idea is to read out 3D Gaussians from VFM features, and thus probe 3D with novel view synthesis.
Page: fanegg.github.io/Feat2GS
🦣 Easi3R: 4D Reconstruction Without Training!
Limited 4D datasets? Take it easy.
#Easi3R adapts #DUSt3R for 4D reconstruction by disentangling and repurposing its attention maps, making 4D reconstruction easier than ever!
Page: easi3r.github.io
Centaur, our first foray into test-time training for end-to-end driving. No retraining needed, just plug-and-play at deployment given a trained model. Also, theoretically nearly no overhead in latency with some clever use of buffers. Surprising how effective this is! arxiv.org/abs/2503.11650
17.03.2025 11:03
Names matter! We show that better class names in open-vocabulary segmentation benchmarks greatly improve dataset quality and boost model performance. RENOVATE your dataset labels with our automatic framework! #AI #ComputerVision #NeurIPS24
andrehuang.github.io/renovate/
Synchronization is ubiquitous in nature and a key mechanism for information processing in the brain. We introduce AKOrN as a dynamical alternative to threshold units, which can be combined with MLPs, CNNs or Transformers. ICLR'25 Oral. Project page: takerum.github.io/akorn_projec...
12.02.2025 14:07
This week we had our winter retreat jointly with Daniel Cremers' group in Montafon, Austria. 46 talks, 100 km of slopes, and night sledding, with some people occasionally lost and found. It has been fun!
16.01.2025 17:49