Alexandre Morgand, PhD

@alexmrgd.bsky.social

Computer Vision Research Scientist at @Simulon , music lover, fond of scientific/musical/geeky/useless stuff

45 Followers  |  33 Following  |  137 Posts  |  Joined: 17.10.2024

Posts by Alexandre Morgand, PhD (@alexmrgd.bsky.social)

This is how 3DGS is getting mainstream

16.02.2026 17:11 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0
No Pose at All: Self-Supervised Pose-Free 3D Gaussian Splatting from Sparse Views

Ranran Huang, Krystian Mikolajczyk from Imperial College London

Project page: ranrhuang.github.io/spfsplat/
Paper: arxiv.org/abs/2508.01171
Source code (coming soon): github.com/ranrhuang/SP...

07.08.2025 15:49 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0
"No Pose at All: Self-Supervised Pose-Free 3DGS from Sparse Views"
TL;DR: 3DGS with no poses during training or inference; shared feature-extraction backbone; simultaneous prediction of 3D Gaussian primitives and camera poses in a canonical space from unposed images, in a single feed-forward step.

07.08.2025 15:49 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
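For context, "canonical space" here usually means anchoring everything to the first camera so that no world frame (and hence no ground-truth pose) is needed. A minimal numpy sketch of that canonicalization, with helper names of my own invention:

```python
import numpy as np

def make_pose(R, t):
    """Assemble a 4x4 camera-to-world pose from rotation R and translation t."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

def to_canonical(poses):
    """Anchor all poses to camera 0: relative geometry is preserved while
    the arbitrary world placement drops out."""
    T0_inv = np.linalg.inv(poses[0])
    return [T0_inv @ T for T in poses]

# Two arbitrary world-frame poses.
Ta = make_pose(np.eye(3), [1.0, 2.0, 3.0])
Rz = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
Tb = make_pose(Rz, [4.0, 5.0, 6.0])

canon = to_canonical([Ta, Tb])
# Camera 0 becomes the identity; the relative transform between the
# two cameras is unchanged by canonicalization.
assert np.allclose(canon[0], np.eye(4))
assert np.allclose(canon[1], np.linalg.inv(Ta) @ Tb)
```

This is just the coordinate convention; the paper's contribution is predicting such poses (plus the Gaussians) feed-forward without supervision.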
Any-to-Bokeh: One-Step Video Bokeh via Multi-Plane Image Guided Diffusion. A novel one-step video bokeh framework that converts arbitrary input videos into temporally coherent, depth-aware bokeh effects.

Yang Yang 1,2, Siming Zheng 2, Jinwei Chen 2, Boxi Wu 1, Xiaofei He 1, Deng Cai 1, Bo Li 2, Peng-Tao Jiang 2

1 Zhejiang University
2 VIVO MOBILE Communication Co., Ltd

Project page: vivocameraresearch.github.io/any2bokeh/
Paper: arxiv.org/abs/2505.21593
Source code: github.com/vivoCameraRe...

13.06.2025 13:44 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0
"Any-to-Bokeh: One-Step Video Bokeh via Multi-Plane Image Guided Diffusion"

πŸ“–TL;DR: Any-to-Bokeh is a novel one-step video bokeh framework that converts arbitrary input videos into temporally coherent, depth-aware bokeh effects.

13.06.2025 13:43 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
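On the multi-plane image (MPI) guidance: an MPI represents the scene as fronto-parallel RGBA planes at discrete depths, alpha-composited front to back. A minimal sketch of that classic compositing step only (not the paper's diffusion pipeline; the function name is mine):

```python
import numpy as np

def composite_mpi(colors, alphas):
    """Front-to-back 'over' compositing of fronto-parallel depth planes.
    colors: (D, H, W, 3), alphas: (D, H, W); plane 0 is nearest the camera."""
    out = np.zeros(colors.shape[1:], dtype=np.float64)
    transmittance = np.ones(colors.shape[1:3])
    for c, a in zip(colors, alphas):
        out += (transmittance * a)[..., None] * c   # light this plane contributes
        transmittance *= (1.0 - a)                  # light passing through to deeper planes
    return out

# One pixel, two planes: a half-transparent red plane in front of an opaque blue one.
colors = np.array([[[[1.0, 0.0, 0.0]]], [[[0.0, 0.0, 1.0]]]])
alphas = np.array([[[0.5]], [[1.0]]])
result = composite_mpi(colors, alphas)
assert np.allclose(result, [[[0.5, 0.0, 0.5]]])
```

Blurring each plane by a depth-dependent kernel before compositing is the standard way MPIs produce depth-aware bokeh.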

@sharathgirish97 1,2 @_TianyeLi 2* Amrita Mazumdar 2* @abhi2610 2 @davedotluebke 2 @shalinidemello 2

1 @umdglobalcampus
2 @nvidia

*Equal contributions

11.06.2025 09:14 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0
QUEEN: QUantized Efficient ENcoding of Dynamic Gaussians for Streaming Free-viewpoint Videos

Project page: research.nvidia.com/labs/amri/pr...
Paper: openreview.net/pdf?id=7xhwE...
Source code (released! NVIDIA license, non-commercial): github.com/nvlabs/queen

11.06.2025 09:14 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
"QUEEN: QUantized Efficient ENcoding of Dynamic Gaussians for Streaming Free-viewpoint Videos"

TL;DR: Efficient streamable representations for free-viewpoint videos with dynamic Gaussians. Reduces model size to just 0.7 MB per frame while training in under 5 s and rendering at 350 FPS.

11.06.2025 09:13 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
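The 0.7 MB/frame figure comes from aggressively quantizing per-Gaussian attributes. As a rough illustration only (QUEEN's actual encoding is learned and far more involved), even a generic uniform 8-bit quantizer shrinks float32 attributes 4x with bounded error:

```python
import numpy as np

def quantize(x, bits=8):
    """Uniform per-tensor quantization: store integer codes plus (min, step)."""
    lo, hi = x.min(), x.max()
    step = (hi - lo) / (2**bits - 1)
    codes = np.round((x - lo) / step).astype(np.uint8)
    return codes, lo, step

def dequantize(codes, lo, step):
    """Recover approximate float values from the integer codes."""
    return codes.astype(np.float32) * step + lo

attrs = np.random.default_rng(0).normal(size=(10000, 3)).astype(np.float32)
codes, lo, step = quantize(attrs)
recon = dequantize(codes, lo, step)
# 4x smaller storage (uint8 vs float32), worst-case error of half a step.
assert codes.nbytes * 4 == attrs.nbytes
assert np.max(np.abs(recon - attrs)) <= step / 2 + 1e-6
```

Per-attribute bit budgets, entropy coding, and residuals between frames are what push the real system well below this naive 4x.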

Jiawei Yang *,ΒΆ, Jiahui Huang ΒΆ, Yuxiao Chen ΒΆ,
Yan Wang ΒΆ, Boyi Li ΒΆ, Yurong You ΒΆ, Maximilian Igl ΒΆ, Apoorva Sharma ΒΆ, Peter Karkus ΒΆ, Danfei Xu $,ΒΆ, Boris Ivanovic ΒΆ, Yue Wang †,*,ΒΆ Marco Pavone †,Β§,ΒΆ

* University of Southern California
$ Georgia Institute of Technology
Β§ Stanford University
ΒΆ NVIDIA

† Equal advising

20.05.2025 16:49 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0
STORM: Spatio-Temporal Reconstruction Model for Large-Scale Outdoor Scenes

Project page: jiawei-yang.github.io/STORM/
Paper: arxiv.org/abs/2501.00602
Source code: github.com/NVlabs/Gauss...

20.05.2025 16:49 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
"STORM: Spatio-Temporal Reconstruction Model for Large-Scale Outdoor Scenes"

TL;DR: Data-driven transformer applied in a feed-forward manner; dense reconstruction of dynamic environments with 3D Gaussians and per-Gaussian velocities; self-supervised scene flow.

20.05.2025 16:46 β€” πŸ‘ 3    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
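"3D Gaussians and velocities" means each Gaussian carries a velocity so its center can be advected across timestamps; the resulting displacement between frames is the scene flow. A toy constant-velocity sketch (STORM predicts these quantities with a transformer; the helper name is mine):

```python
import numpy as np

def advect(positions, velocities, dt):
    """Move Gaussian centers forward in time under a constant per-point velocity."""
    return positions + velocities * dt

p0 = np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]])
v = np.array([[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]])
p1 = advect(p0, v, dt=0.5)

# The implied scene flow between the two timestamps is just v * dt.
flow = p1 - p0
assert np.allclose(p1, [[0.5, 0.0, 0.0], [1.0, 2.0, 1.0]])
assert np.allclose(flow, v * 0.5)
```

Because the flow falls out of the predicted velocities, it can be supervised self-supervised-ly from rendering losses rather than flow labels.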
St4RTrack: Simultaneous 4D Reconstruction and Tracking in the World

@ucberkeleyofficial.bsky.social, Max Planck Institute for Intelligent Systems, Stanford University

Project page: st4rtrack.github.io
Paper: st4rtrack.github.io/files/St4RTr...
Results: st4rtrack.github.io/page1.html

22.04.2025 16:32 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0
(2/2) Captures static and dynamic scene geometry while maintaining long-range 3D correspondences, effectively combining 3D reconstruction with 3D tracking; trained with a re-projection loss.

22.04.2025 16:31 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
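A re-projection loss projects the predicted 3D points through the camera and penalizes the 2D distance to the observed tracks. A generic pinhole sketch of the idea (not the paper's exact formulation; the intrinsics and helper names are made up):

```python
import numpy as np

def project(points, K):
    """Pinhole projection of camera-frame 3D points (N, 3) to pixel coords (N, 2)."""
    uvw = points @ K.T
    return uvw[:, :2] / uvw[:, 2:3]

def reprojection_loss(points, K, observed_uv):
    """Mean L1 distance between projected and observed 2D track locations."""
    return np.abs(project(points, K) - observed_uv).mean()

# Hypothetical intrinsics: focal 500 px, principal point (320, 240).
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
pts = np.array([[0.0, 0.0, 2.0], [0.5, -0.25, 4.0]])
uv = project(pts, K)

# Points that reproject exactly onto their tracks incur zero loss...
assert reprojection_loss(pts, K, uv) == 0.0
# ...while a 1-pixel offset on every coordinate gives a mean L1 of 1.
assert np.isclose(reprojection_loss(pts, K, uv + 1.0), 1.0)
```

Minimizing this over video ties the predicted pointmaps to the 2D tracks, which is what couples reconstruction and tracking.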
St4RTrack: Simultaneous 4D Reconstruction and Tracking in the World

TL;DR: a feed-forward model that reconstructs and tracks dynamic video content; DUSt3R-like pointmaps for a pair of frames captured at different moments (1/2)

22.04.2025 16:30 β€” πŸ‘ 3    πŸ” 1    πŸ’¬ 1    πŸ“Œ 0

Shangzhan Zhang 1,2*, @jianyuan_wang 3*, @YinghaoXu1 4*†, Nan Xue 2, Christian Rupprecht 3, @XiaoweiZhou5 1†, Yujun Shen 2, @GordonWetzstein 4

1 Zhejiang University
2 AntGroup
3 University of Oxford
4 Stanford University

*, † equal contributions (?)

14.03.2025 10:23 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0
FLARE: Feed-forward Geometry, Appearance and Camera Estimation from Uncalibrated Sparse Views

Project page: zhanghe3z.github.io/FLARE/
Paper: arxiv.org/pdf/2502.12138
Source code: github.com/ant-research...
Demo: huggingface.co/spaces/zhang...

14.03.2025 10:22 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
FLARE: Feed-forward Geometry, Appearance and Camera Estimation from Uncalibrated Sparse Views

TL;DR: feed-forward model; cascaded learning paradigm with camera pose serving as the critical bridge, recognizing its essential role in mapping 3D structures onto 2D image planes.

14.03.2025 10:21 β€” πŸ‘ 2    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass. Fast3R dramatically improves 3D reconstruction speed by processing up to 1500 images in a single forward pass.

Project page: fast3r-3d.github.io
Demo: fast3r.ngrok.app
Source code: github.com/facebookrese...

13.03.2025 10:07 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0
⚑️Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass

TL;DR: multi-view generalization of DUSt3R; processes many views in parallel: a Transformer-based architecture forwards N images in a single pass, bypassing the need for iterative pairwise alignment.

13.03.2025 10:07 β€” πŸ‘ 2    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
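Back-of-envelope on why bypassing pairwise alignment matters: an all-pairs pipeline in the DUSt3R style scales with N(N-1)/2 image pairs, while a joint multi-view model needs one forward pass regardless of N:

```python
def pairwise_passes(n):
    """Number of image pairs an all-pairs method must process for n images."""
    return n * (n - 1) // 2

# At the 1500-image scale quoted above, all-pairs matching means over a
# million pair forwards (before any global alignment), versus one joint pass.
assert pairwise_passes(2) == 1
assert pairwise_passes(1500) == 1_124_250
```

This is only a counting argument under the all-pairs assumption; real pairwise pipelines often subsample pairs, but the quadratic trend is the point.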
VACE: All-in-One Video Creation and Editing

Project page: ali-vilab.github.io/VACE-Page/
Paper: arxiv.org/pdf/2503.07598

12.03.2025 08:41 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0
πŸͺ„ VACE: All-in-One Video Creation and Editing

from @alibabagroup.bsky.social's Tongyi Lab with:

Zeyinzi Jiang* Zhen Han* Chaojie Mao*† Jingfeng Zhang Yulin Pan Yu Liu

*Equal contribution, †Project lead

12.03.2025 08:39 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

From @nvidia (1), @NUSingapore (2), @UofT (3) and @VectorInst (4)

@jayzhangjiewu 1,2*, Yuxuan Zhang 1*, Haithem Turki 1, Xuanchi Ren 1,3,4, @JunGao33210520 1,3,4, Mike Zheng Shou 2, @FidlerSanja 1,3,4, @ZGojcic 1†, @HuanLing6 1,3,4†

*, † equal contribution

10.03.2025 08:46 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0
Difix3D+: Improving 3D Reconstructions with Single-Step Diffusion Models

Project page: research.nvidia.com/labs/toronto...
Paper: arxiv.org/abs/2503.01774

10.03.2025 08:45 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
Difix3D+: Improving 3D Reconstructions with Single-Step Diffusion Models

TL;DR: single-step diffusion models; a single-step image diffusion model trained to enhance and remove artifacts in rendered novel views caused by underconstrained regions of the 3D representation.

10.03.2025 08:43 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
DAM4SAM: project page of the paper "A Distractor-Aware Memory for Visual Object Tracking with SAM2"

Authors: Jovana Videnović, Alan Lukežič, Matej Kristan
from the Faculty of Computer and Information Science, University of Ljubljana

Project page: jovanavidenovic.github.io/dam-4-sam/
Paper: arxiv.org/abs/2411.17576
Source code: github.com/jovanavideno...

05.03.2025 08:53 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0
A Distractor-Aware Memory (DAM) for Visual Object Tracking with SAM2

TL;DR: SAM2.1-based; introduces a distractor-distilled (DiDi) dataset to better study the distractor problem

05.03.2025 08:48 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
CAST: Recovering high-quality 3D scenes from a single RGB image is a challenging task in computer graphics. Current methods often struggle with domain-specific limitations or low-quality object ge...

Project page: sites.google.com/view/cast4
Paper: arxiv.org/pdf/2502.12894
Youtube video: www.youtube.com/watch?v=cloV...
Planned to be open-sourced

04.03.2025 08:41 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0
CAST: Component-Aligned 3D Scene Reconstruction from an RGB Image

TL;DR: object-level 2D segmentation+relative depth; GPT-based model to analyze inter-object spatial relationships; occlusion-aware large-scale 3D generation model

04.03.2025 08:41 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
Project page: alviur.github.io/color-illusi...
Paper: arxiv.org/abs/2412.10122
Planned to be released on GitHub!

28.02.2025 15:16 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0
Are diffusion models falling for optical illusions?

"The Art of Deception: Color Visual Illusions and Diffusion Models"

TL;DR: Diffusion models exhibit human-like perceptual shifts in brightness and color within their latent space.

28.02.2025 15:15 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0