Super cool work Masha, congrats!
07.08.2025 22:14

@ashmrz.bsky.social
Research Scientist @Snap. Previously @UofT, @NVIDIAAI, @samsungresearch. Opinions are mine. http://ashmrz.github.io
I'll be at SIGGRAPH 2025 in Vancouver (Aug 9 - 15)! If you're around and up for some good coffee and/or chats about all things content creation, hit me up! #SIGGRAPH2025
05.08.2025 16:01

Congrats, Kosta! That sounds incredible. Wishing you an amazing year ahead full of great people, new ideas, and exciting experiences.
Have you already taken off or still around for a bit?
Super insightful!
25.06.2025 13:43

[9/9] 4Real-Video-V2: Fused View-Time Attention and Feedforward Reconstruction for 4D Scene Generation
Project page: snap-research.github.io/4Real-Video-V2
Abstract: arxiv.org/abs/2506.18839
[8/9] Authors: Chaoyang Wang*, Ashkan Mirzaei*, Vidit Goel, Willi Menapace, Aliaksandr Siarohin, Avalon Vinella, Michael Vasilkovsky, Ivan Skorokhodov, Vladislav Shakhrai, Sergey Korolev, Sergey Tulyakov, Peter Wonka.
*equal contribution
[7/9] We use a camera token replacement trick for temporal consistency of the camera poses, temporal attention layers to share information over time, and a "Gaussian head" to predict shape, scale, opacity, and color offsets (a sketch follows below).
24.06.2025 14:13
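A minimal sketch of what such a Gaussian head could look like, assuming per-pixel features of width 256 and a quaternion/scale/opacity/RGB offset parameterization; this is an illustration, not the authors' released code.

import torch
import torch.nn as nn

class GaussianHead(nn.Module):
    """Maps per-pixel features to offsets for Gaussian parameters (hypothetical layout)."""
    def __init__(self, feat_dim: int = 256):
        super().__init__()
        # Shared trunk, then one linear branch per parameter group.
        self.trunk = nn.Sequential(nn.Linear(feat_dim, feat_dim), nn.GELU())
        self.rotation = nn.Linear(feat_dim, 4)  # quaternion offset ("shape")
        self.scale = nn.Linear(feat_dim, 3)     # per-axis log-scale offset
        self.opacity = nn.Linear(feat_dim, 1)   # pre-sigmoid opacity offset
        self.color = nn.Linear(feat_dim, 3)     # RGB offset

    def forward(self, feats: torch.Tensor) -> dict:
        h = self.trunk(feats)  # feats: (N, feat_dim), one row per pixel
        return {
            "rotation": self.rotation(h),
            "scale": self.scale(h),
            "opacity": self.opacity(h),
            "color": self.color(h),
        }

head = GaussianHead()
offsets = head(torch.randn(1024, 256))  # 1024 pixels in, parameter offsets out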
[6/9] How it works – Stage 2 (Reconstruction):
Our feedforward model takes RGB frames and predicts camera poses and dynamic 3D Gaussians. No optimization loops. No ground-truth poses. Just fast, clean reconstruction.
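As a rough sketch of that interface, assuming one Gaussian per feature token and a flattened 3x4 camera matrix per frame (both assumptions for illustration, not the released model):

import torch
import torch.nn as nn

class FeedforwardReconstructor(nn.Module):
    """Stand-in for the Stage-2 model: RGB frames in, poses and Gaussians out."""
    def __init__(self, feat_dim: int = 256):
        super().__init__()
        self.encoder = nn.Conv2d(3, feat_dim, kernel_size=8, stride=8)  # stand-in backbone
        self.pose_head = nn.Linear(feat_dim, 12)   # flattened 3x4 camera matrix per frame
        self.gauss_head = nn.Linear(feat_dim, 14)  # xyz + quaternion + scale + opacity + RGB

    def forward(self, frames: torch.Tensor):
        # frames: (V*T, 3, H, W), all views and timesteps flattened together
        feats = self.encoder(frames)                # (V*T, C, h, w)
        tokens = feats.flatten(2).transpose(1, 2)   # (V*T, h*w, C)
        poses = self.pose_head(tokens.mean(dim=1))  # one pose per frame, no GT poses needed
        gaussians = self.gauss_head(tokens)         # one Gaussian per token
        return poses, gaussians

model = FeedforwardReconstructor()
poses, gaussians = model(torch.randn(8, 3, 64, 64))  # single forward pass, no optimization loop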
[5/9] The architecture runs on a DiT backbone. Thanks to sparse attention and temporal compression, it stays efficient. Only the self-attention layers are fine-tuned; everything else is frozen.
24.06.2025 14:13
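The freezing scheme from [5/9] could be set up along these lines, using nn.MultiheadAttention as a stand-in for the DiT attention class (an assumption; the actual module names differ):

import torch.nn as nn

def freeze_all_but_self_attention(model: nn.Module) -> None:
    # Freeze every parameter, then re-enable gradients only inside attention modules.
    for p in model.parameters():
        p.requires_grad = False
    for module in model.modules():
        if isinstance(module, nn.MultiheadAttention):
            for p in module.parameters():
                p.requires_grad = True

# Example on a stand-in transformer; only attention weights will receive gradients.
layer = nn.TransformerEncoderLayer(d_model=256, nhead=8, batch_first=True)
backbone = nn.TransformerEncoder(layer, num_layers=4)
freeze_all_but_self_attention(backbone)
trainable = sum(p.numel() for p in backbone.parameters() if p.requires_grad)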
[4/9] How it works – Stage 1 (Generation):
We fuse spatial and temporal attention into a single transformer layer. This view-time attention lets our diffusion model reason jointly across viewpoints and frames, without extra parameters. The parameter efficiency also makes training more stable.
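A toy version of the fused attention, assuming tokens laid out as (batch, views, time, tokens-per-frame, channels); the dense attention here ignores the sparsity tricks from [5/9] and is only meant to show the joint view-time sequence:

import torch
import torch.nn as nn

# A single pretrained self-attention layer can cover both axes at once
# if views and timesteps are flattened into one sequence: no new parameters.
attn = nn.MultiheadAttention(embed_dim=256, num_heads=8, batch_first=True)

def view_time_attention(x: torch.Tensor) -> torch.Tensor:
    # x: (B, V, T, N, C) with V views, T timesteps, N tokens per frame
    B, V, T, N, C = x.shape
    seq = x.reshape(B, V * T * N, C)  # one joint sequence over views and time
    out, _ = attn(seq, seq, seq)      # every token attends across viewpoints and frames
    return out.reshape(B, V, T, N, C)

y = view_time_attention(torch.randn(2, 4, 8, 16, 256))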
[3/9] High-quality 4D training data is scarce, and large video models are expensive to fine-tune. So we focus on parameter efficiency. Our fused attention design reuses pretrained weights with minimal changes. It trains fast, generalizes well, and scales to full 4D scenes.
24.06.2025 14:13

[2/9] We generate synchronized multi-view video grids, then lift them into 4D geometry using a fast feedforward network. The result is a set of Gaussian particles, ready for rendering, exploration, and editing (see the sketch below).
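Put together, the two stages form a pipeline like the following; both functions (and the prompt) are placeholders for the real models, kept only to show the data flow:

import torch

def generate_view_grid(prompt: str, views: int = 4, frames: int = 8) -> torch.Tensor:
    # Placeholder for the Stage-1 diffusion sampler: returns a synchronized
    # multi-view video grid of shape (V, T, 3, H, W).
    return torch.rand(views, frames, 3, 64, 64)

def lift_to_gaussians(grid: torch.Tensor) -> torch.Tensor:
    # Placeholder for the Stage-2 reconstructor: one Gaussian per pixel,
    # 14 parameters each (xyz, quaternion, scale, opacity, RGB).
    V, T, _, H, W = grid.shape
    return torch.zeros(V * T * H * W, 14)

grid = generate_view_grid("a dog chasing a ball in a park")
particles = lift_to_gaussians(grid)  # particles ready for rendering and editing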
24.06.2025 14:13

[1/9] We introduce 4Real-Video-V2, a method that can generate 4D scenes from a simple text prompt, viewable from any angle at any moment in time. It's fast, photorealistic, and works on full scenes. Here's how it works and why it matters.
snap-research.github.io/4Real-Video-...
In Germany, there is a tradition of creating funny hats for doctoral graduates. @cvoelcker.bsky.social brought this tradition to my group and, together with Umangi Jain, spearheaded the construction of a masterpiece for our first PhD graduate, @ashmrz.bsky.social. 1/2
28.05.2025 17:40

Congrats on your election victory, @liberalca.bsky.social and @mark-carney.bsky.social! If I had one wish for Canada's new government, it would be this: stop delaying visas and make it easier for the world's top scientists, including research-focused graduate students, to come to Canada.
30.04.2025 19:20

Have a strong background in 3D/4D Generative Models? Consider applying for an internship with us at Snap's Creative Vision team!
snap.submittable.com/submit
Calling all PhD students in vision/ML interested in working with a great team (+me!) at GDM doing cutting-edge research in 3D computer vision and generative models!
05.03.2025 22:12

EventSplat: 3D Gaussian Splatting from Moving Event Cameras for Real-time Rendering
Toshiya Yura, @ashmrz.bsky.social, Igor Gilitschenski 5/🧵
arxiv.org/abs/2412.07293
Tired of waiting for your Gaussian-based scenes to fit dynamic inputs? Wait no more! Check out our new paper and discover an instant, feed-forward approach!
07.12.2024 17:56

Huge congrats, Kosta!
25.11.2024 19:09