
Xi WANG

@xiwang92.bsky.social

Ecole Polytechnique, IP Paris; Prev. Ph.D.@Univ Rennes, Inria/IRISA https://triocrossing.github.io/

34 Followers  |  40 Following  |  8 Posts  |  Joined: 02.12.2024

Latest posts by xiwang92.bsky.social on Bluesky


CVPR@Paris opening speech at Sorbonne University by @davidpicard.bsky.social, @vickykalogeiton.bsky.social, and Matthieu Cord.

Great location!

โค๏ธ

(also: free food, just like at the 'real' CVPR)

06.06.2025 08:02 · 👍 36  🔁 7  💬 0  📌 0
Di[M]O: Distilling Masked Diffusion Models into One-step Generator

For more details, visit the project website: yuanzhi-zhu.github.io/DiMO/
Or read the paper: arxiv.org/abs/2503.15457
The project is led by Yuanzhi Zhu (yuanzhi-zhu.github.io/about/) and supervised by @stephlat.bsky.social and @vickykalogeiton.bsky.social.

21.03.2025 15:35 · 👍 1  🔁 1  💬 0  📌 0

We test Di[M]O on image generation with MaskGit and Meissonic as teacher models:
- The first one-step MDM that competes with multi-step teachers.
- A significant speed-up of 8 to 32 times without degradation in quality.
- The first successful distillation approach for text-to-image MDMs.

21.03.2025 15:35 · 👍 0  🔁 0  💬 1  📌 0

Our approach fundamentally differs from previous distillation methods, such as DMD. Instead of minimizing the divergence of denoising distributions across the entire latent space, Di[M]O optimizes the divergence of token-level conditional distributions.

21.03.2025 15:35 · 👍 0  🔁 0  💬 1  📌 0
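For intuition, here is a minimal PyTorch sketch of a token-level divergence of this kind, computed independently at each masked position rather than over whole-sequence denoising distributions. The tensor names and the choice of KL as the divergence are illustrative assumptions, not the exact objective from the paper.

import torch
import torch.nn.functional as F

def token_level_kl(student_logits, teacher_logits, mask):
    """Average KL(student || teacher) over the masked token positions.

    student_logits, teacher_logits: (batch, seq_len, vocab) unnormalized scores.
    mask: (batch, seq_len) boolean tensor, True where the token is still [MASK].
    """
    log_q = F.log_softmax(student_logits, dim=-1)     # student conditionals per token
    log_p = F.log_softmax(teacher_logits, dim=-1)     # teacher conditionals per token
    kl = (log_q.exp() * (log_q - log_p)).sum(dim=-1)  # divergence at each position
    return (kl * mask).sum() / mask.sum().clamp(min=1)

Matching per-token conditionals keeps the objective tractable: it only needs the two models' logits at the masked positions, not a density over the entire latent space.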

To approximate the loss gradient, we introduce an auxiliary model that estimates an otherwise intractable term in the loss function. The auxiliary model is trained using a standard MDM training loss, with one-step generated samples as targets.

21.03.2025 15:35 · 👍 0  🔁 0  💬 1  📌 0
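As a rough illustration, this is how such an auxiliary network could be trained with a standard MDM objective, treating the one-step samples as the "clean" data to be re-masked and reconstructed. aux_model, MASK_ID, and the uniform re-masking ratio are placeholders rather than the paper's exact recipe.

import torch
import torch.nn.functional as F

MASK_ID = 1024  # hypothetical [MASK] token id

def auxiliary_step(aux_model, student_tokens, mask_ratio=0.5):
    """One training step of the auxiliary network on one-step student samples.

    aux_model: maps (batch, seq_len) token ids to (batch, seq_len, vocab) logits.
    student_tokens: (batch, seq_len) tokens from the one-step generator.
    """
    targets = student_tokens.detach()
    # Re-mask a random subset of positions, as in standard MDM training,
    # with the student's one-step samples acting as the data distribution.
    mask = torch.rand(targets.shape, device=targets.device) < mask_ratio
    corrupted = targets.masked_fill(mask, MASK_ID)
    logits = aux_model(corrupted)
    # Cross-entropy on the masked positions only.
    return F.cross_entropy(logits[mask], targets[mask])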

To sample from the correct joint distribution, we introduce an initialization that maps a randomized input sequence to an almost deterministic target sequence.
Without proper initialization, the model may suffer from divergence or mode collapse, making this step essential.

21.03.2025 15:35 · 👍 0  🔁 0  💬 1  📌 0
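Purely as a toy sketch of the "randomized input, almost deterministic output" mapping (not the initialization actually used in Di[M]O): a random token sequence plays the role of the noise input, and a low softmax temperature makes the generator's output for that input close to deterministic, while different random inputs still land on different modes.

import torch
import torch.nn.functional as F

def one_step_generate(generator, batch, seq_len, vocab, temperature=0.1):
    """Map a randomized input sequence to an almost deterministic output.

    generator: maps (batch, seq_len) token ids to (batch, seq_len, vocab) logits.
    """
    noise_tokens = torch.randint(0, vocab, (batch, seq_len))  # randomized input sequence
    logits = generator(noise_tokens)
    # Low temperature -> the per-input output distribution is sharply peaked.
    probs = F.softmax(logits / temperature, dim=-1)
    samples = torch.multinomial(probs.view(-1, vocab), num_samples=1)
    return samples.view(batch, seq_len)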
Link preview: Ideas in Inference-time Scaling can Benefit Generative Pre-training Algorithms

The initial distribution is crucial here. As Jiaming Song points out in his recent position paper (arxiv.org/abs/2503.07154), multi-token prediction is inherently difficult due to the independence assumption between the predicted tokens.

21.03.2025 15:35 · 👍 1  🔁 0  💬 1  📌 0
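A two-token toy example of why that independence assumption hurts: even with perfect per-token marginals, predicting both tokens in one shot as if they were independent puts half of the probability mass on sequences the true distribution never produces.

import torch

# Suppose the true joint over two tokens puts all mass on (0, 1) and (1, 0).
# Both per-token marginals are then uniform.
marginal_1 = torch.tensor([0.5, 0.5])
marginal_2 = torch.tensor([0.5, 0.5])

# One-step prediction under an independence assumption samples from the
# product of marginals instead of the true joint.
independent_joint = marginal_1[:, None] * marginal_2[None, :]
print(independent_joint)                                   # every pair gets 0.25
print(independent_joint[0, 0] + independent_joint[1, 1])   # 0.5 mass on impossible pairs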

The key idea is inspired by on-policy distillation: we align the output distributions of the teacher and student models at student-generated intermediate states, ensuring that the student's generation closely matches the teacher's over the intermediate states the student itself produces.

21.03.2025 15:35 · 👍 1  🔁 0  💬 1  📌 0
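A generic sketch of the on-policy idea, assuming placeholder model interfaces and KL as a stand-in divergence (Di[M]O's actual gradient estimate goes through the auxiliary model described above): intermediate states are built by re-masking the student's own one-step samples, and teacher and student conditionals are compared exactly there.

import torch
import torch.nn.functional as F

MASK_ID = 1024  # hypothetical [MASK] token id

def on_policy_alignment(teacher, student_head, student_tokens, mask_ratio=0.5):
    """Compare teacher and student conditionals at student-visited states.

    teacher, student_head: map (batch, seq_len) ids to (batch, seq_len, vocab) logits.
    student_tokens: one-step samples from the student generator.
    """
    # Intermediate states come from re-masking the student's own samples,
    # so the matching happens on states the student actually reaches.
    mask = torch.rand(student_tokens.shape, device=student_tokens.device) < mask_ratio
    x_t = student_tokens.masked_fill(mask, MASK_ID)

    with torch.no_grad():
        log_p = F.log_softmax(teacher(x_t), dim=-1)    # teacher conditionals
    log_q = F.log_softmax(student_head(x_t), dim=-1)   # student-side conditionals

    kl = (log_q.exp() * (log_q - log_p)).sum(dim=-1)   # token-level divergence
    return (kl * mask).sum() / mask.sum().clamp(min=1)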

Masked Diffusion Models (MDMs) are a hot topic in generative AI 🔥: powerful, but slow because they require multiple sampling steps.
We (@polytechniqueparis.bsky.social and @inria-grenoble.bsky.social) introduce Di[M]O, a novel approach that distills MDMs into a one-step generator without sacrificing quality.

21.03.2025 15:35 · 👍 8  🔁 3  💬 1  📌 0
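To make the speed argument concrete, here is a minimal sketch contrasting MaskGit-style multi-step decoding with a single-pass one-step generator. mdm, generator, MASK_ID, and the equal-share, confidence-based unmasking schedule are simplified placeholders, not the real models' sampling code.

import torch
import torch.nn.functional as F

MASK_ID = 1024  # hypothetical [MASK] token id

@torch.no_grad()
def multi_step_sample(mdm, seq_len, vocab, steps=16):
    """MaskGit-style iterative decoding: unmask a few confident tokens per step."""
    x = torch.full((1, seq_len), MASK_ID)
    for step in range(steps):
        still_masked = x == MASK_ID
        if not still_masked.any():
            break
        probs = F.softmax(mdm(x), dim=-1)              # one network evaluation per step
        conf, pred = probs.max(dim=-1)
        conf = conf.masked_fill(~still_masked, -1.0)
        # Unmask an (approximately) equal share of the remaining tokens each step.
        k = max(1, int(still_masked.sum().item()) // (steps - step))
        idx = conf.topk(k, dim=-1).indices
        x.scatter_(1, idx, pred.gather(1, idx))
    return x

@torch.no_grad()
def one_step_sample(generator, seq_len, vocab):
    """One-step generation: a single forward pass instead of `steps` passes."""
    noise = torch.randint(0, vocab, (1, seq_len))
    return generator(noise).argmax(dim=-1)

With, say, 16 or 32 decoding steps on one side and a single forward pass on the other, the 8 to 32 times speed-up quoted above is essentially the ratio of network evaluations.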
