Excited to attend #NeurIPS2024 in Vancouver (Dec 9-15)! 🎉
Presenting our work on TripletCLIP & co-organizing a Workshop on #ResponsibleAI.
Let’s meet up and chat about diffusion/flow models and multimodal AI, or just say hi! DMs open. 🤝
See you there! 🚀
06.12.2024 23:17 — 👍 1 🔁 0 💬 0 📌 0
Finally, this work would not have been possible without our excellent collaborators Song Wen, Dimitris N. Metaxas, and Yezhou Yang.
We would also like to thank SCAI ASU, ASU Research Computing, and cr8dl.ai for their generous GPU support.
03.12.2024 21:10 — 👍 0 🔁 0 💬 0 📌 0
Steering Rectified Flow Models in the Vector Field for Controlled Image Generation
Resource Efficient Text-to-Image Diffusion Models.
We’re thrilled to release FlowChef on InstaFlow and Flux.1[dev] models for the community to explore and experiment with! 🌟
Project Page: flowchef.github.io
Demo on Flux: huggingface.co/spaces/FlowC...
Demo on InstaFlow: huggingface.co/spaces/FlowC...
Dive in and let us know what you think! ✨
03.12.2024 21:10 — 👍 0 🔁 0 💬 1 📌 0
We show an extension to 3D and multi-subject editing! 🤯🤯
Beyond this, we believe such a straightforward and impactful method could benefit downstream tasks such as video generation. 🚀
03.12.2024 21:10 — 👍 0 🔁 0 💬 1 📌 0
🎨 Extending FlowChef for Image Editing
We take FlowChef a step further: enabling image editing without performing an inversion of the source image! 🚀
🔥 Even more exciting, this is one of the first approaches to achieve SOTA results on Flux.
03.12.2024 21:10 — 👍 0 🔁 0 💬 1 📌 0
On inverse problems, our method achieves SOTA performance while being the most efficient approach! 💪
Plus, it’s versatile: seamlessly applicable to both pixel and latent space models. 🤯
03.12.2024 21:10 — 👍 0 🔁 0 💬 1 📌 0
🎯 Our theoretical insights are backed by empirical observations!
💡 As t → 0, the cosine similarity of gradients for InstaFlow approaches 1️⃣.0️⃣, aligning with our derivations. In contrast, Stable Diffusion gradients behave almost randomly. 📊
Check out the plots below! 👇✨
03.12.2024 21:10 — 👍 0 🔁 0 💬 1 📌 0
🔍 In toy settings, the vector field and cost gradients seem orthogonal, but this intuition falters in higher-dimensional ODEs (Prop. 4.1).
⚠️ Gradient-based methods need costly backpropagation through ODE solvers.
💡 We prove rectified flows skip it entirely, ensuring convergence (Lem. 4.2, Thm. 4.3). 🚀
03.12.2024 21:10 — 👍 0 🔁 0 💬 1 📌 0
🚨New Paper Alert🚨
🚀 Introducing FlowChef, "Steering Rectified Flow Models in the Vector Field for Controlled Image Generation"! 🌌✨
- Perform image editing, solve inverse problems, and more.
- Achieve inversion-free, gradient-free, & training-free inference-time steering! 🤯
👇👇
03.12.2024 21:10 — 👍 5 🔁 2 💬 1 📌 0
It seems that arXiv has put the paper on hold. Let’s see how long it will take to get it resolved. 🥲
02.12.2024 04:57 — 👍 0 🔁 0 💬 0 📌 0
Thanks @saxon.me!
29.11.2024 02:01 — 👍 2 🔁 0 💬 0 📌 0
Was all set to drop a new paper on arXiv today, but Thanksgiving got in the way! 🍂
The wait until Sunday will be worth it—can’t wait to share some exciting findings on rectified flow models (especially on Flux).
Stay tuned, and Happy Thanksgiving!
28.11.2024 19:13 — 👍 4 🔁 0 💬 0 📌 2
Hi @csprofkgd.bsky.social, could you add me as well?
21.11.2024 22:11 — 👍 1 🔁 0 💬 0 📌 0
Hi @gowthami.bsky.social, would appreciate it if you could add me as well.
21.11.2024 15:09 — 👍 0 🔁 0 💬 1 📌 0
🧙🏻‍♀️ scientist at Meta NYC | http://bamos.github.io
Research Scientist @Toyota Research Institute | Prev. PhD in AI, ML and CV @GeorgiaTech | Researching 3D Perception, Generative AI for Robotics and Multimodal AI
W: https://zubairirshad.com
PhD student. Working on Computer Vision for 3D geometry and semantics.
https://tberriel.github.io
PhD candidate @Jena_DH & @TU_Muenchen working on 3D Reconstruction from Historic Imagery. @TU_Muenchen graduate.
📍 Munich
Research Scientist at Valeo.ai.
https://anhquancao.github.io/
3D Vision | Visiting PhD @ Stanford | ETH AI Center Fellow
elisabettafedele.github.io
📖 Master Student in Robotics at ETH Zürich
🔍 Computer Vision, 3D Reconstruction, Structure-from-Motion, SLAM, Geometry
3D Computer Vision & ML
Research Scientist @Google
Senior Research Scientist @ Niantic
I work on depth estimation, 3D reconstruction, and derivative applications.
masayed.com
PhD Student @ Wisconsin | 3D Vision with Miniature ToF Sensors, Robot Sensing, Computational Imaging
https://cpsiff.github.io
3D Vision Research Scientist @ Huawei Noah's Ark Lab London
Research Scientist at Niantic
Interests: CV, ML, DL, 3D reconstruction, image matching, Gaussian splatting, rendering
Research Intern @GoogleDeepMind || PhD student in Machine Learning, 3D Computer Vision, and Robotics || https://maurocomi.com
PhD student at EPFL
3D Vision | Underwater SLAM | AI for Conservation
http://josauder.github.io
ericzzj1989.github.io
PhD from CUHK. 3D vision, SLAM, SfM, Image Matching (https://github.com/ericzzj1989/Awesome-Image-Matching).
Interested in research on ML for 3D Vision and Graphics. Master student @ UniSaarland
PhD student researching multimodal learning (language, vision, ...).
Also a linguistics enthusiast.
morrisalp.github.io
3D computer vision, deep learning. DL Lead at Crisalix.
Research Fellow @ PUT Mobile Robots Lab
mostly working with robotics perception & motion planning