In an era of billion-parameter models everywhere, it's incredibly refreshing to see how a fundamental question can be formulated and solved with simple, beautiful math.
- How should we orient a solar panel ☀️? -
Zero AI! If you enjoy math, you'll love this!
Video: www.youtube.com/watch?v=ZKzL...
16.07.2025 14:25
*Slides without slide titles*
When I first tried presenting WITHOUT slide titles, everything flowed so much better! (totally validated ... by me)!
Give it a shot! Once you try it, you'll never want to go back.
08.07.2025 11:26
*Empty initial slides*
What's a better starting point than that default slide layout?
A completely blank slide.
It helps you explore the design space and focus on delivering a clear, compelling story.
08.07.2025 11:26
*Bullet points*
The second thing the layout prompts you to do?
("Click to add text").
Start a bullet list.
Among so many creative forms of presenting your ideas, it nudges you toward the most boring one: a list.
08.07.2025 11:26
*Slide title*
The first thing this layout does is to ask you to add a slide title.
Seems reasonable, right? But this encourages you to
1) lead your presentation with text instead of visuals and
2) cram in many titles in a talk, making it harder to maintain a narrative flow.
08.07.2025 11:26
Why is the "Title and Content" slide layout BAD?
Most people prepare their presentation from this default layout. I used it for years without questioning it.
BUT, this essentially guides you toward developing a poor presentation. Why?
08.07.2025 11:26
Thanks! Yup, I hope to cover some fun computer vision applications. Stay tuned!
02.07.2025 07:35
Kids' summer camp just kicked off, and that means...
I finally have time to make new videos!
What topics are you most interested in right now?
01.07.2025 09:51
Why More Researchers Should be Content Creators
Just trying something new! I recorded one of my recent talks, sharing what I learned from starting as a small content creator.
youtu.be/0W_7tJtGcMI
We all benefit when there are more content creators!
24.06.2025 21:58
Fresh out of the oven! @jbhuang0604.bsky.social breaks down Mean Flow from Kaiming's group in his latest video.
Video: youtu.be/swKdn-qT47Q?...
19.06.2025 22:24
YouTube video by Jia-Bin Huang
Policy Gradient in One Minute
No time? I've got your back!
Check out Policy Gradient in One Minute!
youtu.be/p9k9YUdnNlk
Have fun!
20.06.2025 23:08
Policy gradient methods rock!
These are the core techniques for making your transformer "chat" and "reason", a robot manipulate objects, and a drone maneuver in a complex environment.
BUT, how do we learn all the developments in the past 30+ years?
20.06.2025 23:08
YouTube video by Jia-Bin Huang
One Step, Big Leap: The Simple Idea Transforming Generative AI
Check out the video to learn this new, elegant formulation of generative models!
youtu.be/swKdn-qT47Q
20.06.2025 16:09
Awesome! 🤩
So glad to hear the authors enjoyed the video, totally made my day!
20.06.2025 16:09
We had a blast at CVPR2025!
There was so much to learn! I am particularly excited to meet many new friends and reconnect with old ones.
I feel energized. Already looking forward to the next one!
17.06.2025 14:38
Thanks a lot!
04.06.2025 20:01
Kullback–Leibler (KL) divergence is a cornerstone of machine learning.
We use it everywhere, from training classifiers and distilling knowledge from models, to learning generative models and aligning LLMs.
BUT, what does it mean, and how do we (actually) compute it?
Video: youtu.be/tXE23653JrU
04.06.2025 14:58
My X/Twitter account has been hacked... Please don't believe what they said!
Trying to get it back in the meantime. Sorry for the inconvenience!
03.06.2025 18:11
How LLMs Learn to Reason with Reinforcement Learning
Full video: www.youtube.com/watch?v=mg-i...
21.05.2025 18:32
Ha! Yes, Seungjae insisted that we call this IVE.
21.05.2025 17:47
RL is so back!
Reinforcement learning is a key driver in aligning LLMs and enhancing their reasoning capabilities.
BUT, it's a tricky topic to wrap your head around (at least for me 😵‍💫).
So, I put up a video breaking down the basics in a way that clicked for me. I hope it helps you, too!
21.05.2025 17:14
I find TRPO's idea of learning from others' experiences fascinating.
So, I started running TRPO for my group, making all (previously individual) feedback on experiments, writing, rebuttals, and presentations public.
Now everyone gets to learn from each other's trajectories!
19.05.2025 14:29
Indeed!!
14.05.2025 17:18
🥺
14.05.2025 13:41
Imagine, Verify, Execute: Memory-guided Agentic Exploration with Vision-Language Models
IVE: Imagine, Verify, Execute: Agentic Exploration with Vision-Language Models
Brought to you by our amazing students Seungjae Lee, Daniel Ekpo, Haowen Liu, and faculty Furong Huang and Abhinav Shrivastava
Learn more at ive-robot.github.io
14.05.2025 13:33
IVE leverages VLMs to
• extract semantic scene graphs,
• imagine novel scenes,
• predict their physical plausibility, and
• generate executable sequences.
IVE is a memory-guided agentic exploration framework that operates fully automatically, enabling more diverse and meaningful exploration.
14.05.2025 13:33
Exploration is key for robots to generalize, especially in open-ended environments with vague goals and sparse rewards.
BUT, how do we go beyond random poking? Wouldn't it be great to have a robot that explores an environment just like a kid?
Introducing Imagine, Verify, Execute (IVE)!
14.05.2025 13:33
Yup, it's so much fun!
26.04.2025 22:57
Solving high-impact real-world problems with multimodal foundation models
26.04.2025 16:57
You are constantly leveling up your presentation!!
23.04.2025 12:39
Assistant Professor at UPenn. https://lingjie0206.github.io. Research interests: Neural Scene Representation, Neural Rendering, 3D Reconstruction, Human Performance Modeling and Capture.
Assistant Professor of the Generative Intelligence Lab at Carnegie Mellon University. Understanding and creating pixels. All the code and models are available at http://github.com/junyanz.
Professor at MIT CSAIL, leading the scene representation group (scenerepresentations.com). We are teaching AI to understand the world through perceiving and interacting with it.
Professor, Programmer in NYC.
Cornell, Hugging Face 🤗
Senior Research Manager at NVIDIA. Prev professor at TUM. Computer vision mostly. Views are my own.
Prof at Georgia Tech
https://faculty.cc.gatech.edu/~judy/
Machine Learning and Computer Vision Researcher
Asst. Prof. UNC Chapel Hill CS
Computer Vision & Graphics.
https://www.cs.unc.edu/~ronisen/
Distinguished Scientist at Google. Computational Imaging, Machine Learning, and Vision. Posts are personal opinions. May change or disappear over time.
http://milanfar.org
Assistant professor @ UMich EECS
Staff Research Scientist at Google - http://sniklaus.com/
Rice University, Associate Professor of Computer Science. Computer Vision, Multimodal AI, Deep Learning. Houston, Texas. Check our work at https://vislang.ai/
Professor for Visual Computing & Artificial Intelligence @TU Munich
Co-Founder @synthesiaIO
Co-Founder @SpAItialAI
https://niessnerlab.org/publications.html
Associate Professor, 3DAI Lab @ TU Munich
https://www.3dunderstanding.org/
Associate Professor of Computer Science at SLU. Computer vision and machine learning. Trying to do a bit of good in the world by looking at pixels.
Director, Max Planck Institute for Intelligent Systems; Chief Scientist Meshcapade; Speaker, Cyber Valley.
Building 3D humans.
https://ps.is.mpg.de/person/black
https://meshcapade.com/
https://scholar.google.com/citations?user=6NjbexEAAAAJ&hl=en&oi=ao
AI researcher at Google DeepMind. Synthesized views are my own.
SF Bay Area, http://jonbarron.info
This feed is a partial mirror of https://twitter.com/jon_barron
Researcher (OpenAI. Ex: DeepMind, Brain, RWTH Aachen), Gamer, Hacker, Belgian.
Anon feedback: https://admonymous.co/giffmana
Zürich, Suisse, http://lucasb.eyer.be
Professor of Computer Vision, @BristolUni. Senior Research Scientist @GoogleDeepMind - passionate about the temporal stream in our lives.
http://dimadamen.github.io
Associate Professor @ Cornell, Computer vision & machine learning