Have you ever wondered how to train an autoregressive generative transformer on text and raw pixels, without a pretrained visual tokenizer (e.g. VQ-VAE)?
We have been pondering this during summer and developed a new model: JetFormer
arxiv.org/abs/2411.19722
A thread
1/
02.12.2024 16:41
SfM failing on dynamic videos? RoMo to the rescue! Our simple method uses epipolar cues and semantic features for robustly estimating motion masks, boosting dynamic SfM performance. Plus, a new dataset of dynamic scenes with ground truth cameras! #computervision
03.12.2024 06:05
Research Scientist @GoogleDeepMind. Representation learning for multimodal understanding and generation.
mitscha.github.io
Blog: https://sander.ai/
🐦: https://x.com/sedielem
Research Scientist at Google DeepMind (WaveNet, Imagen 3, Veo, ...). I tweet about deep learning (research + software), music, generative models (personal account).
Recently a principal scientist at Google DeepMind. Joining Anthropic. Most (in)famous for inventing diffusion models. AI + physics + neuroscience + dynamical systems.
AI/generative artist. Writes her own code. Absolute power is a door into dreaming.
works @ runway
www.ethanrosenthal.com
Internet pedestrian. ✨Content creator✨ Machine learning mercenary. (he/him/his)
https://laurent-dinh.github.io/
Machine Learning Researcher
https://alexalemi.com
https://blog.alexalemi.com
I do SciML + open source!
ML+proteins @ http://Cradle.bio
Neural ODEs: http://arxiv.org/abs/2202.02435
JAX ecosystem: http://github.com/patrick-kidger
Prev. Google, Oxford
Zürich, Switzerland
Research scientist at FAIR NY ❤️ Machine Learning + Information Theory. Previously, PhD at UoAmsterdam, intern at DeepMind + MSRC.
http://blog.christianperone.com, Machine Learning, Computer Science and Math. Staff ML Research Engineer working with imitation learning and planning for Autonomous Vehicles. London/UK.
Mostly: ML for music production workflows.
Professor of Physics & Senior Data Fellow at Belmont University, Nashville TN
Head of Research for Hyperstate Music AI.
Teacher of audio engineers. Opinions my own.
Explainer blog: https://drscotthawley.github.io
senior research scientist at Google | author of DreamBooth
https://natanielruiz.github.io/
Post-doc @UniofOxford w/ @mmbronstein.bsky.social. Into Geometry ∩ Generative Models. @mila-quebec.bsky.social Affiliate member. PhD from @mila-quebec.bsky.social / McGill.
website: https://joeybose.github.io/
Associate Professor in EECS at MIT. Neural nets, generative models, representation learning, computer vision, robotics, cog sci, AI.
https://web.mit.edu/phillipi/
Generative AI and computer graphics at Aalto University & NVIDIA Research. @ellis.eu Fellow. https://users.aalto.fi/~lehtinj7
Ph.D. student on generative models and domain adaptation for Earth observation 🛰
Previously intern @SonyCSL, @Ircam, @Inria
Personal website: https://lebellig.github.io/
AI Researcher at the Samsung SAIT AI Lab
I build generative models for images, videos, text, tabular data, NN weights, molecules, and now video games!
Research Scientist at Google DeepMind. Working on Gemini reasoning models.
PhD from UofT and Vector Institute
www.paulvicol.com
ML Research @ Apple.
Understanding deep learning (generalization, calibration, diffusion, etc).
preetum.nakkiran.org