An autumnal stump, covered in mushrooms. This is a still from the interactive 3D reconstruction!
I've released a new version of my 3D reconstruction tool, Brush. It's a big step forward: the quality & speed now match gsplat, and there are a lot of other new features! See the release notes: github.com/ArthurBrusse...
Some of the new features:
30.01.2025 16:25
Physics with Veo! #veo2
Prompt: One ball is in the floor. Another ball comes rolling.
16.12.2024 21:22
Come learn about CAT3D today at #NeurIPS2024!
talk: 3:30pm West Exhibition Hall C, B3
poster: 4:30pm East Exhibit Hall #1610
remote: cat3d.github.io
we've got stickers
12.12.2024 22:58
SimVS: Simulating World Inconsistencies for Robust View Synthesis
Alex Trevithick, Roni Paiss, Philipp Henzler, Dor Verbin, Rundi Wu, Hadi Alzayer, @ruiqigao.bsky.social, @benmpoole.bsky.social, @jonbarron.bsky.social, @holynski.bsky.social, Ravi Ramamoorthi, Pratul P. Srinivasan
11.12.2024 06:14
Information in observed images that predicts what should be in unobserved regions should be captured by these models (e.g. x.com/DotCSV/statu...). They can't create new information, but there are often more statistical hints than we as humans perceive (like the paper you linked).
04.12.2024 20:20
A common question nowadays: Which is better, diffusion or flow matching?
Our answer: They're two sides of the same coin. We wrote a blog post to show how diffusion models and Gaussian flow matching are equivalent. That's great: It means you can use them interchangeably.
02.12.2024 18:45
eh, perception as probabilistic inference is a well established theory of the mind
02.12.2024 10:05
This is the essence of visual perception. We don't see pixels, we experience the world behind an image. Our aim is to build AI systems that can achieve this same level of spatial intelligence.
Inferring the whole from fragments requires statistical priors for both humans and machines.
01.12.2024 22:19
unlike most work on image/video editing that is not guaranteed to "follow rules" of 3D consistency, our work builds an explicit virtual world that you can move a virtual camera through. we use a statistical learning-based system, but we distill it into a 3D model that follows at least some 3D rules
01.12.2024 21:23
Check out CAT4D: our new paper that turns (text, sparse images, videos) => (dynamic 3D scenes)!
I can't get over how cool the interactive demo is.
Try it out for yourself on the project page: cat-4d.github.io
28.11.2024 02:52
Stop watching videos, start interacting with worlds.
Stoked to share CAT4D, our new method for turning videos into dynamic 3D scenes that you can move through in real-time!
cat-4d.github.io
arxiv.org/abs/2411.18613
28.11.2024 02:52
Google Chief Scientist, Gemini Lead. Opinions stated here are my own, not those of Google. Gemini, TensorFlow, MapReduce, Bigtable, Spanner, ML things, ...
AI researcher at Google DeepMind. Synthesized views are my own.
SF Bay Area | http://jonbarron.info
This feed is a partial mirror of https://twitter.com/jon_barron
Digital Cultures and Arts | UZH & ZHdK | operative images, synthetic media and visual culture
@digitalculturesandarts.ch
https://digitalculturesandarts.ch/
https://linktr.ee/bildoperationen
ML Engineer at NVIDIA. Previously: Stealth GPU startup; Stability AI; AMD; Autodesk; CEO of 2 startups (3D + AI). Toronto, Canada
Research Scientist @ Google DeepMind | 3D Computer Vision & Machine Learning
Design lead @ Google Labs
--- prev ---
YouTube GenAI & creator tools
Google ATAP
Google Home/Nest
Google Wifi & IoT
AI + security | Stanford PhD in AI & Cambridge physics | techno-optimism + alignment + progress + growth | 🇺🇸🇨🇿
Always pondering startups, ML, Rust, Python, and 3D printing.
Independent ML researcher consulting on LMs + data.
Previously: Salesforce Research, MetaMind, CommonCrawl, Harvard. π¦πΊ in SF. He/him.
Personal blog: https://state.smerity.com
Research Director at Google DeepMind
Generative Music Co-lead
g.co/magenta
building the future
research at midjourney, deepmind. slinging ai hot takes at artfintel.com
AI @ OpenAI, Tesla, Stanford
SVP of Open-Endedness at Lila Sciences. In the past: Maven CEO, Lead at OpenAI, head of basic/core research at Uber AI, professor at UCF.
Stuff I helped invent: NEAT, CPPNs, HyperNEAT, novelty search, POET, Picbreeder.
Book: Why Greatness Cannot Be Plann
Founder & executive & community builder & organizer & researcher
ML Collective (mlcollective.org)
Google DeepMind
rosanneliu.com
professor at university of washington and founder at csm.ai. computational cognitive scientist. working on social and artificial intelligence and alignment.
http://faculty.washington.edu/maxkw/
I work at Sakana AI (@sakanaai.bsky.social)
https://sakana.ai/careers