Ben Poole benmpoole - Bluesky Statics

An autumnal stump, covered in mushrooms. This is a still from the interactive 3D reconstruction!

I’ve released a new version of my 3D reconstruction tool, Brush 🖌️ It's a big step forward - the quality & speed now match gsplat, and there are a lot of other new features! See the release notes: github.com/ArthurBrusse...

Some of the new features:

30.01.2025 16:25 — 👍 26 🔁 8 💬 2 📌 0

Physics with Veo! #veo2

Prompt: One ball is in the floor. Another ball comes rolling.

16.12.2024 21:22 — 👍 7 🔁 1 💬 3 📌 0

Come learn about CAT3D today at #NeurIPS2024!
talk: 3:30pm West Exhibition Hall C, B3
poster: 4:30pm East Exhibit Hall #1610
remote: cat3d.github.io

we've got stickers 😸

12.12.2024 22:58 — 👍 11 🔁 0 💬 1 📌 0

SimVS: Simulating World Inconsistencies for Robust View Synthesis

Alex Trevithick, Roni Paiss, Philipp Henzler, Dor Verbin, Rundi Wu, Hadi Alzayer, @ruiqigao.bsky.social, @benmpoole.bsky.social, @jonbarron.bsky.social, @holynski.bsky.social, Ravi Ramamoorthi, Pratul P. Srinivasan

11.12.2024 06:14 — 👍 12 🔁 2 💬 1 📌 0

Information in observed images that predicts what should be in unobserved regions should be captured by these models (e.g. x.com/DotCSV/statu...). They can't create new information, but there are often more statistical hints than we as humans perceive (like the paper you linked).

04.12.2024 20:20 — 👍 0 🔁 0 💬 0 📌 0

A common question nowadays: Which is better, diffusion or flow matching? 🤔

Our answer: They’re two sides of the same coin. We wrote a blog post to show how diffusion models and Gaussian flow matching are equivalent. That’s great: It means you can use them interchangeably.

02.12.2024 18:45 — 👍 254 🔁 58 💬 6 📌 7

eh, perception as probabilistic inference is a well-established theory of the mind

02.12.2024 10:05 — 👍 1 🔁 0 💬 1 📌 0

This is the essence of visual perception. We don't see pixels, we experience the world behind an image. Our aim is to build AI systems that can achieve this same level of spatial intelligence.

Inferring the whole from fragments requires statistical priors for both humans and machines.

01.12.2024 22:19 — 👍 2 🔁 0 💬 1 📌 0

unlike most work on image/video editing that is not guaranteed to "follow rules" of 3D consistency, our work builds an explicit virtual world that you can move a virtual camera through. we use a statistical learning-based system, but we distill it into a 3D model that follows at least some 3D rules

01.12.2024 21:23 — 👍 3 🔁 0 💬 1 📌 0

Check out CAT4D: our new paper that turns (text, sparse images, videos) => (dynamic 3D scenes)!

I can't get over how cool the interactive demo is.

Try it out for yourself on the project page: cat-4d.github.io

28.11.2024 02:52 — 👍 63 🔁 14 💬 1 📌 1

Stop watching videos, start interacting with worlds.

Stoked to share CAT4D, our new method for turning videos into dynamic 3D scenes that you can move through in real-time!
cat-4d.github.io
arxiv.org/abs/2411.18613

28.11.2024 02:52 — 👍 90 🔁 13 💬 2 📌 5

Posts by Ben Poole (@benmpoole.bsky.social)