
Ben Poole

@benmpoole.bsky.social

research scientist at google deepmind. phd in neural nonsense from stanford. poolio.github.io

1,815 Followers  |  231 Following  |  6 Posts  |  Joined: 21.11.2024

Latest posts by benmpoole.bsky.social on Bluesky

An autumnal stump, covered in mushrooms. This is a still from the interactive 3D reconstruction!

I've released a new version of my 3D reconstruction tool, Brush 🖌️ It's a big step forward - the quality & speed now match gsplat, and there's a lot of other new features! See the release notes github.com/ArthurBrusse...

Some of the new features:

30.01.2025 16:25 | 👍 25    🔁 8    💬 2    📌 0

Physics with Veo! #veo2

Prompt: One ball is in the floor. Another ball comes rolling.

16.12.2024 21:22 | 👍 7    🔁 1    💬 3    📌 0

Come learn about CAT3D today at #NeurIPS2024!
talk: 3:30pm West Exhibition Hall C, B3
poster: 4:30pm East Exhibit Hall #1610
remote: cat3d.github.io

we've got stickers 😸

12.12.2024 22:58 | 👍 11    🔁 0    💬 1    📌 0

SimVS: Simulating World Inconsistencies for Robust View Synthesis

Alex Trevithick, Roni Paiss, Philipp Henzler, Dor Verbin, Rundi Wu, Hadi Alzayer, @ruiqigao.bsky.social, @benmpoole.bsky.social, @jonbarron.bsky.social, @holynski.bsky.social, Ravi Ramamoorthi, Pratul P. Srinivasan

11.12.2024 06:14 | 👍 11    🔁 2    💬 1    📌 0

Information in observed images that predicts what should be in unobserved regions should be captured by these models (e.g. x.com/DotCSV/statu...). They can't create new information, but there are often more statistical hints than we as humans perceive (like the paper you linked).

04.12.2024 20:20 | 👍 0    🔁 0    💬 0    📌 0

A common question nowadays: Which is better, diffusion or flow matching? 🤔

Our answer: They're two sides of the same coin. We wrote a blog post to show how diffusion models and Gaussian flow matching are equivalent. That's great: It means you can use them interchangeably.

02.12.2024 18:45 | 👍 254    🔁 58    💬 6    📌 7
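
One way to see the equivalence claimed in the post above, as a minimal sketch and not necessarily the argument made in the blog post itself: assume the linear Gaussian path x_t = (1 - t) * x0 + t * eps, with t = 0 as data and t = 1 as noise. Under that assumed schedule, a deterministic DDIM update driven by a noise prediction and an Euler step of the flow matching ODE with velocity v = eps - x0 give the same next sample. The function names and the schedule are illustrative assumptions, and `eps_hat` stands in for the output of any network.

```python
import random


def ddim_step(x_t, eps_hat, t, s):
    """Deterministic DDIM update from time t to time s.

    Assumes the schedule alpha_t = 1 - t, sigma_t = t (an illustrative choice).
    """
    x0_hat = (x_t - t * eps_hat) / (1.0 - t)  # clean-sample estimate implied by eps_hat
    return (1.0 - s) * x0_hat + s * eps_hat   # re-noise the estimate to level s


def flow_euler_step(x_t, eps_hat, t, s):
    """Euler step of the flow ODE dx/dt = v, with v = eps - x0 on the same path."""
    x0_hat = (x_t - t * eps_hat) / (1.0 - t)
    v_hat = eps_hat - x0_hat                  # velocity implied by the same prediction
    return x_t + (s - t) * v_hat


def max_gap(trials=1000, seed=0):
    """Largest absolute difference between the two updates over random inputs."""
    rng = random.Random(seed)
    gap = 0.0
    for _ in range(trials):
        t = rng.uniform(0.05, 0.95)           # current time, kept away from t = 1
        s = rng.uniform(0.0, t)               # target time, closer to data
        x_t = rng.gauss(0.0, 1.0)
        eps_hat = rng.gauss(0.0, 1.0)         # stand-in for a model's noise prediction
        diff = ddim_step(x_t, eps_hat, t, s) - flow_euler_step(x_t, eps_hat, t, s)
        gap = max(gap, abs(diff))
    return gap


if __name__ == "__main__":
    print(max_gap())  # difference is at floating-point round-off level
```

Expanding the Euler step gives x_t + (s - t) * (eps_hat - x0_hat) = (1 - s) * x0_hat + s * eps_hat, which is exactly the DDIM expression, so the two samplers differ only in how the prediction is parameterized under this assumed schedule.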

eh, perception as probabilistic inference is a well established theory of the mind

02.12.2024 10:05 | 👍 1    🔁 0    💬 1    📌 0

This is the essence of visual perception. We don't see pixels, we experience the world behind an image. Our aim is to build AI systems that can achieve this same level of spatial intelligence.

Inferring the whole from fragments requires statistical priors for both humans and machines.

01.12.2024 22:19 | 👍 2    🔁 0    💬 1    📌 0

unlike most work on image/video editing that is not guaranteed to "follow rules" of 3D consistency, our work builds an explicit virtual world that you can move a virtual camera through. we use a statistical learning-based system, but we distill it into a 3D model that follows at least some 3D rules

01.12.2024 21:23 | 👍 3    🔁 0    💬 1    📌 0

Check out CAT4D: our new paper that turns (text, sparse images, videos) => (dynamic 3D scenes)!

I can't get over how cool the interactive demo is.

Try it out for yourself on the project page: cat-4d.github.io

28.11.2024 02:52 | 👍 63    🔁 14    💬 1    📌 1

Stop watching videos, start interacting with worlds.

Stoked to share CAT4D, our new method for turning videos into dynamic 3D scenes that you can move through in real-time!
cat-4d.github.io
arxiv.org/abs/2411.18613

28.11.2024 02:52 | 👍 90    🔁 13    💬 2    📌 5
