Eric Schmidt got a standing ovation from the TED audience this morning.
An absolute pleasure to interview him on the red circle.
We dove into the big questions: superintelligence, national strategy, open source, and what it means to be human in the age of AI.
One for the books.
12.04.2025 00:52 | 👍 2 🔁 0 💬 0 📌 0
TikTok ban imminent, yet funny how things change.
>2020: Stressed about TikTok drama at 120K subs.
>2024: Sitting at 994K and completely unfazed.
Ban it? Cool, I'll build elsewhere. Keep it? Roger that, I'll double down.
The game is bigger than any one app. Who cares about vanity metrics.
14.01.2025 02:47 | 👍 5 🔁 0 💬 1 📌 0
Merry Christmas y'all!
Pictured: 3D scan vs. ground truth of the feast to follow
25.12.2024 21:24 | 👍 7 🔁 0 💬 0 📌 0
Omnidirectional 3D video of reality: damn near teleportation in a VR headset.
This $17,000 VR camera released in 2017 was ahead of its time. 17 cameras → cloud stitching → 8K x 8K stereo VR video.
The moment is ripe for a new 4D capture rig optimized for dynamic 3D Gaussians. Anyone building one?
16.12.2024 00:15 | 👍 14 🔁 0 💬 1 📌 0
Stereo4D: Learning How Things Move in 3D from Internet Stereo Videos
Use stereo videos from the internet to create a dataset of over 100,000 real-world 4D scenes with metric scale and long-term 3D motion trajectories.
It's one of those through lines when tackling a timeless mission like mapping the world or spatial computing - VR content created for immersion becoming the foundation for teaching machines to understand how the world moves. Sometimes innovation chains together in unexpected ways! stereo4d.github.io
15.12.2024 14:29 | 👍 1 🔁 0 💬 1 📌 0
And given we're dealing with real stereoscopic content, results are notably better than synthetic data, giving you a faithful rendition of the real world with a diverse set of subject matter.
15.12.2024 14:29 | 👍 1 🔁 0 💬 1 📌 0
They're using it to train this model called DynaDUSt3R that can predict both 3D structure and motion from video frames. Which means it tracks how objects move between frames while simultaneously reconstructing their 3D shape.
15.12.2024 14:29 | 👍 2 🔁 0 💬 1 📌 0
It was always clear that stereo datasets would be valuable -- and we launched some cool VR tools with it back in 2017 (link below). But the game changer now in 2024 is the scale -- they're providing 110K clips :-) That's the kind of massive, real-world dataset that was just a dream in those days!
15.12.2024 14:29 | 👍 2 🔁 0 💬 1 📌 0
Check out this Stereo4D paper from Google DeepMind. It's a pretty clever approach to a persistent problem in computer vision -- getting good training data for how things move in 3D. The key insight is using VR180 videos -- those stereo fisheye videos we launched back in 2017 for YouTubeVR 🧵
15.12.2024 14:29 | 👍 12 🔁 1 💬 1 📌 0
The future isn't just virtual or augmented; it's ambient and intelligent
The Google XR unlocked event in NYC
14.12.2024 19:36 | 👍 2 🔁 0 💬 0 📌 0
5. Image to video (remix) feature is cool, but CLEARLY needs UI like Kling/Runway motion paint so it isn't a chaotic mess / constant game of slot machine AI
Will be interesting to do head-to-head comparisons of US and Chinese models once Sora goes live.
09.12.2024 16:52 | 👍 2 🔁 0 💬 0 📌 0
3. Physics still very wonky (no magic fix yet): rhino is moving all across the ground; phones appear/disappear like it's a magician
4. Wow is there a lot of news footage in the training data: generated night-time grainy footage is no problem at all
09.12.2024 16:52 | 👍 1 🔁 0 💬 1 📌 0
1. Sora is VERY good at generating high-frequency detail (video doesn't seem blurry at all); it's the most impressive quality to me
2. As expected, Sora is great at well-imaged landmarks; AI's ability to generate custom "stock" footage remains promising
09.12.2024 16:52 | 👍 1 🔁 0 💬 1 📌 0
MKBHD dropped his review of OpenAI Sora, the much-hyped AI video model, after a week of testing.
5 immediate observations:
09.12.2024 16:52 | 👍 2 🔁 0 💬 1 📌 0
YouTube video by Creative Tech Digest
AI Just Changed 3D Forever: Genie 2, World Labs, CAT4D
The future of 3D AI took some serious leaps -- from single images to fully interactive, dynamic 3D worlds. Here's what's cooking at the cutting edge: youtu.be/T7bcYSSSC6s
07.12.2024 21:21 | 👍 2 🔁 0 💬 0 📌 0
Wav2lip can FINALLY rest in peace. Being able to retarget the facial performance of characters in *existing* live action & CG video makes Act-One an extremely useful tool for all types of creators.
Nicely done RunwayML!
06.12.2024 16:25 | 👍 1 🔁 0 💬 0 📌 0
the entire bay area quaked hearing that chatgpt pro is gonna cost $200/month
05.12.2024 19:13 | 👍 4 🔁 0 💬 0 📌 0
Very cool! Would love to see a workflow breakdown
05.12.2024 03:28 | 👍 0 🔁 0 💬 0 📌 0
Genie 2: A large-scale foundation world model
Generating unlimited diverse training environments for future general agents
The race for building the biggest, baddest world model is very much on. Meanwhile, all I can think is "if only Stadia was still around!"
Check out the various results (and some fun outtakes) below: deepmind.google/discover/blo...
04.12.2024 17:07 | 👍 0 🔁 0 💬 0 📌 0
Not quite ready for prime time, but promising on two fronts:
1. For game developers: enabling rapid prototyping of interactive experiences straight from concept art
2. For AI research: providing unlimited, diverse 3D environments for training and testing AI agents
04.12.2024 17:07 | 👍 2 🔁 0 💬 1 📌 0
Right now Genie 2 can generate consistent worlds for up to a minute. And this world model seems to generate larger 3D worlds than what World Labs showcased yesterday. Plus they're dynamic vs. static worlds: the foliage moves in the wind, the water ripples, etc.
04.12.2024 17:07 | 👍 3 🔁 0 💬 1 📌 0
Imagine making 2D concept art for a game world, pressing a button, and suddenly you can walk around an interactive 3D world. That's what Google DeepMind's new paper Genie 2 can do: simulate virtual worlds, including the consequences of any action (e.g. unlock a door, jump, swim, etc.).
04.12.2024 17:07 | 👍 8 🔁 0 💬 2 📌 0
It's the same reason people browse Zillow houses or watch shows about mansions. AI or not, software reviews simply don't hit the same.
04.12.2024 01:16 | 👍 1 🔁 0 💬 1 📌 0
Observed: All mega popular tech creators focus on hardware; there's no MKBHD for software. It's literally called "Unbox Therapy" for a reason. Even if people won't buy the devices, there's something about vicariously living through that tech review experience.
04.12.2024 01:16 | 👍 2 🔁 0 💬 1 📌 0
Tencent's open-weights Hunyuan Video 13B model looks impressive. Oh, and image-to-video and facial performance? They're coming too.
If 2024 was the year open-source LLMs caught up with closed-source AI, 2025 will be the year open-source video catches up.
03.12.2024 16:58 | 👍 5 🔁 0 💬 1 📌 0
World Labs' first demo dropped, and it's consistent 3D worlds from a single 2D image.
Decent volume size to move around in; def a big step up from the RGB + depth 360 environments we're used to, e.g. Blockade Labs.
Stylized results look good; I'd love to see more photorealistic AI generations!
02.12.2024 17:21 | 👍 11 🔁 2 💬 1 📌 1
Google 2.5D temporal data, very nice.
30.11.2024 21:28 | 👍 6 🔁 1 💬 0 📌 0
Augmented reality x-ray vision to "see through" concrete.
Your infrastructure won't just be scanned; it'll be anchored to reality.
Demo: Pix4D reality capture with precise geospatial localization.
29.11.2024 00:54 | 👍 3 🔁 0 💬 0 📌 0
Video compression is pretty bad compared to X and Threads too
28.11.2024 04:00 | 👍 3 🔁 0 💬 0 📌 0
Forensic 3D Animation. VFX. Generative AI.
XR/VR dev. IMDB: http://imdb.com/name/nm0960409…
Program Manager ML & AI @ Google Research | Ex-Google Brain. Speaker (FR/EN)
Abdoulaye.ai
Opinions are my own. He/His
A lot of my retweets and likes are for bookmarking purposes.
Accra, Ghana
(cover photo: Dar es Salaam, circa 2015)
Writer mostly. Reader often. Works in AI. Plays in the PNW.
Joined Twitter in June 2006 and was user #314. Keynote speaker at Adobe's Worldwide Conference. Featured in NYT. Beta partner w/ Google, YouTube, FB, Snap, Insta, and X. Musician. English major. #RIPGoogleReader
https://ethanbholland.com/about/
Film direction & production. Movies: TIMESCAPE, BATTLE FOR TERRA, ALIEN INTERVIEW. Awards: Sundance, Hollywood, Clio.
AI, national security, China. Part of the founding team at @csetgeorgetown.bsky.social (opinions my own). Author of Rising Tide on substack: helentoner.substack.com
Co-founder: BlackBox Infinite.
Vision+Design for the next era of digital.
30+ MCU films, Future-Tech, Hypercars.
BlackBoxInfinite.com
Rincewind-in-Residence, University of Birmingham https://www.stephengriffin.org/
I love technology and have recently moved to the terminal. Rabbit Hole is my middle name
https://www.outatime-podcast.de/
Global Fellow Singularity University
Founder @together.online. All things spatial computing, software engineering and systems thinking. Prev: Spatial, Google AR/VR. MIT alum. I know way too much about computers
NYC · descioli.com
Into creative ML/AI, NLP, data science and digital humanities, narrative, infovis, games, sf & f. Consultant, ds in Residence at Google Arts & Culture. (Lyon, FR) Newsletter arnicas.substack.com.
data janitor @ alignmentlab.ai, chief raccoon @ rakun.ai
Prev: orbitinsure, openmeds, drgupta.ai
Design lead @ Google Labs 🧪
─── prev ───
✨ YouTube GenAI & creator tools
Google ATAP
Google Home/Nest
Google Wifi & IoT
I read a lot of research.
Currently reading: https://dmarx.github.io/papers-feed/
Statistical Learning
Information Theory
Ontic Structural Realism
Morality As Cooperation
Epistemic Justice
YIMBY, UBI
Research MLE, CRWV
Frmr FireFighter
🇫🇷 Entrepreneur & Creator • 10 years building AR×VR & consumer apps 👩‍🔬 • prev: CTO Opuscope, dev vlc · @microsoft Regional Director awardee
Other interests: Climate, Nuke & Renewables • 🛰️ Space • 🏔️ Mountains