Soumith Chintala

@soumithchintala.bsky.social

Cofounded and lead PyTorch at Meta. Also dabble in robotics at NYU. AI is delicious when it is accessible and open-source. http://soumith.ch

10,284 Followers 895 Following 19 Posts Joined Jan 2024
11 months ago

A few months ago we quietly open-sourced a PyTorch video decoding library called torchcodec -- small, nimble, fast, supports GPU decoding via ffmpeg.

The Hugging Face folks had some nice things to say about it as they integrated it into LeRobot.

Check it out here: github.com/pytorch/torc...

45 4 0 0
1 year ago

the Aria 2 glasses are pretty great for robot data collection.
they're also getting really good for general agentic use...
Read the full announcement here: www.meta.com/blog/project...

13 2 1 0
1 year ago

i added an example here: bsky.app/profile/soum...

5 0 1 0
1 year ago

what I'm finding is that the models want to be more of an artist than a replacement for photoshop -- which is fine, but I want to be the artist here, and want the tool to be more of a "magically easier photoshop where I ask it what to do in detail, and it does that -- not more, not less"

9 0 2 0
1 year ago

i'll give a representative but not exact example:

change the color of X's shirt from blue to red: the generations often change the entire shirt style itself -- they don't respect how much and what I'm trying to change, and don't try to preserve details I ask to preserve

8 0 2 1
1 year ago

what are AI products that allow me to transform existing images, while preserving some selective details (that i select), like faces, areas, etc.?

the tools I've used so far only take the selection as a hint, or don't generate well around the selection.

trying personalized art

18 0 3 0
1 year ago

3. They've also made it easy to load MJCF and other common specs used in robotics. They've also made visualization work out of the box (they hacked up a hybrid of pyrender, pyglet and LuisaRender with a ton of their own patches).

4 0 0 0
1 year ago

2. The APIs are reasonably simple and well-designed, and they did take out the cross-platform pain in many ways -- CPU, CUDA, Metal etc. are all supported across Linux, macOS, Windows -- thanks to Taichi (and in small part to PyTorch).

3 0 1 0
1 year ago

1. It's nice that the internals are written with Taichi, so all the sim code is written in Python, more accessible and easier to read than retrofitting physics on top of a Tensor compiler (like MuJoCo did with MJX) and possibly faster because Taichi is a better-suited DSL / compiler.

1 0 1 0
1 year ago

The whole GenAI/LLM/VLM stuff seems to be unreleased or "aspirational".
My favorite aspects:

1 0 1 0
1 year ago

It's basically like MuJoCo but with more advanced materials/rendering/solvers, written all in Python thanks to being powered by Taichi, which makes it much more accessible.
I like it a lot. It's very accessible.
They went too far with marketing, but I'm willing to ignore it for now.

7 1 1 0
1 year ago
Genesis

i rabbit-holed into the Genesis Sim codebase because it went viral on X, and the website is hypey and unclear; and I didn't want to just blindly retweet.
genesis-embodied-ai.github.io
👇

40 3 1 0
1 year ago

also, congrats OpenAI on o3, and thank you for rapidly making progress on intelligence.

1 1 0 0
1 year ago

Models are dumb as a rock without the right context -- pretrained context doesn't help with day-to-day or specialized things.
Private ecosystems and company bureaucracies mean you have to feed the models your own context for the next X years... unless computer-use gets ready.
Can't wait for it!

6 1 1 0
1 year ago

intelligence is starting to get good, but context is still siloed for stupid reasons.
get models that do human-level computer-use already, please...!

12 2 1 0
1 year ago

Glean for personal/self-hosted: is there an open source / self-hosted project that integrates pulling context from gmail, docs, sheets, calendar, whatsapp, ig, imessage, etc.?

15 2 1 0
1 year ago

I'd like to introduce what I've been working on at @hellorobot.bsky.social: Stretch AI, a set of open-source tools for language-guided autonomy, exploration, navigation, and learning from demonstration.

Check it out: github.com/hello-robot/...

Thread ->

132 23 6 4
1 year ago

so much detail, it's incredible that you've gotten this deep....twice ☺️!!!

4 0 1 0
1 year ago

hi sup!

3 0 0 0
1 year ago

Very excited about this new project, DynaMem. It allows our robots to function in previously unseen environments, performing long-horizon manipulation tasks. Most importantly it *generalizes*, meaning you can try it out on a wide variety of homes and on different objects. (4x video)

31 6 2 1
1 year ago

New here? Interested in AI/ML? Check out these great starter packs!

AI: go.bsky.app/SipA7it
RL: go.bsky.app/3WPHcHg
Women in AI: go.bsky.app/LaGDpqg
NLP: go.bsky.app/SngwGeS
AI and news: go.bsky.app/5sFqVNS

You can also search all starter packs here: blueskydirectory.com/starter-pack...

553 212 67 55
1 year ago

what are good starter packs for: AI researchers, AI Systems people, GenAI hackers, LLM enthusiasts?

44 2 7 0