A few months ago we quietly open-sourced a PyTorch video decoding library called torchcodec -- small, nimble, fast, supports GPU decoding via ffmpeg.
The Hugging Face folks had some nice things to say about it as they integrated it into LeRobot.
Check it out here: github.com/pytorch/torc...
the Aria 2 glasses are pretty great for robot data collection.
they're also getting really good for general agentic use...
Read the full announcement here: www.meta.com/blog/project...
i added an example here: bsky.app/profile/soum...
what I'm finding is that the models want to be more of an artist than a replacement for Photoshop -- which is fine, but I want to be the artist here, and I want the tool to be more of a "magically easier Photoshop" where I ask it what to do in detail, and it does that -- no more, no less
i'll give a representative but not exact example:
change the color of X's shirt from blue to red: the generations often change the entire shirt style itself -- they don't respect how much and what I'm trying to change, and don't try to preserve details I ask to preserve
what are AI products that allow me to transform existing images, while preserving some selective details (that i select), like faces, areas, etc.?
the tools I've used so far only take the selection as a hint, or don't generate well around the selection.
trying personalized art
3. They've also made it easy to load MJCF and other common specs used in robotics. They've also made visualization work out of the box (they hacked up a hybrid of pyrender, pyglet and LuisaRender with a ton of their own patches).
2. The APIs are reasonably simple and well-designed, and they did take out the cross-platform pain in many ways -- CPU, CUDA, Metal etc. are all supported across Linux, macOS, Windows -- thanks to Taichi (and, to a small extent, PyTorch).
1. It's nice that the internals are written with Taichi, so all the sim code is written in Python, which is more accessible and easier to read than retrofitting physics on top of a tensor compiler (like MuJoCo did with MJX), and possibly faster because Taichi is a better-suited DSL / compiler.
The whole GenAI/LLM/VLM stuff seems to be unreleased or "aspirational".
My favorite aspects:
It's basically like MuJoCo but with more advanced materials/rendering/solvers, written all in Python thanks to being powered by Taichi, which makes it much more accessible.
I like it a lot. It's very accessible.
They went too far with marketing, but I'm willing to ignore it for now.
i rabbit-holed into the Genesis Sim codebase because it went viral on X, and the website is hypey and unclear; and I didn't want to just blindly retweet.
genesis-embodied-ai.github.io
also, congrats OpenAI on O3, and thank you for rapidly making progress on intelligence.
Models are dumb as a rock without the right context -- pretrained context doesn't help with day-to-day or specialized things.
Private ecosystems and company bureaucracies mean you have to feed the models your own context for the next X years... unless computer-use gets ready.
Can't wait for it!
intelligence is starting to get good, but context is still siloed for stupid reasons.
get models that do human-level computer-use already, please...!
Glean for personal/self-hosted: is there an open source / self-hosted project that integrates pulling context from gmail, docs, sheets, calendar, whatsapp, ig, imessage, etc.?
I'd like to introduce what I've been working on at @hellorobot.bsky.social: Stretch AI, a set of open-source tools for language-guided autonomy, exploration, navigation, and learning from demonstration.
Check it out: github.com/hello-robot/...
Thread ->
so much detail, it's incredible that you've gotten this deep... twice ☺️!!!
hi sup!
Very excited about this new project, DynaMem. It allows our robots to function in previously unseen environments, performing long-horizon manipulation tasks. Most importantly it *generalizes*, meaning you can try it out on a wide variety of homes and on different objects. (4x video)
New here? Interested in AI/ML? Check out these great starter packs!
AI: go.bsky.app/SipA7it
RL: go.bsky.app/3WPHcHg
Women in AI: go.bsky.app/LaGDpqg
NLP: go.bsky.app/SngwGeS
AI and news: go.bsky.app/5sFqVNS
You can also search all starter packs here: blueskydirectory.com/starter-pack...
what are good starter packs for: AI researchers, AI Systems people, GenAI hackers, LLM enthusiasts?