YouTube video by Tesla
This Robot Sucks
Is anyone else baffled by this? Why build this if you have the Optimus robot? Based on the teleoperation + current behavior cloning paradigm, it should be exceedingly trivial to learn the repetitive sequence of motions to handle cleaning this car. What am i missing?
m.youtube.com/watch?v=vVFX...
12.02.2025 16:33 โ ๐ 1 ๐ 0 ๐ฌ 0 ๐ 0
FSD not great in SF.
1. Doesnโt follow taxi + bus lanes
2. Doesnโt understand streets have a lane thatโs sometimes parking sometimes driving
3. Always messes up left turns onto van ness
4. Always have to intervene for Valencia/market intersection
10.02.2025 22:03 โ ๐ 1 ๐ 0 ๐ฌ 0 ๐ 0
Also, the technique makes robotics more accessible while trading understandability. People find that useful in other tools (llm, gen image models). This makes me think this will be useful also.
10.02.2025 21:25 โ ๐ 1 ๐ 0 ๐ฌ 0 ๐ 0
DexGen
I was wrong about this. My change of heart was inspired by dexgen (zhaohengyin.github.io/dexteritygen/). They use BC for course motion, RL for fine grained. Instant policy could just do both in a single shot. Instant policy would be great for tele-op
10.02.2025 21:18 โ ๐ 0 ๐ 0 ๐ฌ 1 ๐ 0
Deepseek r1 proves Yann LeCunโs thesis that models need physical grounding
Deepseek: good prior + easy verification = ๐ฅ.
Good prior = video
easy verification = robotics in physical world
09.02.2025 23:33 โ ๐ 0 ๐ 0 ๐ฌ 0 ๐ 0
If time travel existed, wouldnโt we have met future us by now? Maybe itโs a beaconโyou can only travel back to when it was first turned on. No paradox, just a switch we havenโt flipped yet.
09.02.2025 18:43 โ ๐ 0 ๐ 0 ๐ฌ 0 ๐ 0
Goal: vlm capable of planning + action
Problem: no data
Solution:
Bootstrap intelligence from vlm
1. Start with off the shelf vlm
2. Collect rollouts from code as policy from vlm for a set of tasks
3. GRPO over rollouts
4. Goto 2
5. Offline RL over vlm for direct obs->act
01.02.2025 22:06 โ ๐ 1 ๐ 0 ๐ฌ 0 ๐ 0
Letโs revisit code as policy in the deepseek-r1 era
01.02.2025 21:55 โ ๐ 1 ๐ 0 ๐ฌ 1 ๐ 0
Why is instant policy better than using point cloud matching/alignment approaches? Can we not just sprinkle genAI on everything without making a clear cut case for why scaling a method with DL is better than classic approach?
30.01.2025 20:36 โ ๐ 1 ๐ 0 ๐ฌ 1 ๐ 0
This btw is just regular old RL + curriculum learning. However, unlike control RL where we learn on the test set, here we have diff envs from train and test time.
30.01.2025 18:32 โ ๐ 0 ๐ 0 ๐ฌ 0 ๐ 0
Recipe.
1. Start with weak base + problems that range from really simple to really hard
3. Sort problems by how well model does on them
4. Pick 75% problems model can do, 25% it canโt. RL with GRPO. Use 10x rollouts + high temp on 25%
5. Repeat step 3 till all problems solved
30.01.2025 18:30 โ ๐ 0 ๐ 0 ๐ฌ 1 ๐ 0
Been seeing โcompetence thresholdโ take hold as an idea. High level: for a model to benefit from RL, it has to have some minimal competence. My view is, competence threshold is a function of the number of rollouts per question. Larger the number of rollouts, less competent the base model has to be.
30.01.2025 18:18 โ ๐ 1 ๐ 0 ๐ฌ 1 ๐ 0
Rerun is the best ml logger Iโve used. Miles better than wandb and tensorboard.
29.12.2024 19:05 โ ๐ 0 ๐ 0 ๐ฌ 0 ๐ 0
Turns out problem was too low resolution. Going higher res fixed the problem.
26.12.2024 22:43 โ ๐ 1 ๐ 0 ๐ฌ 0 ๐ 0
On second thought, rectified flow doesnโt help with this problem. Need to look for something different
26.12.2024 04:25 โ ๐ 0 ๐ 0 ๐ฌ 1 ๐ 0
The samples look like the mean of the input. Suspicion that the similar examples result in a nullcline in velocity when t -> 1. Rectified flow should fix this. But, I hate the multi stage process of reflow.
25.12.2024 21:36 โ ๐ 1 ๐ 0 ๐ฌ 1 ๐ 0
Anyone know how to improve sample diversity with flow matching when examples are really similar?
25.12.2024 20:37 โ ๐ 1 ๐ 0 ๐ฌ 1 ๐ 0
Why write clean code for ml? Took me ten min to convert a video diffusion training library to flow matching.
23.12.2024 18:39 โ ๐ 2 ๐ 0 ๐ฌ 0 ๐ 0
Pre-LN transformer layers donโt work for perceivers with fixed encodings
22.12.2024 16:57 โ ๐ 1 ๐ 0 ๐ฌ 0 ๐ 0
Declaring the death of model scaling is premature.
Regardless of whether model scaling will continue, industry leadersโ flip flopping on this issue shows the folly of trusting their forecasts. They are not significantly better informed than the rest of us, and their narratives are heavily influenced by their vested interests.
Inference scaling is real, and there is a lot of low-hanging fruit, which could lead to rapid capability increases in the short term. But in general, capability improvements from inference scaling will likely be both unpredictable and unevenly distributed among domains.
The connection between capability improvements and AIโs social or economic impacts is extremely weak. The bottlenecks for impact are the pace of product development and the rate of adoption, not AI capabilities.
New AI Snake Oil essay: Last month the AI industry's narrative suddenly flipped โ model scaling is dead, but "inference scaling" is taking over. This has left people outside AI confused. What changed? Is AI capability progress slowing? We look at the evidence. ๐งต www.aisnakeoil.com/p/is-ai-prog...
19.12.2024 12:16 โ ๐ 119 ๐ 37 ๐ฌ 3 ๐ 9
Keeping functionality in modular libraries helps everyone doesnโt have to rewrite everything. Least cohesion of modules is a great rule of thumb.
18.12.2024 21:11 โ ๐ 1 ๐ 0 ๐ฌ 0 ๐ 0
Next scale prediction demonstrates how to explicitly simulate CNNs using transformers.
16.12.2024 03:03 โ ๐ 0 ๐ 0 ๐ฌ 0 ๐ 0
Thatโs the best board Iโve ever see
14.12.2024 02:11 โ ๐ 1 ๐ 0 ๐ฌ 1 ๐ 0
A self driving car dataset has examples going left around a tree and right around a tree. The algorithm averages the two and goes straight into the tree. Hallucination.
05.12.2024 01:01 โ ๐ 1 ๐ 0 ๐ฌ 0 ๐ 0
The reward signal here would be if the user accepts some task completion or not. Youโre fine tuning to improve the success rate of the agents taking into account the multi-turn decision making nature of agents.
30.11.2024 01:06 โ ๐ 3 ๐ 0 ๐ฌ 0 ๐ 0
I agree that you can do a lot with the foundation. But you could always do more. Offline RL techniques can make this happen. A good number of them make the mdp assumption.
30.11.2024 01:04 โ ๐ 1 ๐ 0 ๐ฌ 1 ๐ 0
My assumption here is that most ppl working on this are fine tuning to make agents. They have some reward dataset. Under POMDP, they could use a PPO library that works well in sequential decision making. With this, you avoid writing any alg code.
29.11.2024 19:17 โ ๐ 0 ๐ 0 ๐ฌ 1 ๐ 0
Engineering/product driven. Many existing decision making algorithms are based on POMDPs. Itโs faster to use existing libraries to sketch promising agent use cases.
29.11.2024 18:36 โ ๐ 1 ๐ 0 ๐ฌ 1 ๐ 0
They still have a decision making casual structure - the future canโt influence the past.
29.11.2024 17:25 โ ๐ 2 ๐ 0 ๐ฌ 1 ๐ 0
machine learning, causal inference, healthcare - assistant professor in dep. of Data Science Methods, Julius Center, of University Medical Center Utrecht, the Netherlands; wvanamsterdam.com
๐ Co-founder prometheus.io
๐ข Founder promlabs.com and promcon.io
๐จ๐ผโ๐ซ Teaching monitoring with Prometheus: training.promlabs.com
Chief Scientist at Inverted AI
Bahvioral modelling for autonomous driving
PhD physicist turned AI researcher
Statistician, Associate Professor (Lektor) at University of Gothenburg and Chalmers; inference and conditional distributions for anything
https://mschauer.github.io
http://orcid.org/0000-0003-3310-7915
[หmoห/r/ษชts หสaสฬฏษ]
PhD student @ UW, research @ Ai2
PhD candidate @LSE Philosophy.
Thinking about minds and ghosts.
Working on AI, cognition, consciousness, animal intelligences and sentience, sci epistemology, tech-related ethics.
https://www.dzakharova.com
Multi-Agent Researcher at CAIF | applied research at IQT | Thinking about making MA systems go well
He / Him. Founder / Technologist. Empathy / Kindness โค๏ธ. ๐ฎ๐น๐ฌ๐ง๐จ๐พ
Writing about AI agents at AgentsDecoded.com
PhD student @Berkeley_AI
reinforcement learning, AI, robotics
Bioinformatics Scientist / Next Generation Sequencing, Single Cell and Spatial Biology, Next Generation Proteomics, Liquid Biopsy, SynBio, Compute Acceleration in biotech // http://albertvilella.substack.com
Music, audio, and deep learning research at Stability AI ~ Building bridges between audio signal processing wisdom and deep learning.
artintech.substack.com
www.jordipons.me
I love James Harden & I write about abortioneveryday.com / free Palestine
kyliewrites.net
Researcher & faculty member @DPKM dedicated to the field of AI, with the focus on knowledge technologies (knowledge graphs, semweb, RAG) & their use in e-gov, skills matching, research ecosystem, digital humanities and education. Partner @km-a.bsky.social.
Bot by @lemonodor.bsky.social that posts about aircraft flying in circles over Los Angeles.
Every circle tells a story.
ElonJet | Tracking Elon Muskโs Jet by @jacks.grndcntrl.net
Also follow @spacexjets.grndcntrl.net
Tracking Taylor Swift's Jet N621MM
Tracking Bill & Melinda Gates Jets by @jacks.grndcntrl.net
Tracking Jeff Bezo's Jets