Vimal Thilak

@aggieinca.bsky.social

ML Engineer-ist @ Apple Machine Learning Research

17 Followers  |  28 Following  |  5 Posts  |  Joined: 28.01.2025

Posts by Vimal Thilak (@aggieinca.bsky.social)

Today we have released the code and a demo iOS application for FastVLM - our extremely efficient and fast vision language model which runs on your device using MLX! You can check out the code and the app here: github.com/apple/ml-fas...

07.05.2025 22:20 | 👍 4    🔁 3    💬 1    📌 0

#ICLR #TrainBetterLM I am at ICLR, come to our posters for improved language model training!

Recycle gradients for faster neural net training with AdEMAMix iclr.cc/virtual/2025... (Fri Apr 25, 10 am).

1/3

21.04.2025 23:54 | 👍 2    🔁 3    💬 1    📌 0

Check out the blog post on activation transport by Pau and his Apple MLR team! Soon to be featured as a spotlight at ICLR :)

11.04.2025 22:48 | 👍 0    🔁 0    💬 0    📌 0

More scaling laws? Mustafa and his team at Apple MLR have you covered, at least when it comes to scaling laws for native multimodal models :)

11.04.2025 22:44 | 👍 0    🔁 0    💬 0    📌 0

Calling all SSL practitioners: check out this library built by the amazing α-ReQ crew

05.04.2025 03:22 | 👍 0    🔁 0    💬 0    📌 0

Paper🧵 (cross-posted at X): When does composition of diffusion models "work"? Intuitively, the reason dog+hat works and dog+horse doesn’t has something to do with independence between the concepts being composed. The tricky part is to formalize exactly what this means. 1/

11.02.2025 05:59 | 👍 39    🔁 15    💬 2    📌 2

Excited to share Soup-of-Experts, a new neural network architecture that, for any given specific task, can instantiate in a flash a small model that is very good on it.

Made with ❤️ at Apple

Thanks to my co-authors David Grangier, Angelos Katharopoulos, and Skyler Seto!

arxiv.org/abs/2502.01804

05.02.2025 09:32 | 👍 12    🔁 4    💬 0    📌 0

🚨 Apple Machine Learning Research Internship opportunity! My colleagues in Apple MLR are looking for a PhD research intern with a strong interest in reinforcement learning/post-training for LLMs. If interested, apply by sending an email to Etai Littwin (elittwin at apple dot com)

07.02.2025 23:41 | 👍 3    🔁 1    💬 0    📌 1

Mixture of experts is an interesting architecture, or so @samiraabnar.bsky.social told me when I joined the project last year. After some brilliant work from @harshay-shah.bsky.social and @samiraabnar.bsky.social, we have a scaling law paper!

28.01.2025 18:49 | 👍 0    🔁 0    💬 0    📌 0