Roy Frostig

@froystig.bsky.social

research scientist at google deepmind. co-author of JAX (https://github.com/jax-ml/jax). https://cs.stanford.edu/~rfrostig

715 Followers | 135 Following | 5 Posts | Joined: 25.11.2024

Posts by Roy Frostig (@froystig.bsky.social)

And for diffusion specifically, there are several reference models implemented in github.com/AI-Hypercomp..., in case you hadn't come across it already and it suits some of your needs.

06.02.2025 07:04 — 👍 0 🔁 0 💬 1 📌 0

are useful beyond that. We have good input here already, but DM or email me if you'd ever like to talk with the team some more. Either way we appreciate it!

06.02.2025 06:07 — 👍 0 🔁 0 💬 1 📌 0

Indeed it's great to hear from you! And thanks for all of this detail. I've shared it with team members who've started working on models. You'll see more on the transformers side at first since that's already underway (and e.g. relates to the book upthread) but your points on diffusion and GNNs ...

06.02.2025 06:07 — 👍 0 🔁 0 💬 1 📌 0

@nmboffi.bsky.social – We have some plans to improve that this year. As examples, do you have any models in particular that you'd really like to see? Does training, tuning, inference, or anything else matter most to you? What hardware?

05.02.2025 06:38 — 👍 4 🔁 0 💬 1 📌 0

Training our most capable Gemini models relies heavily on our JAX software stack + Google's TPU hardware platforms.

If you want to learn more, see this awesome book "How to Scale Your Model":

jax-ml.github.io/scaling-book/

Put together by several of my Google DeepMind colleagues listed below 🎉.

04.02.2025 19:51 — 👍 76 🔁 12 💬 2 📌 1

Our online book on systems principles of LLM scaling is live at jax-ml.github.io/scaling-book/

We hope that it helps you make the most of your computing resources. Enjoy!

04.02.2025 18:59 — 👍 35 🔁 9 💬 3 📌 1