And for diffusion specifically, there are several reference models implemented in github.com/AI-Hypercomp..., in case you hadn't come across it already and it suits some of your needs.
06.02.2025 07:04 — 👍 0 🔁 0 💬 1 📌 0
are useful beyond that. We have good input here already, but DM or email me if you'd ever like to talk with the team some more. Either way we appreciate it!
06.02.2025 06:07 — 👍 0 🔁 0 💬 1 📌 0
Indeed it's great to hear from you! And thanks for all of this detail. I've shared it with team members who've started working on models. You'll see more on the transformers side at first since that's already underway (and e.g. relates to the book upthread) but your points on diffusion and GNNs ...
06.02.2025 06:07 — 👍 0 🔁 0 💬 1 📌 0
@nmboffi.bsky.social – We have some plans to improve that this year. As examples, do you have any models in particular that you'd really like to see? Does training, tuning, inference, or anything else matter most to you? What hardware?
05.02.2025 06:38 — 👍 4 🔁 0 💬 1 📌 0
Training our most capable Gemini models relies heavily on our JAX software stack + Google's TPU hardware platforms.
If you want to learn more, see this awesome book "How to Scale Your Model":
jax-ml.github.io/scaling-book/
Put together by several of my Google DeepMind colleagues listed below 🎉.
04.02.2025 19:51 — 👍 77 🔁 12 💬 2 📌 1
Our online book on systems principles of LLM scaling is live at jax-ml.github.io/scaling-book/
We hope that it helps you make the most of your computing resources. Enjoy!
04.02.2025 18:59 — 👍 34 🔁 9 💬 3 📌 1
Safe and robust AI/ML, computational sustainability. Former President AAAI and IMLS. Distinguished Professor Emeritus, Oregon State University. https://web.engr.oregonstate.edu/~tgd/
Author of PyTorch, Research Scientist at Google DeepMind. Currently working on Pallas, Mosaic and dex-lang. MIMUW CS & Math graduate.
Senior Research Director at Google DeepMind in our San Francisco office. I created Magenta (magenta.withgoogle.com) and sometimes find time to be a musician.
Algorithms for Toddlers (https://youtu.be/nnLOi3ia210) | Algorithms for Teenagers (https://tinyurl.com/2cnp39cf) | Algorithms for Grown Ups (http://dblp.org/pid/11/10308)
Dimensionality Diabolist, Seeker of Optima
The world's leading venue for collaborative research in theoretical computer science. Follow us at http://YouTube.com/SimonsInstitute.
Mathematician at UCLA. My primary social media account is https://mathstodon.xyz/@tao . I also have a blog at https://terrytao.wordpress.com/ and a home page at https://www.math.ucla.edu/~tao/
Welcome to the Official Bluesky account for Caltrain. We provide news, info & answers. Follow @caltrainalerts.bsky.social for service delays. 🚇
Building generative models for high-dimensional science and engineering.
Assistant prof. @CarnegieMellon & affiliated faculty @mldcmu, previously instructor @NYU_Courant, PhD jointly @Harvard and @MIT
https://nmboffi.github.io
Researcher at Google DeepMind. I make LLMs go fast. I also play piano and climb sometimes. Opinions my own
Google Chief Scientist, Gemini Lead. Opinions stated here are my own, not those of Google. Gemini, TensorFlow, MapReduce, Bigtable, Spanner, ML things, ...
I'm a PhD student at MIT CSAIL.
More about me: https://cs.stanford.edu/~kach
Owner, QVR. Derivatives and volatility. @benegotherit.bsky.social 's ex financial advisor (fired and replaced by @kchoudhu.anserinae.net)
dottore, ingegnere, avvocato
When the Clock Broke: Con Men, Conspiracists, and How America Cracked Up in the Early 1990s — out now from
@fsgbooks
https://www.unpopularfront.news/
I develop tough benchmarks for LMs and then I build agents to try and beat those benchmarks. Postdoc @ Princeton University.
https://ofir.io/about
Code, AI, and 3D printing. Opinions are my own, not my computer's...for now. Co-creator of DALL-E 2. Researcher @openai.
This is the official account for BART. We provide train service throughout the San Francisco Bay Area and connect people to places they love.
For automated service updates go to @alerts.bart.gov
Visit us at bart.gov 🚇💙