Stephan Hoyer

@stephanhoyer.com.bsky.social

Building AI climate models at Google. I also contribute to the scientific Python ecosystem, including Xarray, NumPy and JAX. Opinions are my own, not my employer's.

1,703 Followers  |  466 Following  |  33 Posts  |  Joined: 11.07.2023  |  1.747

Latest posts by stephanhoyer.com on Bluesky

Do you take it yourself?

13.05.2025 15:43 | 👍 3    🔁 0    💬 0    📌 0

I think the problem is the algorithm. BlueSky's lack of a recommendation engine means that if you're not posting all the time, your stuff doesn't get seen.

06.05.2025 15:07 | 👍 0    🔁 0    💬 0    📌 0
Why PyTorch is an amazing place to work... and Why I'm Joining Thinking Machines
In which I convince you to join either PyTorch or Thinking Machines!

The "ungamable impact" of OSS really resonates with me:
www.thonking.ai/i/158277004/...

Sadly, it does not necessarily align with what makes for a successful career in Big Tech. But it's worth trying anyway! :)

04.03.2025 23:22 | 👍 8    🔁 0    💬 0    📌 0

I think it's just about readability with small font, the same reason why printed newspapers use many columns.

02.02.2025 20:17 | 👍 3    🔁 0    💬 1    📌 0

The losses here should be marked as millions not billions, right?

27.01.2025 17:45 | 👍 2    🔁 0    💬 1    📌 0

Pretty much anything that you can write in high-level array code like NumPy is very fast in JAX. Only intrinsically very loopy code is (relatively) slow, but JAX has excellent support for writing custom kernels in lower-level languages.

23.01.2025 06:11 | 👍 2    🔁 0    💬 0    📌 0

AD-compatible Python is at the cutting edge of performance these days, given its central role in large-scale AI training.

In my experience (mostly geophysical fluid dynamics) JAX has comparable perf to modern Fortran on CPUs, with a much easier path to GPUs and multi-device code.

23.01.2025 01:07 | 👍 2    🔁 0    💬 1    📌 0

Those are tiny chunks! Does that reduce max throughput for analytics use-cases compared to larger chunks?

10.01.2025 21:44 | 👍 3    🔁 0    💬 1    📌 0

Such exciting news!

For anyone who has tried the new sharding feature -- do you have any guidance on optimal shard sizes, if I want more flexibility in access patterns but still optimal throughput?

10.01.2025 03:17 | 👍 5    🔁 0    💬 1    📌 0

Is there a link between #ClimateChange & increasing risk/severity of #wildfire in California--including the still-unfolding disaster? Yes. Is climate change the only factor at play? No, of course not. So what's really going on? [Thread] #CAfire #CAwx #LAfires iopscience.iop.org/a...

09.01.2025 22:05 | 👍 797    🔁 367    💬 30    📌 73

This is a huge milestone for cloud-native big scientific data!

09.01.2025 23:55 | 👍 23    🔁 4    💬 0    📌 0
The Risky Business of Predicting Where Climate Disaster Will Hit
Climate tech companies can calculate the chances that a flood or wildfire will ravage your home. But what if their odds are all different?

Hi, thanks for the mention. Here's a 7-day paywall-free link to the main feature: www.bloomberg.com/graphics/202...

30.12.2024 17:27 | 👍 11    🔁 2    💬 1    📌 1
Correcting Weather and Climate Models by Machine Learning Nudged Historical Simulations
Nudging an atmospheric model toward observations is a good way to estimate state-dependent biases. Machine learning of state-dependent biases improves hindcast skill of a coarse-resolution general...

This paper by Watt-Meyer et al is a good example of "Error-based learning:" agupubs.onlinelibrary.wiley.com/doi/10.1029/...

ECMWF has also done similar work on top of IFS's data assimilation system.

28.12.2024 20:35 | 👍 4    🔁 0    💬 1    📌 0
¡AI Caramba!
Rapid progress in the use of machine learning for weather and climate models is evident almost everywhere, but can we distinguish between real advances and vaporware? First off, let's define some...

Some thoughts on the use of AI/ML in climate modeling...

@realclimate.org

¡AI Caramba! www.realclimate.org/index.php/ar...

28.12.2024 19:36 | 👍 86    🔁 31    💬 6    📌 3
WeatherBench 2 Data Guide - WeatherBench 2 documentation

We have a few pre-computed climatologies in WeatherBench2: weatherbench2.readthedocs.io/en/latest/da...

27.12.2024 01:39 | 👍 3    🔁 0    💬 1    📌 0
NeuralGCM update: new models, new license, new datasets

We have a few other updates to share as well, which can be found in the inaugural edition of the NeuralGCM newsletter:
groups.google.com/g/neuralgcm-...

The biggest one is that NeuralGCM models are now freely available for everyone to use, including for commercial purposes!

19.12.2024 20:34 | 👍 4    🔁 1    💬 0    📌 0
Neural general circulation models optimized to predict satellite-based precipitation observations
Climate models struggle to accurately simulate precipitation, particularly extremes and the diurnal cycle. Here, we present a hybrid model that is trained directly on satellite-based precipitation obs...

Can incorporating AI improve precipitation in global weather and climate models?

Yes! In the latest NeuralGCM paper, we show that training on satellite-based precipitation results in significant improvements over traditional atmospheric models:
arxiv.org/abs/2412.11973

19.12.2024 20:34 | 👍 34    🔁 5    💬 1    📌 0
Simplifying analysis of hierarchical HDF5 and NetCDF4 files with xarray-datatree
NASA's Earth Observing System Data and Information System (EOSDIS) contains tho...

Please reach out if you want to chat about anything related to AI modeling, NeuralGCM, JAX or Xarray. Also see Eni's poster on xarray.DataTree on Thurs: agu.confex.com/agu/agu24/me...

09.12.2024 17:47 | 👍 4    🔁 1    💬 0    📌 0

Interested in AI weather/climate modeling at #AGU24?

I'll be giving an overview talk on NeuralGCM at 11:30am Wed at the Google booth, and a talk on modeling precipitation with NeuralGCM at 4:25pm Wed in session A34A.

09.12.2024 17:42 | 👍 35    🔁 8    💬 1    📌 1

When I hear "ML" I tend to think of old school (i.e., scikit-learn) machine learning, which is great but much less powerful than deep learning. So I would opt for "AI weather models" though that misses quite a bit of nuance.

07.12.2024 19:01 | 👍 5    🔁 0    💬 1    📌 0

This diagram is accurate historically, but recently AI seems to have become synonymous with deep learning.

07.12.2024 18:57 | 👍 2    🔁 0    💬 1    📌 0

The bottleneck for traditional models is data movement within the CPU, not data transfer to disk -- physics-based simulations do too little compute per byte (low arithmetic intensity) to fully utilize modern hardware.

AI is way better in this respect. It's easy to use lots of FLOPs on big matmuls!

07.12.2024 07:22 | 👍 2    🔁 0    💬 1    📌 0
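
A rough back-of-the-envelope sketch of the arithmetic-intensity point in the post above. Nothing here comes from the original thread: the function names, the float32 element size, the 5-FLOP stencil cost and the ideal-caching traffic model are all illustrative assumptions, so treat the numbers as order-of-magnitude only.

```python
def matmul_intensity(n: int, bytes_per_element: int = 4) -> float:
    """FLOPs per byte for multiplying two n x n matrices (hypothetical helper).

    Assumes each of A, B and C crosses the memory bus exactly once.
    """
    flops = 2 * n**3                         # n^3 multiply-add pairs
    traffic = 3 * n**2 * bytes_per_element   # read A, read B, write C
    return flops / traffic


def stencil_intensity(flops_per_point: int = 5, bytes_per_element: int = 4) -> float:
    """FLOPs per byte for one sweep of a 5-point stencil (hypothetical helper).

    Assumes ideal caching: each grid value is read once and written once.
    """
    traffic = 2 * bytes_per_element          # one read + one write per point
    return flops_per_point / traffic


for n in (256, 4096):
    print(f"matmul n={n}: {matmul_intensity(n):.1f} FLOP/byte")
print(f"5-point stencil: {stencil_intensity():.3f} FLOP/byte")
```

Under these assumptions the matmul intensity grows linearly with n (it works out to n/6 for float32), while the stencil stays below 1 FLOP/byte regardless of grid size, which is the gap the post is describing.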

Unlimited potential, zero bugs!

01.12.2024 13:09 | 👍 2    🔁 0    💬 0    📌 0

There's nothing like the feeling of starting a codebase from scratch.

01.12.2024 01:32 | 👍 34    🔁 1    💬 2    📌 0

To my knowledge, there are no limits on accessing ARCO-ERA5 or other public datasets stored in Google Cloud Storage. You don't even have to be logged in!

28.11.2024 17:17 | 👍 4    🔁 0    💬 0    📌 0

One of my favourite data discoveries this year: Google's mind-blowing ARCO-ERA5 dataset: hourly data for ~300 climate variables, available globally from 1940! 🤯

Loadable with a single line of Python code from a single cloud-friendly Zarr file! Below: a month of wind waves + swell: 🌊

27.11.2024 04:02 | 👍 694    🔁 158    💬 34    📌 15

Fast JAX simulations of all the PDEs--whee!

27.11.2024 06:59 | 👍 10    🔁 1    💬 0    📌 0

Let me know if you're headed to AGU :)

26.11.2024 05:59 | 👍 0    🔁 0    💬 1    📌 0

Any recommendations for those of us building neural PDE models?

26.11.2024 04:00 | 👍 1    🔁 0    💬 1    📌 0

@stephanhoyer.com is following 20 prominent accounts