Stephan Rasp raspstephan - Bluesky Statics

Other minor updates:
- Where available, we added 2022 as an eval year in the interactive graphics.
- We added forecast activity as a metric for deterministic models, a simple measure of blurring.
- More regions.

Don't hesitate to file bugs or suggestions as GitHub issues.

end/

13.02.2025 07:38 — 👍 0 🔁 0 💬 0 📌 0
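
Forecast activity, mentioned in the post above, is commonly defined as the standard deviation of forecast anomalies relative to climatology. The post does not give the exact definition used on the website, so this is only a minimal sketch under that assumption, without latitude weighting. The function name and sample values are hypothetical.

```python
import math


def forecast_activity(forecast, climatology):
    """Standard deviation of forecast anomalies (assumed definition).

    Both arguments are flat sequences of values on the same grid points.
    No area or latitude weighting is applied in this sketch.
    """
    if len(forecast) != len(climatology) or not forecast:
        raise ValueError("inputs must be non-empty and of equal length")
    anomalies = [f - c for f, c in zip(forecast, climatology)]
    mean = sum(anomalies) / len(anomalies)
    variance = sum((a - mean) ** 2 for a in anomalies) / len(anomalies)
    return math.sqrt(variance)


# Hypothetical example: a blurred forecast has damped anomalies,
# so its activity is lower than that of a sharp forecast.
climatology = [0.0, 0.0, 0.0, 0.0]
sharp = [2.0, -2.0, 2.0, -2.0]
blurred = [1.0, -1.0, 1.0, -1.0]

sharp_activity = forecast_activity(sharp, climatology)
blurred_activity = forecast_activity(blurred, climatology)
print(sharp_activity, blurred_activity)
```

A lower activity than the verifying analysis suggests the forecast fields are smoother than reality, which is why it can serve as a simple blurring indicator.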

Next, we added 4 new models to the public benchmark (which now also uses WB-X as a backend):
- GenCast
- Stormer
- Excarta (HEAL-ViT)
- ArchesWeather

The probabilistic scorecard finally looks a little more populated :)

4/

13.02.2025 07:38 — 👍 1 🔁 0 💬 1 📌 0

WeatherBench-X documentation

To get started, check out the documentation: weatherbench-x.readthedocs.io/en/latest/

For an example of evaluating forecasts against sparse obs, see: weatherbench-x.readthedocs.io/en/latest/ho...

Please don't hesitate to ask questions or report bugs/feature requests via a GitHub issue :)

3/n

13.02.2025 07:38 — 👍 1 🔁 0 💬 1 📌 0

WB-X is a complete rewrite of our evaluation code. We designed it to be as modular and powerful as possible with cutting-edge use cases like observation-based models in mind. We've used WB-X internally over the last year for most of our model development.

2/n

13.02.2025 07:38 — 👍 0 🔁 0 💬 1 📌 0

GitHub - google-research/weatherbenchX: A modular framework for evaluating weather forecasts

🚨 WeatherBench Update

1. WeatherBench-X, our new evaluation code, is now on GitHub: github.com/google-resea...

2. New models (plus other small updates) on the WeatherBench website: sites.research.google/weatherbench/

1/n

13.02.2025 07:38 — 👍 21 🔁 8 💬 1 📌 0

2025 is here tomorrow, so let's reflect on 2024. Even without the final counts and the new AMS and AGU ML journals, 2024 has eclipsed 10% of all papers and had over 600 papers mentioning neural networks in their abstracts 📈

31.12.2024 17:15 — 👍 6 🔁 2 💬 0 📌 0

Deterministic scores – WeatherBench2

Sure. The y-axis shows the 3-day T850 RMSE relative to ECMWF IFS HRES (so >100% = better). It's a crude attempt at normalizing different evaluations, so don't overinterpret the small differences. This is more about the bigger picture.

23.12.2024 18:51 — 👍 1 🔁 0 💬 1 📌 0
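
The post above does not spell out the normalization formula. One reading that is consistent with ">100% = better" is the reference RMSE divided by the model RMSE, expressed in percent. The sketch below assumes that formula; the function name and the numbers are hypothetical, not values from the tracker.

```python
def relative_skill_percent(model_rmse: float, reference_rmse: float) -> float:
    """Reference RMSE over model RMSE, in percent (assumed normalization).

    Values above 100 mean the model has a lower RMSE than the reference.
    """
    if model_rmse <= 0 or reference_rmse <= 0:
        raise ValueError("RMSE values must be positive")
    return 100.0 * reference_rmse / model_rmse


# Hypothetical T850 RMSE values in kelvin, for illustration only.
hres_rmse = 1.25
better_model = relative_skill_percent(model_rmse=1.0, reference_rmse=hres_rmse)
worse_model = relative_skill_percent(model_rmse=2.5, reference_rmse=hres_rmse)
print(better_model, worse_model)
```

Because the ratio hides differences in evaluation setup (resolution, ground truth, years), it is best read as a rough trend indicator, in line with the caveat in the post.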

So, for AIFS and GenCast I am evaluating the ensemble mean. I still use deterministic HRES as a reference. For AIFS I grabbed the NH HRES scores from the scorecard on the ECMWF website and then eyeballed the AIFS score from Fig 9.

23.12.2024 18:37 — 👍 1 🔁 0 💬 0 📌 0

AI-Weather SotA vs Time
The purpose of this spreadsheet is not to exactly compare different models but rather to get an overall sense of progress in AI-based weather prediction.

Good idea, done: Rasp, Stephan (2024). AI-Weather SotA vs Time. figshare. Dataset. doi.org/10.6084/m9.f...

23.12.2024 18:14 — 👍 3 🔁 1 💬 0 📌 1

But you do raise a good point. For purely obs-trained models, this probably isn't a fair comparison. In this case the conclusions are probably the same, but still.

23.12.2024 16:58 — 👍 1 🔁 0 💬 1 📌 0

True, but in the medium range the obs uncertainty is probably smaller than the forecast uncertainty, right? Radiosonde vs ERA5 RMSE ~1 K, right?

23.12.2024 16:56 — 👍 1 🔁 0 💬 1 📌 0

What is the conclusion from GraphDOP being so far away from SotA? Is the setup still suboptimal in some way, or is pure obs-based forecasting harder than some might have thought?

23.12.2024 16:46 — 👍 1 🔁 0 💬 1 📌 0

ECMWF with two new papers right before Christmas.

AIFS-CRPS: arxiv.org/abs/2412.158...
GraphDOP (the first truly end2end global weather model): arxiv.org/abs/2412.15687

Here they are added to the SotA tracker: docs.google.com/spreadsheets...

23.12.2024 16:46 — 👍 21 🔁 6 💬 4 📌 2

Neural general circulation models optimized to predict satellite-based precipitation observations
Climate models struggle to accurately simulate precipitation, particularly extremes and the diurnal cycle. Here, we present a hybrid model that is trained directly on satellite-based precipitation obs...

Can incorporating AI improve precipitation in global weather and climate models?

Yes! In the latest NeuralGCM paper, we show that training on satellite-based precipitation results in significant improvements over traditional atmospheric models:
arxiv.org/abs/2412.11973

19.12.2024 20:34 — 👍 34 🔁 5 💬 1 📌 0

🌎

20.11.2024 18:10 — 👍 1 🔁 1 💬 0 📌 0

👋

19.11.2024 07:03 — 👍 9 🔁 1 💬 2 📌 0

Posts by Stephan Rasp (@raspstephan.bsky.social)