ECMWF Newsletter #179

EUPPBench: A forecast dataset to benchmark statistical postprocessing methods

Cristina Primo (Deutscher Wetterdienst)
Zied Ben Bouallègue (ECMWF)
Jonas Bhend (MeteoSwiss)
Sebastian Lerch (Karlsruhe Institute of Technology)
Bert Van Schaeybroeck (Royal Meteorological Institute of Belgium)
Jonathan Demaeyer (Royal Meteorological Institute of Belgium and EUMETNET)

 

Over the last few years, several high-quality, open-access datasets have been published and made available online. An example is the WeatherBench initiative for data-driven medium-range weather forecasting. Such a benchmark dataset speeds up progress in a field of research by enabling a fair comparison of different methods developed for the same goal. Benchmark datasets are particularly valuable when a task requires large amounts of data, so that the effort of building a dataset once can benefit a whole community. Statistical postprocessing, understood here as the correction of systematic errors in medium-range weather forecasts, might also benefit from a benchmark dataset, since it is by definition a data-intensive task: based on historical forecasts and observations, one aims to learn from past errors to improve future forecasts. Therefore, as part of the activities of the European National Meteorological Services Network (EUMETNET), the module on statistical postprocessing of weather forecasts, EUMETNET PP, has developed a benchmark dataset for statistical postprocessing named EUPPBench. The aim is to facilitate the comparison of rapidly evolving postprocessing techniques, especially new machine learning (ML) methods.

Figure: First benchmark experiment. Example forecast evolution over seven days in December 2017, illustrating the extent of the EUPPBench dataset and of the first benchmark experiment (indicated by the box and the points in the left panel), along with example forecast time series (colours) and verifying observations (dashed lines) from the first benchmark experiment for Koksijde (top panels) and Säntis (bottom panels).

Contribution from ECMWF and NMHSs

EUPPBench is based on ECMWF Integrated Forecasting System (IFS) forecasts for the years 2017–2018 and corresponding reforecasts for 1997–2018, over Western Europe. Time-aligned observations from national meteorological and hydrological services (NMHSs) are available on a grid and at station points. The observations cover 22 years in total: 20 years matching the reforecasts and two years matching the forecasts used for testing and validation. More than 40 meteorological variables are included in the benchmark dataset, allowing users to experiment with different targets and predictors as input for the postprocessing methods. The dataset includes not only variables at the surface and on pressure levels, but also more sophisticated products, such as the Extreme Forecast Index (EFI). The data are stored in Zarr format on ECMWF’s and EUMETSAT’s European Weather Cloud (EWC), and a dedicated climetlab plugin was developed for easy access to the data, as sketched below.
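For illustration, the sketch below shows how a user might retrieve part of the dataset with climetlab. The dataset identifier and keyword arguments are indicative assumptions only; the exact names are documented with the plugin referenced in the EUPPBench publication.

```python
# Minimal sketch of accessing EUPPBench data via the climetlab plugin.
# The dataset name and the date/parameter/kind arguments are illustrative
# assumptions; consult the plugin documentation for the exact identifiers.
import climetlab as cml

ds = cml.load_dataset(
    "eumetnet-postprocessing-benchmark-training-data-gridded-forecasts-surface",
    date="2017-12-02",   # forecast initialisation date (assumed format)
    parameter="2t",      # 2 m temperature
    kind="ensemble",     # ensemble forecasts, as opposed to reforecasts
)

# climetlab datasets can be converted to xarray for further processing
fields = ds.to_xarray()
print(fields)
```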

Results

Can we improve 2 m temperature ensemble forecasts at station locations with postprocessing? This simple challenge served as a basis to demonstrate how EUPPBench can be used for research. Different state-of-the-art methods for ensemble postprocessing were tested, and their performance was compared using a set of verification metrics. This first comparative study was simple in the sense that the same variable served both as the sole predictor and as the target of the postprocessing methods: the goal was to predict 2 m temperature using only 2 m temperature forecasts (and metadata) as predictors.
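To make the setting concrete, here is a minimal sketch of a very simple baseline of this kind: a linear correction of the ensemble mean, fitted on past forecast/observation pairs and applied member by member. It is not one of the benchmark methods, and the data below are synthetic placeholders.

```python
# A minimal baseline for a 2 m temperature task of this kind (illustrative only):
# fit a linear correction of the ensemble mean on training pairs, then shift
# every ensemble member so that the corrected ensemble mean follows the fit.
import numpy as np

def fit_linear_correction(ens_train, obs_train):
    """Least-squares fit of obs ~ a + b * ensemble_mean."""
    x = ens_train.mean(axis=1)                       # ensemble mean per case
    A = np.column_stack([np.ones_like(x), x])
    coeffs, *_ = np.linalg.lstsq(A, obs_train, rcond=None)
    return coeffs                                    # (a, b)

def apply_member_by_member(ens, coeffs):
    """Correct the ensemble mean while preserving the raw ensemble spread."""
    a, b = coeffs
    mean = ens.mean(axis=1, keepdims=True)
    return ens - mean + (a + b * mean)

# Synthetic placeholder data: 1000 training cases with 11 reforecast members,
# 50 test cases with 51 operational ensemble members.
rng = np.random.default_rng(0)
ens_train = rng.normal(2.0, 3.0, size=(1000, 11))
obs_train = 0.9 * ens_train.mean(axis=1) - 1.0 + rng.normal(0.0, 1.0, size=1000)
coeffs = fit_linear_correction(ens_train, obs_train)
ens_test = rng.normal(2.0, 3.0, size=(50, 51))
print(apply_member_by_member(ens_test, coeffs).shape)
```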

We learnt a number of lessons while generating and assessing this first benchmark challenge. Of course, we learnt about the methods themselves and their performance for postprocessing of ensemble forecasts at station locations. But we also gained insight into the whole process of building a benchmark dataset and how to approach this complicated exercise. It is worth mentioning that this benchmarking activity also provided a collaborative platform for discussions and exchange of ideas within the postprocessing community, spanning national meteorological services and research groups at universities.

The results of the inter-comparison exercise and the lessons learnt are published in Earth System Science Data (Demaeyer et al., 2023, https://doi.org/10.5194/essd-15-2635-2023). The data are available on the EWC, and the source code for generating postprocessed ensemble forecasts of 2 m temperature with a variety of methods is available through a GitHub repository, as indicated in the published article. Anyone developing a new postprocessing approach can now compare their results with state-of-the-art methods using the EUPPBench framework.
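For such comparisons, ensemble forecasts are typically evaluated with proper scores such as the continuous ranked probability score (CRPS). The snippet below is a minimal numpy implementation of the standard ensemble CRPS estimator, not the verification code used in the benchmark, and the data are again synthetic placeholders.

```python
# Ensemble CRPS estimator: CRPS = E|X - y| - 0.5 * E|X - X'|,
# averaged over all forecast cases (lower is better).
import numpy as np

def crps_ensemble(ens, obs):
    """ens: (n_cases, n_members), obs: (n_cases,); returns the mean CRPS."""
    term1 = np.abs(ens - obs[:, None]).mean(axis=1)
    term2 = 0.5 * np.abs(ens[:, :, None] - ens[:, None, :]).mean(axis=(1, 2))
    return float((term1 - term2).mean())

# Toy check: an unbiased, well-dispersed ensemble scores lower than a biased one.
rng = np.random.default_rng(1)
obs = rng.normal(0.0, 1.0, size=200)
raw = obs[:, None] + rng.normal(1.0, 2.0, size=(200, 51))          # biased, too wide
corrected = obs[:, None] + rng.normal(0.0, 1.0, size=(200, 51))    # unbiased
print(crps_ensemble(raw, obs), crps_ensemble(corrected, obs))
```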

Prospects

No doubt the lessons learnt during the first phase will prove valuable for the second phase of the EUMETNET PP module starting this year. The new ambition is to create a general benchmark for statistical postprocessing of precipitation ensemble forecasts, using high-resolution observations obtained from a collaboration with the European project RODEO. During this phase, we will thus seek to exploit all the information available in different predictors, and we will particularly encourage the testing of ML methods.