## Machine Learning: complement or replacement of Numerical Weather Prediction?

Humans have tried for millennia to predict the weather by finding physical relationships between observed weather events, a notable example being a fall in barometric pressure as an indicator of upcoming precipitation. It should come as no surprise that one of the first weather-measuring instruments to be invented was the barometer, by Torricelli (see Fig. 1 for a replica of the first Torricelli barometer), nearly concurrently with a reliable thermometer. About two hundred years later, the development of the electric telegraph allowed a nearly instant exchange of weather data, leading to the creation of the first synoptic weather maps in the US, followed by Europe. Synoptic maps allowed amateur and professional meteorologists to look at patterns in weather data in a way that was unprecedentedly effective for the time, allowing the American meteorologists Redfield and Espy to resolve the dispute on which way the air flows in a hurricane (anticlockwise in the Northern Hemisphere).

By the beginning of the 20th century, many countries around the globe had started to exchange data daily (thanks to the recently laid telegraph cables), leading to the creation of global synoptic maps, with information on the upper atmosphere provided by radiosondes, aeroplanes and, from the 1930s, radars. By then, weather forecasters had developed a large set of empirical and statistical rules for computing the changes to daily synoptic weather maps, based on patterns between historical sets of daily synoptic maps and recorded meteorological events. Even so, predicting events days in advance often remained challenging.

In 1954, a powerful tool became available to objectively compute changes on the synoptic map over time: Numerical Weather Prediction (NWP) models. NWPs numerically solve the primitive equations, a set of nonlinear partial differential equations that approximate the global atmospheric flow, using as initial conditions a snapshot of the state of the atmosphere, termed the analysis, derived from a variety of weather observations. In the 1960s, the launch of the first satellites made global NWP forecasts with 5 to 7 days of lead time possible. Thanks to the work of countless scientists over the past 40 years, global NWP models, running at a grid scale of about 10 km, can now skilfully and reliably simulate synoptic-scale and mesoscale weather patterns, such as high-pressure systems and midlatitude cyclones, with up to 10 days of lead time [3].

The relatively recent adoption of limited-area convection-permitting models (Fig. 2) has made possible even the forecast of local details of weather events. For example, convection-permitting forecasts of midlatitude cyclones can accurately predict small-scale multiple slantwise circulations, the 3-D structure of convection lines, and the peak cyclone surface wind speed [4].

However, physical processes below convection-permitting resolution, such as wind gusts, which present a risk to lives and livelihoods, cannot be explicitly resolved, but can be derived from prognostic fields such as wind speed and pressure. Alternative techniques, such as statistical modelling (Malone model), have not yet come close to matching the power of numerical solvers of physical equations in simulating the dynamics of the atmosphere in space and time.

NWPs are not without flaws, as they are affected by numerical drawbacks: errors in the prognostic atmospheric fields build up through time, as shown in Fig. 3, eventually reaching a forecast error comparable to that of a persistence forecast (the forecast is held constant at its initial value at every time step) and of a climatology-based forecast (the mean of a historical series of observations or model outputs). Errors build up because NWPs iteratively solve the primitive equations approximating the atmospheric flow (either by finite differences or by spectral methods). The sources of these errors are a model resolution that is too coarse (which leads to an incorrect representation of topography), long integration time steps, and small-scale or sub-grid processes that are not resolved by the model physics approximations. Errors in the parametrisations of small-scale physical processes grow over time, leading to a significant deterioration of forecast quality after 48 h. Therefore, high-fidelity parametrisations of unresolved physical processes are critical for an accurate simulation of all types of weather events.
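To make the two baseline forecasts mentioned above concrete, here is a minimal sketch in Python. The function names and the toy numbers are illustrative assumptions, not taken from any operational system: a persistence forecast repeats the last observed value, a climatology forecast repeats the historical mean, and both are scored with the mean absolute error.

```python
from statistics import mean

def persistence_forecast(last_observation, n_steps):
    """Hold the most recent observation constant for every lead time."""
    return [last_observation] * n_steps

def climatology_forecast(history, n_steps):
    """Use the mean of the historical series for every lead time."""
    return [mean(history)] * n_steps

def mean_absolute_error(forecast, observed):
    """Average size of the forecast-minus-observation differences."""
    return mean(abs(f - o) for f, o in zip(forecast, observed))

# Toy temperature series (made-up values, in degrees Celsius).
history = [10.0, 12.0, 14.0, 16.0]
observed = [15.0, 14.0, 13.0]  # what actually happened afterwards

persist = persistence_forecast(history[-1], len(observed))
clim = climatology_forecast(history, len(observed))

persist_mae = mean_absolute_error(persist, observed)
clim_mae = mean_absolute_error(clim, observed)
print(persist_mae, clim_mae)
```

A forecast model is only useful at lead times where its error stays below both of these baselines.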

Another limitation of NWPs is the difficulty in simulating the chaotic nature of weather, which causes errors in the model initial conditions and in the model physics approximations to grow exponentially over time. All these limitations, combined with the instability of the atmosphere at the lower and upper bounds, make rapidly developing events such as flash floods particularly challenging to predict. A further weakness of NWP forecasts is that they rely on expensive High Performance Computing (HPC) facilities (Fig. 4), owned by a handful of industrialised nations, which run coarse-scale global models and high-resolution convection-permitting forecasts on domains covering their areas of national interest. As a result, high-resolution prediction of weather hazards and climatological analysis remain off-limits for the vast majority of developing countries, with detrimental effects not just on the first-line response to weather hazards, but also on the development of economic activities such as agriculture, fishing and renewable energy in a warming climate. In the last decade, the cloud computing revolution has led to a tremendous increase in the availability and shareability of weather data sets, which have moved from local storage and processing to network-based services managed by large cloud computing companies, such as Amazon, Microsoft or Google, through their distributed infrastructure.

Combined with the wide availability of these cloud computing facilities, access to weather data has become more and more democratic and ubiquitous, and steadily less dependent on HPC facilities owned by national agencies. This transformation is not without drawbacks, should these tech giants decide to turn off the flow of data. During a row with the Australian government in February 2021, Facebook blocked access to Australian news content. Government-related agencies such as the Bureau of Meteorology were also blocked, albeit by accident, leaving citizens with restricted access to important weather information until the pages were restored. It is hoped that, with more companies providing distributed infrastructure, access to data that is vital for citizen security will become more resilient.

The rapidly growing accessibility of weather data sets has stimulated the development and application of novel machine learning algorithms. As a result, weather scientists worldwide can process multi-dimensional weather data increasingly effectively, ultimately providing a powerful new paradigm for understanding and predicting the atmospheric flow, based on finding relationships within the available large-scale weather datasets.

Machine learning (ML) finds meaningful representations of the patterns in the data through a series of nonlinear transformations of the input data. ML pattern recognition is divided into two types: supervised and unsupervised learning.

Supervised learning is concerned with predicting an output for a given input. It is based on learning the relationship between inputs and outputs, using training data consisting of example input/output pairs, and is divided into regression and classification, depending on whether the output variable to be predicted is continuous or discrete, respectively. Support Vector Machine (SVM) or Regression (SVR), Artificial Neural Network (ANN, with the feed-forward step shown in Fig. 5), and Convolutional Neural Network (CNN) are examples of supervised learning.

Unsupervised learning is the task of finding patterns within data without any ground truth or labelling of the data. A common unsupervised learning task is clustering, which finds groups of data points that are close to one another relative to data points outside the group. A well-known example is the K-means algorithm [7]. The similarly named K-Nearest Neighbour (KNN) algorithm also relies on distances between data points, but it requires labelled examples and is therefore a supervised method.

So far, ML algorithms have been applied to four key problems in weather prediction:

1. Correction of systematic error in NWP outputs, which involves post-processing data to remove biases [8]
1. Assessment of the predictability of NWP outputs, evaluating the uncertainty and confidence scores of ensemble forecasting [9]
1. Extreme detection, involving prediction of severe weather such as hail, gust or cyclones [10]
1. NWP parametrizations, replacing empirical models for radiative transfer or boundary-layer turbulence with ML techniques [11]

The first key problem, the correction of systematic error in NWPs, is the most popular area of application of ML methods in meteorology. In this field, wind speed and precipitation observations are often used to fit a linear regression to the NWP data, with the end goal of enhancing its accuracy and resolving local details of the weather that were unresolved by the NWP forecasts. Although attractive for its simplicity and robustness, linear regression presents two problems: (1) the least-squares methods used to solve linear regression do not scale well with the size of the dataset (since the matrix inversion required by least squares becomes increasingly expensive as the dataset grows), and (2) many relationships between variables of interest are nonlinear. Tree-based classification methods, by contrast, have proven very useful for modelling nonlinear weather events, from thunderstorm and turbulence detection to extreme precipitation events, and for representing the circular nature of the wind. In fact, compared to linear regression, tree-based methods such as random forests scale easily to large datasets with several input variables. ML methods such as ANN and SVM/SVR preserve this scalability to large datasets while also providing a more generic and more powerful alternative for modelling nonlinear processes. These improvements have come at the cost of a more difficult interpretation of the underlying physical concepts that the model can identify, which is critical given that scientists need to couple these ML models with physical-equation-based NWP for variable interdependence. As a matter of fact, it has proven challenging to interpret the physical meaning of the weights and nonlinear activation functions that describe, in the ANN model, the data patterns and relationships found by the model [12].
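As an illustration of the simplest form of this post-processing, the sketch below fits a one-variable linear correction between NWP output and observations by ordinary least squares, then applies it to a new forecast value. The function names and the wind-speed numbers are made up for illustration; with a single predictor the least-squares solution has a closed form, so no matrix inversion is needed here.

```python
def fit_linear_correction(nwp, obs):
    """Least-squares fit of obs ~ slope * nwp + intercept (one predictor)."""
    n = len(nwp)
    mean_x = sum(nwp) / n
    mean_y = sum(obs) / n
    # Covariance and variance terms of the closed-form solution.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(nwp, obs))
    sxx = sum((x - mean_x) ** 2 for x in nwp)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def apply_correction(value, slope, intercept):
    """Map a raw NWP value to its bias-corrected estimate."""
    return slope * value + intercept

# Hypothetical training pairs: NWP wind speed vs. observed wind speed (m/s).
nwp_wind = [2.0, 4.0, 6.0, 8.0]
obs_wind = [2.0, 3.0, 4.0, 5.0]

slope, intercept = fit_linear_correction(nwp_wind, obs_wind)
corrected = apply_correction(10.0, slope, intercept)
print(slope, intercept, corrected)
```

In this toy case the model over-forecasts strong winds, and the fitted line pulls a raw forecast of 10 m/s down accordingly. A nonlinear relationship would not be captured by this fit, which is the second limitation noted above.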

The second key problem, the interpretation of ensemble forecasts, is being addressed by unsupervised ML methods such as clustering, which represents the likelihood of a forecast by aggregating ensemble members by similarity. Examples include grouping daily weather phenomena into synoptic types, defining weather regimes from upper-air flow patterns, and grouping members of forecast ensembles [13].
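A minimal sketch of this idea, assuming each ensemble member is summarised by a single number (for example a forecast rainfall total) and using a plain one-dimensional K-means with hand-picked starting centroids. The member values are invented; real applications cluster full fields and usually rely on a library implementation.

```python
def kmeans_1d(points, centroids, n_iter=10):
    """Plain K-means on scalar values, starting from the given centroids."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: each centroid moves to the mean of its cluster.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical ensemble: one rainfall total (mm) per member.
members = [1.0, 2.0, 3.0, 2.0, 11.0, 13.0]
centroids, clusters = kmeans_1d(members, centroids=[0.0, 5.0])

# The share of members in each cluster is a rough likelihood of that scenario.
fractions = [len(c) / len(members) for c in clusters]
print(centroids, fractions)
```

Here two thirds of the members fall into the "light rain" scenario and one third into the "heavy rain" scenario, which is the kind of summary a forecaster can read off a clustered ensemble.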

The third key problem concerns the prediction of weather extremes, that is, weather phenomena that are a hazard to lives and economic activities. ML-based methods tend to underestimate these events. The problem here lies with imbalanced datasets, since extreme events represent only a very small fraction of the total events observed [14].
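The effect of imbalance can be seen with a toy example. The sketch below uses a made-up dataset in which 5 of 100 cases are extreme events, and a trivial "model" that never predicts an extreme. Its accuracy looks excellent, while its hit rate (the fraction of real events that were predicted) is zero.

```python
def accuracy(labels, predictions):
    """Fraction of all cases that were predicted correctly."""
    correct = sum(1 for l, p in zip(labels, predictions) if l == p)
    return correct / len(labels)

def hit_rate(labels, predictions):
    """Fraction of observed events (label 1) that were predicted as events."""
    predicted_for_events = [p for l, p in zip(labels, predictions) if l == 1]
    return sum(predicted_for_events) / len(predicted_for_events)

# Made-up dataset: 95 ordinary cases (0) and 5 extreme events (1).
labels = [0] * 95 + [1] * 5
# A model that has learned to always predict "no extreme event".
predictions = [0] * 100

acc = accuracy(labels, predictions)
hits = hit_rate(labels, predictions)
print(acc, hits)
```

A model trained to maximise overall accuracy on such data has little incentive to predict the rare class, which is one way to understand the underestimation of extremes.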

The fourth key problem to which ML is currently being applied is parametrisation. Completely new stochastic ML approaches have been developed, and their effectiveness, along with their simplicity compared to traditional empirical models, has highlighted promising future applications in (moist) convection [15].

Further applications of ML methods are currently limited by intrinsic problems affecting the ML methods in relation to the challenges posed by weather data sets. While the reduction of the dimensionality of the data by ML techniques has proven highly beneficial for image pattern recognition in the context of weather data, it leads to a marked simplification of the input weather data, since it constrains the input space to individual grid cells in space or time [16]. The recent expansion of ANN into deep learning has provided new methodologies that can address these issues. This has pushed further the capability of ML models within the weather forecast domain, with CNNs proposed as a methodology for extracting complex patterns from large, structured datasets. An example is the CNN model developed by Yunjie Liu and co-authors in 2016 [17] to classify atmospheric rivers in climate datasets (atmospheric rivers are an important physical process for the prediction of extreme rainfall events).

At the same time, Recurrent Neural Networks (RNN), developed for natural language processing, are improving nowcasting techniques by exploiting their excellent ability to work with the temporal dimension of data frames. CNN and RNN have now been combined, as illustrated in Fig. 6, providing the first nowcasting method of this kind for precipitation, using radar data frames as input [18].

While these results show a promising application of ML models to a variety of weather prediction tasks which extend beyond the area of competence of traditional NWPs, such as analysis of ensemble clustering, bias correction, analysis of climate data sets and nowcasting, they also show that ML models are not ready to replace NWP to forecast synoptic-scale and mesoscale weather patterns.

As a matter of fact, NWPs have been developed and improved for over 60 years with the very purpose of simulating the wind, pressure, temperature and other relevant prognostic fields accurately and reliably, so it would be unreasonable to expect ML models to outperform NWPs on such tasks.

It is also true that, as noted earlier, the amount of available data will only grow in the coming decades, so it will be both critical and strategic to develop ML models capable of extracting patterns and interpreting the relationships within such data sets, complementing NWP capabilities. But how long will it be before an ML model is capable of replacing an NWP by crunching the entire set of historical observations of the atmosphere, extracting the patterns and the spatio-temporal relationships between the variables, and then performing weather forecasts?

Acknowledgement: I would like to thank my colleagues and friends Brian Lo, James Fallon, and Gabriel M. P. Perez, for reading and providing feedback on this article.

References

1. https://collection.sciencemuseumgroup.org.uk/objects/co54518/replica-of-torricellis-first-barometer-1643-barometer-replica
1. https://www.semanticscholar.org/paper/High-resolution-numerical-weather-prediction-(NWP)-Allan-Bryan/a40e0ebd388b915bdd357f398baa813b55cef727/figure/6
1. Buizza, R., Houtekamer, P., Pellerin, G., Toth, Z., Zhu, Y. and Wei, M. (2005) A comparison of the ECMWF, MSC, and NCEP global ensemble prediction systems. Mon Weather Rev, 133, 1076 – 1097
1. Lean, H. and Clark, P. (2003) The effects of changing resolution on mesoscale modelling of line convection and slantwise circulations in FASTEX IOP16. Q J R Meteorol Soc, 129, 2255–2278
1. Bishop, C. M. (2006) Pattern Recognition and Machine Learning. Springer
1. https://www.arup.com/projects/met-office-high-performance-computer
1. J. L. Aznarte and N. Siebert, “Dynamic Line Rating Using Numerical Weather Predictions and Machine Learning: A Case Study,” in IEEE Transactions on Power Delivery, vol. 32, no. 1, pp. 335-343, Feb. 2017, doi: 10.1109/TPWRD.2016.2543818.
1. Foley, Aoife M et al. (2012). “Current methods and advances in forecasting of wind power generation”. In: Renewable Energy 37.1, pp. 1–8.
1. McGovern, Amy et al. (2017). “Using artificial intelligence to improve real-time decision making for high-impact weather”. In: Bulletin of the American Meteorological Society 98.10, pp. 2073–2090
1. O’Gorman, Paul A and John G Dwyer (2018). “Using machine learning to parameterize moist convection: Potential for modeling of climate, climate change and extreme events”. In: arXiv preprint arXiv:1806.11037
1. Moghim, Sanaz and Rafael L Bras (2017). “Bias correction of climate modeled temperature and precipitation using artificial neural networks”. In: Journal of Hydrometeorology 18.7, pp. 1867–1884.
1. Camargo, S. J., Robertson, A. W., Gaffney, S. J., Smyth, P. and Ghil, M. (2007). “Cluster analysis of typhoon tracks. Part I: General properties”. In: Journal of Climate 20.14, pp. 3635–3653.
1. Ahijevych, David et al. (2009). “Application of spatial verification methods to idealized and NWP-gridded precipitation forecasts”. In: Weather and Forecasting 24.6, pp. 1485–1497.
1. Berner, Judith et al. (2017). “Stochastic parameterization: Toward a new view of weather and climate models”. In: Bulletin of the American Meteorological Society 98.3, pp. 565–588.
1. Fan, Wei and Albert Bifet (2013). “Mining big data: current status, and forecast to the future”. In: ACM SIGKDD Explorations Newsletter 14.2, pp. 1–5
1. Liu, Yunjie et al. (2016). “Application of deep convolutional neural networks for detecting extreme weather in climate datasets”. In: arXiv preprint arXiv:1605.01156.
1. Shi, Xingjian et al. (2015). “Convolutional LSTM network: A machine learning approach for precipitation nowcasting”. In: Advances in neural information processing systems, pp. 802–810.

## Diagnosing solar wind forecast errors

The solar wind is a continual outflow of charged particles that comes off the Sun, ranging in speed from 250 to 800 km s$^{-1}$. During the first six months of my PhD, I have been investigating the errors in a type of solar wind forecast that uses spacecraft observations, known as corotation forecasts. This was the topic of my first paper, where I focussed on extracting the forecast error that occurs due to a separation in the spacecraft latitude. I found that up to a latitudinal separation of 6 degrees, the error contribution was approximately constant. Above 6 degrees, the error contribution increases as the latitudinal separation increases. In this blog post I will explain the importance of forecasting the solar wind and the principle behind corotation forecasts. I will also explain how this work has wider implications for future space missions and solar wind forecasting.

The term “space weather” refers to the changing conditions in near-Earth space. Extreme space weather events can cause several effects on Earth, such as damaging power grids, disrupting communications, knocking out satellites and harming the health of humans in space or on high-altitude flights (Cannon, 2013). These effects are summarised in Figure 1. It is therefore important to accurately forecast space weather to help mitigate against these effects. Knowledge of the background solar wind is an important aspect of space weather forecasting as it modulates the severity of extreme events. This can be achieved through three-dimensional computer simulations or through more simple methods, such as corotation forecasts as discussed below.

Figure 1. Cosmic rays, solar energetic particles, solar flare radiation, coronal mass ejections and energetic radiation belt particles cause space weather. Subsequently, this produces a number of effects on Earth. Source: ESA.

Solar wind flow is mostly radial away from the Sun, but the fast/slow structure of the solar wind rotates round with the Sun. If you were looking down on the ecliptic plane (where the planets lie, at roughly the Sun’s equator), then you would see a spiral shape of fast and slow solar wind, as in Figure 2. This makes a full rotation in approximately 27 days. As this structure rotates around, it allows us to use observations on this plane as a forecast for a point further on in that rotation, assuming a steady-state solar wind (i.e., the solar wind does not evolve in time). For example, in Figure 2, an observation from the spacecraft represented by the red square could be used as a forecast at Earth (blue circle), some time later. This time depends on the longitudinal separation between the two points, as this determines the time it takes for the Sun to rotate through that angle.
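The timing argument can be written down in a few lines. This is a sketch under the steady-state assumption stated above, using the approximate 27-day rotation period; the function names and the sample speeds are my own illustrative choices, and it assumes the observing spacecraft sits behind the forecast point in the direction of rotation.

```python
def corotation_lag_days(longitude_separation_deg, rotation_period_days=27.0):
    """Time for the solar wind structure to rotate through the given angle."""
    return longitude_separation_deg * rotation_period_days / 360.0

def corotation_forecast(obs_times_days, obs_speeds, longitude_separation_deg):
    """Shift each observation forward in time by the corotation lag."""
    lag = corotation_lag_days(longitude_separation_deg)
    return [(t + lag, v) for t, v in zip(obs_times_days, obs_speeds)]

# A spacecraft 60 degrees behind the forecast point in longitude.
lag = corotation_lag_days(60.0)

# Made-up solar wind speeds (km/s) observed on day 0 and day 1.
forecast = corotation_forecast([0.0, 1.0], [400.0, 650.0], 60.0)
print(lag, forecast)
```

With a 60 degree separation the observed solar wind arrives at the forecast point about four and a half days later, which is the lead time such a forecast provides.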

Figure 2. The spiral structure of the solar wind, which rotates anticlockwise. Here, STA and STB are the STEREO-A and STEREO-B spacecraft respectively. The solar wind shown here is the radial component. Source: HUXt model (Owens et al, 2020).

In my recent paper I have been investigating how the corotation forecast error varies with the latitudinal separation of the observation and forecast points. Latitudinal separation varies throughout the year, and it was theorised that it should have a significant impact on the accuracy of corotation forecasts. I used the two spacecraft from the STEREO mission, which are on the same plane as Earth, and a dataset for near-Earth space. This allowed for six different configurations to compute corotation forecasts, with a maximum latitudinal separation of 14 degrees. I analysed the 18-month period from August 2009 to February 2011 to help eliminate other variables that affect the error. Figure 3 shows the relationship between forecast error and latitudinal separation. Up to approximately 6 degrees, there is no significant relationship between error and latitudinal separation. Above this, however, the error increases approximately linearly with the latitudinal separation.

Figure 3. Variation of forecast error with the latitudinal separation between the spacecraft making the observation and the forecast location. Error bars span one standard error on the mean.

This work has implications for the future Lagrange space weather monitoring mission, due for launch in 2027. The Lagrange spacecraft will be stationed in a gravitational null, 60 degrees in longitude behind Earth on the ecliptic plane. Gravitational nulls occur when the gravitational fields of two or more massive bodies balance out. There are five of these nulls, called the Lagrange points, and locating a spacecraft at one reduces the amount of fuel needed to stay in position. The goal of the Lagrange mission is to provide a side-on view of the Sun-Earth line, but it also presents an opportunity for consistent corotation forecasts to be generated at Earth. However, the Lagrange spacecraft will oscillate in latitude compared to Earth, up to a maximum of about 5 degrees. Since this is below the 6-degree threshold, my results indicate that the error contribution from latitudinal separation would be approximately constant.

The next steps are to use this information to help improve the performance of solar wind data assimilation. Data assimilation (DA) has led to large improvements in terrestrial weather forecasting and is beginning to be used in space weather forecasting. DA combines observations and model output to find an optimum estimation of reality. The latitudinal information found here can be used to inform the DA scheme how to better handle the observations and to, hopefully, produce an improved solar wind representation.

The work I have discussed here has been accepted into the AGU Space Weather journal and is available at https://agupubs.onlinelibrary.wiley.com/doi/epdf/10.1029/2021SW002802.

References

Cannon, P.S., 2013. Extreme space weather – A report published by the UK royal academy of engineering. Space Weather, 11(4), 138-139.  https://agupubs.onlinelibrary.wiley.com/doi/full/10.1002/swe.20032

Owens, M.J., Lang, M.S., Barnard, L., Riley, P., Ben-Nun, M., Scott, C.J., Lockwood, M., Reiss, M.A., Arge, C.N. & Gonzi, S., 2020. A Computationally Efficient, Time-Dependent Model of the Solar Wind for use as a Surrogate to Three-Dimensional Numerical Magnetohydrodynamic Simulations. Solar Physics, 295(3), https://doi.org/10.1007/s11207-020-01605-3

## Connecting Global to Local Hydrological Modelling Forecasting – Virtual Workshop

ECMWF – CEMS – C3S – HEPEX – GFP

What was it?

The workshop was organised under the umbrella of ECMWF, the Copernicus services CEMS and C3S, the Hydrological Ensemble Prediction EXperiment (HEPEX) and the Global Flood Partnership (GFP). The workshop lasted 3 days, with a keynote speaker followed by Q&A at the start of each of the 6 sessions. Each keynote talk focused on a different part of the forecast chain, from hybrid hydrological forecasting to the use of forecasts for anticipatory humanitarian action, and how the global and local hydrological scales could be linked. Following this were speedy poster pitches from around the world and poster presentations and discussion in the virtual ECMWF (Gather.town).

Gwyneth – I presented “Evaluating the post-processing of the European Flood Awareness System’s medium-range streamflow forecasts” in Session 2 – Catchment-scale hydrometeorological forecasting: from short-range to medium-range. My poster showed the results of the recent evaluation of the post-processing method used in the European Flood Awareness System. Post-processing is used to correct errors and account for uncertainties in the forecasts and is a vital component of a flood forecasting system. By comparing the post-processed forecasts with observations, I was able to identify where the forecasts were most improved.

Helen – I presented “An evaluation of ensemble forecast flood map spatial skill” in Session 3 – Monitoring, modelling and forecasting for flood risk, flash floods, inundation and impact assessments. The ensemble approach to forecasting flooding extent and depth is ideal due to the highly uncertain nature of extreme flooding events. The flood maps are linked directly to probabilistic population impacts to enable timely, targeted release of funding. The Flood Foresight System forecast flood inundation maps are evaluated by comparison with satellite-based SAR-derived flood maps so that the spatial skill of the ensemble can be determined.

What did you find most interesting at the workshop?

Gwyneth – All the posters! Every session had a wide range of topics being presented and I really enjoyed talking to people about their work. The keynote talks at the beginning of each session were really interesting and thought-provoking. I especially liked the talk by Dr Wendy Parker about a fitness-for-purpose approach to evaluation which incorporates how the forecasts are used and who is using the forecast into the evaluation.

Helen – Lots! All of the keynote talks were excellent and inspiring. The latest developments in detecting flooding from satellites include processing the data using machine learning algorithms directly onboard, before beaming the flood map back to Earth! If openly available and accessible (this came up quite a bit), this could rapidly decrease the time it takes for flood maps to reach flood risk managers dealing with the incident, and to be used in improving flood forecasting models.

How was your virtual poster presentation/discussion session?

Gwyneth – It was nerve-racking to give the mini-pitch to 200+ people, but the poster session in Gather.town was great! The questions and comments I got were helpful, and it was also nice to have conversations on non-research-based topics and to meet some of the EC-HEPEXers (early career members of the Hydrological Ensemble Prediction Experiment). The sessions felt more natural than a lot of the virtual conferences I have been to.

Helen – I really enjoyed choosing my hairdo and outfit for my mini self. I’ve not actually experienced a ‘real’ conference/workshop but compared to other virtual events this felt quite realistic. I really enjoyed the Gather.town setting, especially the duck pond (although the ducks couldn’t swim or quack! :)). It was great to have the chance to talk about my work and meet a few people, and some thought-provoking questions are always useful.

## Forecasting space weather using “similar day” approach

Space weather is a natural threat that requires good quality forecasting with as much lead time as possible. In this post I outline the simple and understandable analogue ensemble (AnEn) or “similar day” approach to forecasting. I focus mainly on exploring the method itself and, although this work forecasts space weather through a timeseries of ground-level observations, AnEn can be applied to many prediction tasks, particularly time series with strong auto-correlation. AnEn has previously been used to predict wind speed [1], temperature [1] and solar wind [2]. The code for AnEn is available at https://github.com/Carl-Haines/AnalogueEnsemble should you wish to try out the method for your own application.

The idea behind AnEn is to take a set of recent observations, look back in a historic dataset for analogous periods, then take what happened following those analogous periods as the forecast. If multiple analogous periods are used, then an ensemble of forecasts can be created giving a distribution of possible outcomes with probabilistic information.

Figure 1 – An example of AnEn applied to a space weather event with forecast time t0. The black line shows the observations, the grey line shows the ensemble members, the red line shows the median of the ensemble and the yellow and green lines are reference forecasts.

Figure 1 is an example of a forecast made using the AnEn method where the forecast is made at t0. The 24 hours of observations (black) prior to t0 are matched to similar periods in the historic dataset (grey). Here I have chosen to give the most recent observations the most weighting as they hold the most relevant information. The grey analogue lines then flow on after t0, forming the forecast. Combined, these form an ensemble, and the median of these is shown in red. The forecast can be chosen to be the median (or any percentile) of the ensemble, or a probability of an event occurring can be given by counting how many of the ensemble members do or don’t experience the event.
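The steps above can be sketched in a few lines of Python. This is not the code from the repository linked earlier, only a minimal illustration: the linear weighting ramp, the weighted squared-difference distance and all function names are assumptions made for the example, and the toy series is invented.

```python
from statistics import median

def analogue_ensemble(history, recent, lead, n_members):
    """Return what followed the n_members historic windows most like `recent`."""
    window = len(recent)
    # Assumed weighting: a linear ramp, so the newest observation counts most.
    weights = [i + 1 for i in range(window)]
    scored = []
    for start in range(len(history) - window - lead + 1):
        segment = history[start:start + window]
        distance = sum(w * (s - r) ** 2 for w, s, r in zip(weights, segment, recent))
        scored.append((distance, start))
    scored.sort()  # best matches (smallest distance) first
    return [history[s + window:s + window + lead] for _, s in scored[:n_members]]

def ensemble_median(members):
    """Median across members at each lead time."""
    return [median(values) for values in zip(*members)]

def event_probability(members, threshold):
    """Fraction of members that exceed the threshold at any lead time."""
    hits = sum(1 for m in members if max(m) > threshold)
    return hits / len(members)

# Toy historic series in which the pattern 0, 1, 2 occurs three times.
history = [0.0, 1.0, 2.0, 3.0, 0.0, 1.0, 2.0, 5.0, 0.0, 1.0, 2.0, 4.0]
recent = [0.0, 1.0, 2.0]  # observations just before the forecast time t0

members = analogue_ensemble(history, recent, lead=1, n_members=3)
forecast = ensemble_median(members)
probability = event_probability(members, threshold=4.5)
print(members, forecast, probability)
```

Each member is simply what happened after one analogous period, so the spread of the members reflects how differently similar situations have evolved in the past.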

Figure 1 also shows two reference forecasts, namely 27-day recurrence and climatology, as benchmarks to beat. 27-day recurrence uses the observation from 27 days ago as the forecast for today. This is reasonable because the Sun rotates approximately every 27 days as seen from Earth, so, broadly speaking, the same part of the Sun is emitting the relevant solar wind again, provided the solar wind structure persists on timescales longer than 27 days.

To quantify how well AnEn works as a forecast, I ran the forecast on the entire dataset by repeatedly changing the forecast time t0 and applied two metrics, namely mean absolute error (MAE) and skill, to the median of the ensemble members. The absolute error is the size of the difference between the forecast made by AnEn (taken as the median of the ensemble) and what was actually observed. Taking the mean of the absolute errors over all the forecasts gives one MAE value for each lead time. Figure 2 shows the MAE for the AnEn median and the reference forecasts. We see that AnEn has the smallest (best) MAE at short lead times and outperforms the reference forecasts for all lead times up to a week.

Figure 2 – The mean absolute error of the AnEn median and reference forecasts.

An error metric such as MAE cannot take into account that certain conditions are inherently more difficult to forecast such as storm times. For this we can use a skill metric defined by

${\text{Skill} = 1 - \frac{\text{Forecast error}}{\text{Reference error}}}$

where in this case we use climatology as the reference forecast. Skill can take any value between $-\infty$ and $1$, where a perfect forecast would receive a value of $1$ and a forecast that is no better than the reference would receive a value of $0$. A negative value of skill signifies that the forecast is worse than the reference forecast.
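As a small worked example of the two metrics, the sketch below computes the MAE at each lead time over a set of forecasts and then the skill relative to a reference error. All numbers are invented for illustration and the function names are my own.

```python
from statistics import mean

def mae_by_lead_time(forecasts, observations):
    """MAE at each lead time; forecasts[i][k] is forecast i at lead time k."""
    n_leads = len(forecasts[0])
    return [
        mean(abs(f[k] - o[k]) for f, o in zip(forecasts, observations))
        for k in range(n_leads)
    ]

def skill(forecast_error, reference_error):
    """Skill = 1 - forecast error / reference error."""
    return 1.0 - forecast_error / reference_error

# Two forecasts (made at different times t0), each with two lead times.
forecasts = [[1.0, 2.0], [3.0, 4.0]]
observations = [[1.0, 4.0], [2.0, 4.0]]

forecast_mae = mae_by_lead_time(forecasts, observations)
# Hypothetical MAE of the reference (e.g. climatology) at the same lead times.
reference_mae = [1.0, 1.0]

skills = [skill(f, r) for f, r in zip(forecast_mae, reference_mae)]
print(forecast_mae, skills)
```

In this toy case the forecast halves the reference error at the first lead time (skill 0.5) and only matches it at the second (skill 0), mirroring the way skill typically decays with lead time.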

Figure 3 shows the skill of AnEn and 27-day recurrence with respect to climatology. We see that AnEn is most skilful for short lead times and outperforms 27-day recurrence for all lead times considered.

Figure 3 – The skill of the AnEn median and 27-day recurrence with respect to climatology.

In summary, the analogue ensemble forecast method matches current conditions with historical events and lifts the previously seen timeseries as the prediction. AnEn seems to perform well for this application and outperforms the reference forecasts of climatology and 27-day recurrence. The code for AnEn is available at https://github.com/Carl-Haines/AnalogueEnsemble

The work presented here makes up part of a paper that is under review in the journal Space Weather.

Here, AnEn has been applied to a dataset from the space weather domain. If you would like to find out more about space weather then take a look at these previous blog posts from Shannon Jones (https://socialmetwork.blog/2018/04/13/the-solar-stormwatch-citizen-science-project/) and me (https://socialmetwork.blog/2019/11/15/the-variation-of-geomagnetic-storm-duration-with-intensity/).

[1] Delle Monache, L., Eckel, F. A., Rife, D. L., Nagarajan, B., & Searight, K. (2013). Probabilistic Weather Prediction with an Analog Ensemble. doi: 10.1175/mwr-d-12-00281.1

[2] Owens, M. J., Riley, P., & Horbury, T. S. (2017a). Probabilistic Solar Wind and Geomagnetic Forecasting Using an Analogue Ensemble or “Similar Day” Approach. doi: 10.1007/s11207-017-1090-7

## Extending the predictability of flood hazard at the global scale

When I started my PhD, there were no global scale operational seasonal forecasts of river flow or flood hazard. Global overviews of upcoming flood events are key for organisations working at the global scale, from water resources management to humanitarian aid, and for regions where no other local or national forecasts are available. While GloFAS (the Global Flood Awareness System, run by the European Centre for Medium-Range Weather Forecasts (ECMWF) and the European Commission Joint Research Centre (JRC) as part of the Copernicus Emergency Management Services) was producing operational, openly-available flood forecasts out to 30 days ahead, there was a need for more extended-range forecast information. Often, due to a lack of hydrological forecasts, seasonal rainfall forecasts are used as a proxy for flood hazard – however, the link between precipitation and floodiness is nonlinear, and recent research has shown that seasonal rainfall forecasts are not necessarily the best indicator of potential flood hazard. The aim of my PhD research was to look into ways in which we could provide earlier warning information, several weeks to months ahead, using hydrological analysis in addition to the meteorology.

Broadly speaking, there are two key ways in which to provide early warning information on seasonal timescales: (1) through statistical analysis based on large-scale climate variability and teleconnections, and (2) by producing dynamical seasonal forecasts using coupled ocean-atmosphere GCMs. Over the past 4.5 years, I worked on providing hydrologically-relevant seasonal forecast products using these two approaches, at the global scale. This blog post will give a quick overview of the two new forecast products we produced as part of this research!

Can we use El Niño to predict flood hazard?

ENSO (the El Niño Southern Oscillation) is known to influence river flow and flooding across much of the globe, and often, statistical historical probabilities of extreme precipitation during El Niño and La Niña (the extremes of ENSO climate variability) are used to provide information on likely flood impacts. Due to its global influence on weather and climate, we decided to assess whether it is possible to use ENSO as a predictor of flood hazard at the global scale, by assessing the links between ENSO and river flow globally, and estimating historical probabilities for high and low river flow equivalent to those that are already used for meteorological variables.

With a lack of sufficient river flow observations across much of the globe, we needed to use a reanalysis dataset – but global reanalysis datasets for river flow are few and far between, and none extended beyond ~40 years (which includes a sample of ≤10 El Niños and ≤13 La Niñas). We ended up producing a 20th Century global river flow reconstruction, by forcing the CaMa-Flood hydrological model with ECMWF’s ERA-20CM atmospheric reconstruction, to produce a 10-member river flow dataset covering 1901-2010, which we called ERA-20CM-R.

Using this dataset, we calculated the percentage of past El Niño and La Niña events, during which the monthly mean river flow exceeded a high flow threshold (the 75th percentile of the 110-year climatology) or fell below a low flow threshold (the 25th percentile), for each month of an El Niño / La Niña. This percentage is then taken as the probability that high or low flow will be observed in future El Niño/La Niña events. Maps of these probabilities are shown above, for El Niño, and all maps for both El Niño and La Niña can be found here. When comparing to the same historical probabilities calculated for precipitation, it is evident that additional information can be gained from considering the hydrology. For example, the River Nile in northern Africa is likely to see low river flow, even though the surrounding area is likely to see more precipitation – because it is influenced more by changes in precipitation upstream. In places that are likely to see more precipitation but in the form of snow, there would be no influence on river flow or flood hazard during the time when more precipitation is expected. However, several months later, there may be no additional precipitation expected, but there may be increased flood hazard due to the melting of more snow than normal – so we’re able to see a lagged influence of ENSO on river flow in some regions.
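The probability calculation described above can be sketched in Python for a single grid cell and a single calendar month. This is a simplified illustration, not the code used in the study: the function names, the dictionary layout and the linear-interpolation percentile are my own assumptions, and it handles one time series rather than the 10 ensemble members of ERA-20CM-R.

```python
def percentile(values, q):
    """Percentile with linear interpolation between ranked values (assumed method)."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def enso_flow_probabilities(flow_by_year, event_years):
    """Percentage of past events with flow above the 75th / below the 25th percentile.

    flow_by_year maps year -> monthly mean river flow for one calendar month.
    event_years lists the years classed as El Nino (or La Nina) events.
    """
    flows = list(flow_by_year.values())
    high_threshold = percentile(flows, 75)
    low_threshold = percentile(flows, 25)
    event_flows = [flow_by_year[year] for year in event_years]
    n_events = len(event_flows)
    p_high = 100.0 * sum(f > high_threshold for f in event_flows) / n_events
    p_low = 100.0 * sum(f < low_threshold for f in event_flows) / n_events
    return p_high, p_low


# Toy climatology: flows of 1..9 for the years 1901..1909, with four made-up event years.
flow_by_year = {1901 + i: i + 1 for i in range(9)}
print(enso_flow_probabilities(flow_by_year, [1901, 1905, 1908, 1909]))
```

Repeating this for every grid cell and every month of an event gives the kind of probability maps discussed above, and the lagged snowmelt signal would show up as high probabilities in later months.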

While there are locations where these probabilities are high and can provide a useful forecast of hydrological extremes, across much of the globe, the probabilities are lower and much more uncertain (see here for more info on uncertainty in these forecasts) than might be useful for decision-making purposes.

Providing openly-available seasonal river flow forecasts, globally

For the next ‘chapter’ of my PhD, we looked into the feasibility of providing seasonal forecasts of river flow at the global scale. Providing global-scale flood forecasts in the medium-range has only become possible in recent years, and extended-range flood forecasting was highlighted as a grand challenge and likely future development in hydro-meteorological forecasting.

To do this, I worked with Ervin Zsoter at ECMWF, to drive the GloFAS hydrological model (Lisflood) with reforecasts from ECMWF’s latest seasonal forecasting system, SEAS5, to produce seasonal forecasts of river flow. We also forced Lisflood with the new ERA5 reanalysis, to produce an ERA5-R river flow reanalysis with which to initialise Lisflood, and to provide a climatology. The system set-up is shown in the flowchart below.

I also worked with colleagues at ECMWF to design forecast products for a GloFAS seasonal outlook, based on a combination of features from the GloFAS flood forecasts, and the EFAS (the European Flood Awareness System) seasonal outlook, and incorporating feedback from users of EFAS.

After ~1 year of working on getting the system set up and finalising the forecast products, including a four-month research placement at ECMWF, the first GloFAS-Seasonal forecast was released in November 2017, with the release of SEAS5. GloFAS-Seasonal is now running operationally at ECMWF, providing forecasts of high and low weekly-averaged river flow for the global river network, up to 4 months ahead, with 3 new forecast layers available through the GloFAS interface. These provide a forecast overview for 307 major river basins, a map of the forecast for the entire river network at the sub-basin scale, and ensemble hydrographs at thousands of locations across the globe (which change with each forecast depending on forecast probabilities). New forecasts are produced once per month, and released on the 10th of each month. You can find more information on each of the different forecast layers and the system set-up here, and you can access the (openly available) forecasts here. ERA5-R, ERA-20CM-R and the GloFAS-Seasonal reforecasts are also all freely available – just get in touch! GloFAS-Seasonal will continue to be developed by ECMWF and the JRC, and has already been updated to v2.0, including a calibrated version of the hydrological model.

So, over the course of my PhD, we developed two new seasonal forecasts for hydrological extremes, at the global scale. You may be wondering whether they’re skilful, or in fact, which one provides the most useful forecasts! For information on the skill or ‘potential usefulness’ of GloFAS-Seasonal, head to our paper, and stay tuned for a paper coming soon (hopefully! [update: this paper has just been accepted and can be accessed online here]) on the ‘most useful approach for forecasting hydrological extremes during El Niño’, in which we compare the skill of the two forecasts at predicting observed high and low flow events during El Niño.

With thanks to my PhD supervisors & co-authors:

Hannah Cloke1, Liz Stephens1, Florian Pappenberger2, Steve Woolnough1, Ervin Zsoter2, Peter Salamon3, Louise Arnal1,2, Christel Prudhomme2, Davide Muraro3

1University of Reading, 2ECMWF, 3European Commission Joint Research Centre

## The Circumglobal Teleconnection and its Links to Seasonal Forecast Skill for the European Summer

Recent extreme weather events such as the central European heatwave in 2003, flooding in the UK in 2007, and even the recent dry summer in the UK in 2018, have highlighted the need for more accurate long-range forecasts for the European summer. Recent research has led to improvements in European winter seasonal forecasts; however, summer forecast skill remains relatively low. One potential source of predictability for Europe is the Indian summer monsoon, which can affect European weather via a global wave train known as the “Circumglobal Teleconnection” (CGT).

The CGT was first identified by Ding and Wang (2005) as having a major role in modulating observed weather patterns in the Northern Hemisphere summer. Using a 200 hPa geopotential height index centred in west-central Asia (35°-40°N, 60°-70°E), they constructed a one-point correlation map of geopotential height with reference to this index (reproduced in Figure 1). From this, they identified a wavenumber-5 structure where the pressure variations over the Northeast Atlantic, East Asia, North Pacific and North America are all nearly in phase with the variations over west-central Asia (these are known as the “centres of action”). They also showed that the CGT is associated with significant temperature and precipitation anomalies in Europe, so accurate representation of this mechanism in seasonal forecast models could provide an important source of subseasonal to seasonal forecast skill.
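For readers unfamiliar with one-point correlation maps, the idea is simply to correlate the index time series with the time series at every grid point. Below is a minimal Python sketch using nested lists. The function names and the `field[t][j][i]` layout are assumptions for illustration, and real analyses would normally use an array library and detrended seasonal-mean data.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length series (fails for a constant series)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def one_point_correlation_map(field, index):
    """field[t][j][i] is the height at time t, latitude j, longitude i.

    index[t] is the base-point series, e.g. height averaged over west-central Asia.
    Returns a 2D list of correlations with the same latitude/longitude shape.
    """
    n_times = len(field)
    n_lat = len(field[0])
    n_lon = len(field[0][0])
    return [
        [pearson([field[t][j][i] for t in range(n_times)], index) for i in range(n_lon)]
        for j in range(n_lat)
    ]


# Toy field: one latitude, two longitudes, four years.
index = [1.0, 2.0, 3.0, 4.0]
field = [[[2.0, 4.0]], [[4.0, 3.0]], [[6.0, 2.0]], [[8.0, 1.0]]]
print(one_point_correlation_map(field, index))
```

In the toy example the first grid point varies in phase with the index and the second varies exactly out of phase, which is how centres of action of either sign appear on the map.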

The model used here is a version of the European Centre for Medium-Range Weather Forecasts (ECMWF)’s coupled seasonal forecast model. Reforecasts are initialised on 1st May and are run for four months, so cover May-August, with start dates from 1981-2014. The skill of the model 200 hPa geopotential height is shown in Figure 2, defined as the correlation between the model ensemble mean and ERA-Interim. The model has good skill in May (to be expected given that the reforecasts are initialised in May) but in June, July and August areas of zero or negative correlation develop across much of the northern hemisphere extratropics. The areas of reduced skill align closely with the location of the centres of action of the CGT shown in Figure 1, suggesting that there is a link between the model skill and the model representation of the CGT.

To determine how well the model represents the CGT, Figure 3 shows the correlation between the west-central Asia index region of Ding and Wang (hereafter D&W) and the other centres of action of the CGT, as defined in Figure 1. Focussing on August (as August has the strongest CGT pattern), it can be seen that the model correlations, indicated by the box and whisker plots, are weaker than in observations (red diamond) for the D&W vs. North Pacific (NPAC), North America (NAM) and Northwest Europe (NWEUR) regions. This indicates that the model has a weak representation of the wavetrain associated with the CGT.

There are likely to be several reasons for the weak representation of the CGT in the model. One important factor is the presence of a northerly jet bias in the model across much of the Northern Hemisphere. This can be seen in Figure 4, which shows the model jet biases relative to ERA-Interim in the coloured contours, and the observed zonal wind in the black contours. The dipole structure of the biases which exists across much of the hemisphere, particularly in June, July and August, indicates that the model jet stream is located too far to the north. This means that Rossby waves forced in this region will have different wave propagation characteristics to reality – they may propagate at the incorrect speed, in the wrong direction or may not propagate at all, and this is likely to be an important factor in the weak representation of the CGT in the model.

Other potential factors involved are a poor representation of the link between monsoon precipitation and the geopotential height in west-central Asia (which was shown by Ding and Wang (2007) to be important in the maintenance of the CGT) and errors in the forcing of Rossby waves associated with the monsoon. For a more detailed explanation of these, see my paper in Climate Dynamics (Beverley et al. 2018). It seems likely that the pattern of reduced skill in Figure 2, with negative correlations located at the centres of action of the CGT, including over Europe, is related to the poor representation of the CGT in the model. This raises the question of whether an improvement in the model’s representation of the CGT would lead to an improvement in forecast skill for the European summer. To address this question, sensitivity experiments have been carried out, in which the observed circulation is imposed in several centres of action along the CGT pathway to explore the impact on forecast skill for European summer weather.

References

Beverley, J. D., S. J. Woolnough, L. H. Baker, S. J. Johnson and A. Weisheimer, 2018: The northern hemisphere circumglobal teleconnection in a seasonal forecast model and its relationship to European summer forecast skill. Clim. Dyn. https://doi.org/10.1007/s00382-018-4371-4

Ding, Q., and B. Wang, 2005: Circumglobal teleconnection in the northern hemisphere summer. J. Clim. 18, 3483–3505.

Ding, Q., and B. Wang, 2007: Intraseasonal teleconnection between the summer Eurasian wave train and the Indian monsoon. J. Clim. 20, 3751-3767. https://doi.org/10.1175/JCLI4221.1

## APPLICATE General Assembly and Early Career Science event

From 28th January to 1st February I attended the APPLICATE (Advanced Prediction in Polar regions and beyond: modelling, observing system design and LInkages associated with a Changing Arctic climaTE (bold choice)) General Assembly and Early Career Science event at ECMWF in Reading. APPLICATE is one of the EU Horizon 2020 projects with the aim of improving weather and climate prediction in the polar regions. The Arctic is a region of rapid change, with decreases in sea ice extent (Stroeve et al., 2012) and changes to ecosystems (Post et al., 2009). These changes are leading to increased interest in the Arctic for business opportunities such as the opening of shipping routes (Aksenov et al., 2017). There is also a lot of current work being done on the link between changes in the Arctic and mid-latitude weather (Cohen et al., 2014), although there is still much uncertainty. These changes could have large impacts on human life, so there needs to be a concerted scientific effort to develop our understanding of Arctic processes and how they link to the mid-latitudes. This is the gap that APPLICATE aims to fill.

The overarching goal of APPLICATE is to develop enhanced predictive capacity for weather and climate in the Arctic and beyond, and to determine the influence of Arctic climate change on Northern Hemisphere mid-latitudes, for the benefit of policy makers, businesses and society.

APPLICATE Goals & Objectives

Attending the General Assembly was a great opportunity to get an insight into how large scientific projects work. The project is made up of different work packages each with a different focus. Within these work packages there are then a set of specific tasks and deliverables spread out throughout the project. At the GA there were a number of breakout sessions where the progress of the working groups was discussed. It was interesting to see how these discussions worked and how issues, such as the delay in CMIP6 experiments, are handled. The General Assembly also allows the different work packages to communicate with each other to plan ahead, and for results to be shared.

One of the big questions APPLICATE is trying to address is the link between Arctic sea-ice and the Northern Hemisphere mid-latitudes. Many of the presentations covered different aspects of this, such as how including Arctic observations in forecasts affects their skill over Eurasia. There were also initial results from some of the Polar Amplification (PA)MIP experiments, a project that APPLICATE has helped coordinate.

At the end of the week there was the Early Career Science Event, which consisted of a number of talks focused more on soft skills. One of the most interesting activities was based around engaging with stakeholders. To try and understand the different needs of a variety of stakeholders in the Arctic (from local communities to shipping companies), we had to lobby for different policies on their behalf. This was also a great chance to meet other early career scientists working in the field and get to know each other a bit more.

What a difference a day makes, heavy snow getting the ECMWF’s ducks in the polar spirit.

#### References

Aksenov, Y. et al., 2017. On the future navigability of Arctic sea routes: High-resolution projections of the Arctic Ocean and sea ice. Marine Policy, 75, pp.300–317.

Cohen, J. et al., 2014. Recent Arctic amplification and extreme mid-latitude weather. Nature Geoscience, 7(9), pp.627–637.

Post, E. et al., 2009. Ecological Dynamics Across the Arctic Associated with Recent Climate Change. Science, 325(September), pp.1355–1358.

Stroeve, J.C. et al., 2012. Trends in Arctic sea ice extent from CMIP5, CMIP3 and observations. Geophysical Research Letters, 39(16), pp.1–7.

## Evaluating aerosol forecasts in London

Aerosols in urban areas can greatly impact visibility, radiation budgets and our health (Chen et al., 2015). Aerosols are the liquid and solid particles in the air that, alongside noxious gases like nitrogen dioxide, make up the pollution in cities that we often hear about on the news, breaking safety limits in cities across the globe from London to Beijing. Air quality researchers try to monitor and predict aerosols, to inform local councils so they can plan and reduce local emissions.

Recently, large numbers of LiDARs (Light Detection and Ranging) have been deployed across Europe and elsewhere, in part to observe aerosols. They effectively shoot beams of light into the atmosphere, which reflect off atmospheric constituents like aerosols. From each beam, many measurements of reflectance are taken in very quick succession, and because light that has travelled further takes longer to return, an entire profile of reflectance with distance along the beam can be constructed. As the penetration of light into the atmosphere decreases with distance, the reflected light is usually called attenuated backscatter (β). In urban areas, measurements away from the surface like these are sorely needed (Barlow, 2014), so these instruments could be extremely useful.

When it comes to predicting aerosols, numerical weather prediction (NWP) models are increasingly being considered as an option. However, the models themselves are very computationally expensive to run, so they tend to only have a simple representation of aerosol. For example, for explicitly resolved aerosol, the Met Office UKV model (1.5 km) just has a dry mass of aerosol [kg kg⁻¹] (Clark et al., 2008). That’s all. It gets transported around by the model dynamics, but any other aerosol characteristics, from size to number, need to be parameterised from the mass, to limit computation costs. So how do we know if the estimates of aerosol from the model are actually correct? A direct comparison between NWP aerosol and β is not possible because, fundamentally, they are different variables. To bridge the gap, a forward operator is needed.
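The way a LiDAR turns timing into a profile, as described above, can be illustrated with a short Python sketch. It relies only on the fact that light travels out and back, so the distance is half the round-trip time multiplied by the speed of light. The function names and the 100 ns sampling interval are hypothetical, and the sketch ignores everything to do with calibrating or correcting the signal itself.

```python
SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def range_from_delay(delay_s):
    """Distance to the scatterer for a given round-trip delay in seconds."""
    return SPEED_OF_LIGHT * delay_s / 2.0


def range_gates(sample_interval_s, n_samples):
    """Distance assigned to each successive sample after the pulse is emitted."""
    return [range_from_delay((k + 1) * sample_interval_s) for k in range(n_samples)]


# With a hypothetical 100 ns sampling interval, each sample is about 15 m further away.
for gate in range_gates(100e-9, 3):
    print(round(gate, 2), "m")
```

Pairing each of these distances with the reflectance measured at that instant is what gives the profile of attenuated backscatter.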

In my PhD I helped develop such a forward operator (aerFO, Warren et al., 2018). It’s a model that takes aerosol mass (and relative humidity) from NWP model output, and estimates what the attenuated backscatter would be as a result (βm). This modelled βm can then be compared directly to the observed attenuated backscatter (βo), and the NWP aerosol output evaluated (e.g. to see if the aerosol is too high or low). The aerFO was also made to be computationally cheap and flexible, so if you had more information than just the mass, the aerFO would be able to use it!

One of the aerFO’s several uses (Warren et al., 2018, n.d.) was the evaluation of NWP model output. Figure 2 shows the aerFO in action with a comparison between βm and observed attenuated backscatter (βo) measured at 905 nm from a ceilometer (a type of LiDAR) on 14th April 2015 at Marylebone Road in London. βm was far too high in the morning on this day. We found that the original scheme the UKV used to parameterise the urban surface effects in London was leading to a persistent cold bias in the morning. The cold bias led to a relative humidity that was too high, so the aerFO condensed more water than necessary onto the aerosol particles, causing them to swell up too much. Bigger particles mean a bigger βm, and hence an overestimation. Not only was the relative humidity too high, the boundary layer in the NWP model was developing too late in the day as well. Normally, when the surface warms up enough, convection starts, which acts to mix aerosol up in the boundary layer and dilute it near the surface. However, the cold bias delayed this boundary layer development, so the aerosol concentration near the surface remained high for too long. More mass led to the aerFO parameterising larger sizes and total numbers of particles, and so to an overestimated βm. This cold bias effect was reflected across several cases using the old scheme but was notably smaller for cases using a newer urban surface scheme called MORUSES (Met Office – Reading Urban Surface Exchange Scheme). One of the main aims for MORUSES was to improve the representation of energy transfer in urban areas, and at least to us it seemed like it was doing a better job!

References

Barlow, J.F., 2014. Progress in observing and modelling the urban boundary layer. Urban Clim. 10, 216–240. https://doi.org/10.1016/j.uclim.2014.03.011

Chen, C.H., Chan, C.C., Chen, B.Y., Cheng, T.J., Leon Guo, Y., 2015. Effects of particulate air pollution and ozone on lung function in non-asthmatic children. Environ. Res. 137, 40–48. https://doi.org/10.1016/j.envres.2014.11.021

Clark, P.A., Harcourt, S.A., Macpherson, B., Mathison, C.T., Cusack, S., Naylor, M., 2008. Prediction of visibility and aerosol within the operational Met Office Unified Model. I: Model formulation and variational assimilation. Q. J. R. Meteorol. Soc. 134, 1801–1816. https://doi.org/10.1002/qj.318

Warren, E., Charlton-Perez, C., Kotthaus, S., Lean, H., Ballard, S., Hopkin, E., Grimmond, S., 2018. Evaluation of forward-modelled attenuated backscatter using an urban ceilometer network in London under clear-sky conditions. Atmos. Environ. 191, 532–547. https://doi.org/10.1016/j.atmosenv.2018.04.045

Warren, E., Charlton-Perez, C., Kotthaus, S., Marenco, F., Ryder, C., Johnson, B., Lean, H., Ballard, S., Grimmond, S., n.d. Observed aerosol characteristics to improve forward-modelled attenuated backscatter. Atmos. Environ. Submitted