Machine Learning: complement or replacement of Numerical Weather Prediction? 

Emanuele Silvio Gentile

Figure 1 Replica of the first 1643 Torricelli barometer [1]

Humans have tried, for millennia, to predict the weather by finding physical relationships between observed weather events, a notable example being a fall in barometric pressure used as an indicator of an upcoming precipitation event. It should come as no surprise that one of the first weather measuring instruments to be invented was the barometer, by Torricelli (see Fig. 1 for a replica of the first Torricelli barometer), nearly concurrently with a reliable thermometer. Some two hundred years later, the development of the electric telegraph allowed for a nearly instant exchange of weather data, leading to the creation of the first synoptic weather maps in the US, followed by Europe. Synoptic maps allowed amateur and professional meteorologists to look at patterns between weather data in a way that was unprecedentedly effective for the time, allowing the American meteorologists Redfield and Espy to resolve the dispute on which way the air flowed in a hurricane (anticlockwise in the Northern Hemisphere).

Figure 2 High Resolution NWP – model concept [2]

By the beginning of the 20th century many countries around the globe started to exchange data daily (thanks to the recently laid telegraphic cables) leading to the creation of global synoptic maps, with information in the upper atmosphere provided by radiosondes, aeroplanes, and in the 1930s radars. By then, weather forecasters had developed a large set of experimental and statistical rules on how to compute the changes to daily synoptic weather maps looking at patterns between historical sets of synoptic daily weather maps and recorded meteorological events, but often, prediction of events days in advance remained challenging.

In 1954, a powerful tool became available to humans to objectively compute changes on the synoptic map over time: Numerical Weather Prediction models. NWPs solve numerically the primitive equations, a set of nonlinear partial differential equations that approximate the global atmospheric flow, using as initial conditions a snapshot of the state of the atmosphere, termed analysis, provided by a variety of weather observations. The 1960s, marked by the launch of the first satellites, enabled 5-7 days global NWP forecasts to be performed. Thanks to the work of countless scientists over the past 40 years, global NWP models, running at a scale of about 10km, can now simulate skilfully and reliably synoptic-scale and meso-scale weather patterns, such as high-pressure systems and midlatitude cyclones, with up to 10 days of lead time [3].

The relatively recent adoption of limited-area convection-permitting models (Fig. 2) has made possible even the forecast of local details of weather events. For example, convection-permitting forecasts of midlatitude cyclones can accurately predict small-scale multiple slantwise circulations, the 3-D structure of convection lines, and the peak cyclone surface wind speed [4].

However, physical processes below convection-permitting resolution, such as wind gusts, which present an environmental risk to lives and livelihoods, cannot be explicitly resolved, although they can be derived from prognostic fields such as wind speed and pressure. Alternative techniques, such as statistical modelling (Malone model), have not yet matched (and are nowhere near) the power of numerical solvers of physical equations in simulating the dynamics of the atmosphere in space and time.

Figure 3 Error growth over time [5]

NWPs are not without flaws, as they are affected by numerical drawbacks: errors in the prognostic atmospheric fields build up through time, as shown in Fig. 3, reaching a comparable forecast error to that of a persisted forecast, i.e. at each time step the forecast is constant, and of a climatology-based forecast, i.e. mean based on historical series of observations/model outputs. Errors build up because NWPs iteratively solve the primitive equations approximating the atmospheric flow (either by finite differences or spectral methods). Sources of these errors are: too coarse model resolution (which leads to incorrect representation of topography), long integration time steps, and small-scale/sub-grid processes which are unresolved by the model physics approximations. Errors in parametrisations of small-scale physical processes grow over time, leading to significant deterioration of the forecast quality after 48h. Therefore, high-fidelity parametrisations of unresolved physical processes are critical for an accurate simulation of all types of weather events.
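The two reference forecasts mentioned above, persistence and climatology, are simple enough to sketch in a few lines. The snippet below is only an illustration under stated assumptions: the temperature series is synthetic (a daily cycle plus noise), and the function names are hypothetical rather than taken from any forecasting library.

```python
import numpy as np


def rmse(forecast, truth):
    """Root-mean-square error between two equally shaped arrays."""
    return float(np.sqrt(np.mean((forecast - truth) ** 2)))


def persistence_forecast(last_observation, n_steps):
    """Repeat the most recent observation at every future time step."""
    return np.full(n_steps, last_observation, dtype=float)


def climatology_forecast(history, n_steps):
    """Forecast the mean of the historical record at every future time step."""
    return np.full(n_steps, np.mean(history), dtype=float)


# Synthetic hourly temperature in deg C (assumption: daily cycle plus noise).
rng = np.random.default_rng(0)
hours = np.arange(24 * 30)
series = 12.0 + 5.0 * np.sin(2 * np.pi * hours / 24) + rng.normal(0.0, 0.5, hours.size)

# Hold out the final day as the "truth" that the baselines try to predict.
history, future = series[:-24], series[-24:]
persisted = persistence_forecast(history[-1], future.size)
climatological = climatology_forecast(history, future.size)

print("persistence RMSE:", rmse(persisted, future))
print("climatology RMSE:", rmse(climatological, future))
```

An NWP forecast is generally considered skilful only for as long as its error stays below that of both baselines.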

Figure 4 Met Office HPC [6]

Another limitation of NWPs is the difficulty in simulating the chaotic nature of weather, which causes errors in model initial conditions and model physics approximations to grow exponentially over time. All these limitations, combined with instability of the atmosphere at the lower and upper bound, make rapidly developing events such as flash floods particularly challenging to predict. A further weakness of NWP forecasts is that they rely on expensive High Performance Computing (HPC) facilities (Fig. 4), owned by a handful of industrialised nations, which run coarse-scale global models and high-resolution convection-permitting forecasts on domains covering areas of national interest. As a result, high-resolution prediction of weather hazards and climatological analysis remain off-limits for the vast majority of developing countries, with detrimental effects not just on first-line response to weather hazards, but also on the development of economic activities such as agriculture, fishing, and renewable energy in a warming climate. In the last decade, the cloud computing revolution has led to a tremendous increase in the availability and shareability of weather data sets, which have transitioned from local storage and processing to network-based services managed by large cloud computing companies, such as Amazon, Microsoft or Google, through their distributed infrastructure.

Combined with the wide availability of their cloud computing facilities, access to weather data has become more and more democratic and ubiquitous, and steadily less dependent on HPC facilities owned by national agencies. This transformation is not without drawbacks, should these tech giants decide to turn off the flow of data. During a row with the Australian government in February 2021, Facebook blocked access to Australian news content. Government-related agencies such as the Bureau of Meteorology were also blocked, albeit by accident, leaving citizens with restricted access to important weather information until the pages were restored. It is hoped that with more companies providing distributed infrastructure, access to data that is vital for citizen security will become more resilient.

The rapidly growing accessibility of weather data sets has stimulated the development and application of novel machine learning algorithms. As a result, weather scientists worldwide can crunch multi-dimensional weather data increasingly effectively, ultimately providing a powerful new paradigm to understand and predict the atmospheric flow based on finding relationships within the available large-scale weather datasets.

Machine learning (ML) finds meaningful representations of the patterns between the data through a series of nonlinear transformations of the input data. ML pattern recognition is distinguished into two types: supervised and unsupervised learning.

Figure 5 Feed-forward neural network [6]

Supervised learning is concerned with predicting an output for a given input. It is based on learning the relationship between inputs and outputs, using training data consisting of example input/output pairs, and is divided into regression or classification, depending on whether the output variable to be predicted is continuous or discrete. Support Vector Machine (SVM) or Regression (SVR), Artificial Neural Network (ANN, with the feed-forward step shown in Fig. 5), and Convolutional Neural Network (CNN) are examples of supervised learning.
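To give a flavour of the "series of nonlinear transformations" that an ANN applies, here is a minimal sketch of a feed-forward pass written with NumPy. It assumes ReLU activations on the hidden layer and a linear output, with random untrained weights; the input variables (pressure, temperature, wind speed) and the layer sizes are hypothetical choices made only for illustration.

```python
import numpy as np


def relu(x):
    """Rectified linear unit: the nonlinearity applied to hidden layers."""
    return np.maximum(0.0, x)


def feed_forward(x, layers):
    """Apply each (weights, bias) pair in turn; ReLU on hidden layers, linear output."""
    activation = x
    for index, (weights, bias) in enumerate(layers):
        pre_activation = activation @ weights + bias
        is_output_layer = index == len(layers) - 1
        activation = pre_activation if is_output_layer else relu(pre_activation)
    return activation


# Untrained network with random weights: 3 inputs -> 4 hidden units -> 1 output.
rng = np.random.default_rng(42)
layers = [
    (rng.normal(size=(3, 4)), np.zeros(4)),
    (rng.normal(size=(4, 1)), np.zeros(1)),
]

# Two hypothetical samples: pressure (hPa), temperature (K), wind speed (m/s).
inputs = np.array([[1013.0, 288.0, 5.0], [1002.0, 285.0, 12.0]])
outputs = feed_forward(inputs, layers)
print(outputs.shape)  # (2, 1): one prediction per sample
```

Training would then consist of adjusting the weights and biases so that the outputs match the example input/output pairs, a step this sketch leaves out.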

Unsupervised learning is the task of finding patterns within data without the presence of any ground truth or labelling of the data, with a common unsupervised learning task being clustering (finding groups of data points that are close to one another, relative to data points outside the cluster). A typical example of unsupervised learning is the K-means algorithm; the K-Nearest Neighbour (KNN) algorithm, although also distance-based, requires labelled data and is therefore a supervised method [7].

So far, ML algorithms have been applied to four key problems in weather prediction:  

  1. Correction of systematic error in NWP outputs, which involves post-processing data to remove biases [8]
  2. Assessment of the predictability of NWP outputs, evaluating the uncertainty and confidence scores of ensemble forecasting [9]
  3. Extreme detection, involving prediction of severe weather such as hail, gusts or cyclones [10]
  4. NWP parametrisations, replacing empirical models for radiative transfer or boundary-layer turbulence with ML techniques [11]

The first key problem, which concerns the correction of systematic error in NWPs, is the most popular area of application of ML methods in meteorology. In this field, wind speed and precipitation observational data are often used to perform a linear regression on the NWP data, with the end goal of enhancing its accuracy and resolving local details of the weather which were unresolved by NWP forecasts. Although attractive for its simplicity and robustness, linear regression presents two problems: (1) the least-squares methods used to solve linear regression do not scale well with the size of datasets (since the matrix inversion required by least squares becomes increasingly expensive as the dataset grows), and (2) many relationships between variables of interest are nonlinear. Tree-based classification methods, by contrast, have proven very useful for modelling nonlinear weather events, from thunderstorm and turbulence detection to extreme precipitation events, and for representing the circular nature of the wind. In fact, compared to linear regression, random trees scale easily to large datasets with several input variables. ML methods such as ANN and SVM/R preserve the scalability of tree-based methods to large datasets while also providing a more generic and more powerful alternative for modelling nonlinear processes. These improvements have come at the cost of making it difficult to interpret the underlying physical concepts that the model can identify, which is critical given that scientists need to couple these ML models with physical-equation-based NWP for variable interdependence. Indeed, it has proven challenging to interpret the physical meaning of the weights and nonlinear activation functions through which an ANN describes the data patterns and relationships it has found [12].
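As a rough illustration of the linear-regression approach to bias correction, the sketch below fits observed wind speed as a linear function of the NWP wind speed and then applies that fit to the model output. Everything here is an assumption made for the example: the data are synthetic, the NWP is made to overestimate the observations by a fixed factor and offset, and the function names are hypothetical.

```python
import numpy as np


def fit_linear_correction(nwp_values, observed_values):
    """Least-squares fit of observed = slope * nwp + intercept."""
    design = np.column_stack([nwp_values, np.ones_like(nwp_values)])
    (slope, intercept), *_ = np.linalg.lstsq(design, observed_values, rcond=None)
    return float(slope), float(intercept)


def apply_correction(nwp_values, slope, intercept):
    """Map raw NWP values onto the observation scale using the fitted line."""
    return slope * nwp_values + intercept


# Synthetic wind speeds in m/s (assumption: the NWP runs systematically high).
rng = np.random.default_rng(1)
observed = rng.uniform(2.0, 20.0, size=500)
nwp = 1.25 * observed + 1.5 + rng.normal(0.0, 0.8, size=500)

slope, intercept = fit_linear_correction(nwp, observed)
corrected = apply_correction(nwp, slope, intercept)

raw_bias = float(np.mean(nwp - observed))
corrected_bias = float(np.mean(corrected - observed))
print("mean bias before correction:", raw_bias)
print("mean bias after correction:", corrected_bias)
```

Because the fit includes an intercept, the mean bias on the training data vanishes by construction; in practice the correction would be judged on data held out from the fit. A fit of this kind can only remove errors that are linear in the predictor, which is the second limitation noted above.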

The second key problem, the interpretation of ensemble forecasts, is being addressed by unsupervised ML methods such as clustering, which represents the likelihood of a forecast by aggregating ensemble members by similarity. Examples include grouping of daily weather phenomena into synoptic types, defining weather regimes from upper air flow patterns, and grouping members of forecast ensembles [13].
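The idea of grouping ensemble members by similarity can be sketched with a plain K-means (Lloyd's algorithm) implementation. This is a toy example under stated assumptions: a hypothetical 20-member ensemble, each member summarised by two made-up features (24 h rainfall and peak wind), generated synthetically around two scenarios. The fraction of members in each cluster then serves as a crude likelihood for that scenario.

```python
import numpy as np


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    distances = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return np.argmin(distances, axis=1)


def kmeans(points, n_clusters, n_iterations=50, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    start = rng.choice(len(points), size=n_clusters, replace=False)
    centroids = points[start].copy()
    for _ in range(n_iterations):
        labels = assign(points, centroids)
        for k in range(n_clusters):
            cluster_points = points[labels == k]
            if len(cluster_points) > 0:  # keep the old centroid if a cluster empties
                centroids[k] = cluster_points.mean(axis=0)
    return assign(points, centroids), centroids


# Synthetic ensemble: columns are 24 h rainfall (mm) and peak wind (m/s).
rng = np.random.default_rng(7)
dry = rng.normal([2.0, 8.0], 1.0, size=(14, 2))    # 14 members: dry, calm scenario
wet = rng.normal([25.0, 20.0], 1.0, size=(6, 2))   # 6 members: wet, windy scenario
members = np.vstack([dry, wet])

labels, centroids = kmeans(members, n_clusters=2)
fractions = np.bincount(labels, minlength=2) / len(members)
print("cluster centroids:", centroids)
print("share of members per cluster:", fractions)
```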

In the third key problem, the prediction of weather extremes (weather phenomena which are a hazard to lives and economic activities), ML-based methods tend to underestimate these events. The problem here lies with imbalanced datasets, since extreme events represent only a very small fraction of the total events observed [14].
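A small numerical example shows why imbalance is so troublesome. In the sketch below, which uses made-up numbers (10 extreme events in 1000 cases), a "model" that never predicts an extreme event still scores 99% accuracy while detecting none of the events, so a method tuned for overall accuracy has little incentive to predict extremes at all.

```python
import numpy as np


def accuracy(predicted, actual):
    """Fraction of all cases that were labelled correctly."""
    return float(np.mean(predicted == actual))


def recall(predicted, actual):
    """Fraction of actual events (label 1) that were predicted as events."""
    actual_events = actual == 1
    return float(np.mean(predicted[actual_events] == 1))


# Synthetic labels: 1 marks an extreme event, 0 marks ordinary weather.
actual = np.zeros(1000, dtype=int)
actual[:10] = 1

# A trivial "model" that always predicts ordinary weather.
always_no_event = np.zeros_like(actual)

print(accuracy(always_no_event, actual))  # 0.99
print(recall(always_no_event, actual))    # 0.0
```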

The fourth key problem to which ML is currently being applied is parametrisation. Completely new stochastic ML approaches have been developed, and their effectiveness, along with their simplicity compared to traditional empirical models, has highlighted promising future applications in (moist) convection [15].

Further applications of ML methods are currently limited by intrinsic problems affecting the ML methods in relation to the challenges posed by weather data sets. While the reduction of the dimensionality of the data by ML techniques has proven highly beneficial for image pattern recognition in the context of weather data, it leads to a marked simplification of the input weather data, since it constrains the input space to individual grid cells in space or time [16]. The recent expansion of ANN into deep learning has provided new methodologies that can address these issues. This has pushed further the capability of ML models within the weather forecast domain, with CNNs providing a methodology for extracting complex patterns from large, structured datasets; an example is the CNN model developed by Liu et al. in 2016 [17] to classify atmospheric rivers from climate datasets (atmospheric rivers are an important physical process for the prediction of extreme rainfall events).

Figure 7 Sample images of atmospheric rivers correctly classified (true positive) by the deep CNN model in [17]

At the same time, Recurrent Neural Networks (RNN), developed for natural language processing, are improving nowcasting techniques by exploiting their excellent ability to work with the temporal dimension of data frames. CNN and RNN have now been combined, as illustrated in Fig. 6, providing the first such nowcasting method for precipitation, using radar data frames as input [18].

Figure 6 Encoding-forecasting ConvLSTM network for precipitation nowcasting [18]

While these results show a promising application of ML models to a variety of weather prediction tasks which extend beyond the area of competence of traditional NWPs, such as analysis of ensemble clustering, bias correction, analysis of climate data sets and nowcasting, they also show that ML models are not ready to replace NWP to forecast synoptic-scale and mesoscale weather patterns.

As a matter of fact, NWPs have been developed and improved for over 60 years with the very purpose of simulating accurately and reliably the wind, pressure, temperature and other relevant prognostic fields, so it would be unreasonable to expect ML models to outperform NWPs on such tasks.

It is also true that, as noted earlier, the amount of available data will only grow in the coming decades, so it will be critical as well as strategic to develop ML models capable of extracting patterns and interpreting the relationships within such data sets, complementing NWP capabilities. But how long before an ML model will be capable of replacing an NWP by crunching the entire set of historical observations of the atmosphere, extracting the patterns and the spatial-temporal relationships between the variables, and then performing weather forecasts?

Acknowledgement: I would like to thank my colleagues and friends Brian Lo, James Fallon, and Gabriel M. P. Perez, for reading and providing feedback on this article.


  [3] Buizza, R., Houtekamer, P., Pellerin, G., Toth, Z., Zhu, Y. and Wei, M. (2005). “A comparison of the ECMWF, MSC, and NCEP global ensemble prediction systems”. In: Monthly Weather Review 133, pp. 1076–1097.
  [4] Lean, H. and Clark, P. (2003). “The effects of changing resolution on mesoscale modelling of line convection and slantwise circulations in FASTEX IOP16”. In: Quarterly Journal of the Royal Meteorological Society 129, pp. 2255–2278.
  [7] Bishop, Christopher M. (2006). Pattern Recognition and Machine Learning. Springer.
  [8] Aznarte, J. L. and Siebert, N. (2017). “Dynamic Line Rating Using Numerical Weather Predictions and Machine Learning: A Case Study”. In: IEEE Transactions on Power Delivery 32.1, pp. 335–343. doi: 10.1109/TPWRD.2016.2543818.
  [9] Foley, Aoife M et al. (2012). “Current methods and advances in forecasting of wind power generation”. In: Renewable Energy 37.1, pp. 1–8.
  [10] McGovern, Amy et al. (2017). “Using artificial intelligence to improve real-time decision making for high-impact weather”. In: Bulletin of the American Meteorological Society 98.10, pp. 2073–2090.
  [11] O’Gorman, Paul A and John G Dwyer (2018). “Using machine learning to parameterize moist convection: Potential for modeling of climate, climate change and extreme events”. In: arXiv preprint arXiv:1806.11037.
  [12] Moghim, Sanaz and Rafael L Bras (2017). “Bias correction of climate modeled temperature and precipitation using artificial neural networks”. In: Journal of Hydrometeorology 18.7, pp. 1867–1884.
  [13] Camargo, S. J., Robertson, A. W., Gaffney, S. J., Smyth, P. and Ghil, M. (2007). “Cluster analysis of typhoon tracks. Part I: General properties”. In: Journal of Climate 20.14, pp. 3635–3653.
  [14] Ahijevych, David et al. (2009). “Application of spatial verification methods to idealized and NWP-gridded precipitation forecasts”. In: Weather and Forecasting 24.6, pp. 1485–1497.
  [15] Berner, Judith et al. (2017). “Stochastic parameterization: Toward a new view of weather and climate models”. In: Bulletin of the American Meteorological Society 98.3, pp. 565–588.
  [16] Fan, Wei and Albert Bifet (2013). “Mining big data: current status, and forecast to the future”. In: ACM SIGKDD Explorations Newsletter 14.2, pp. 1–5.
  [17] Liu, Yunjie et al. (2016). “Application of deep convolutional neural networks for detecting extreme weather in climate datasets”. In: arXiv preprint arXiv:1605.01156.
  [18] Shi, Xingjian et al. (2015). “Convolutional LSTM network: A machine learning approach for precipitation nowcasting”. In: Advances in Neural Information Processing Systems, pp. 802–810.
