Support vector machine for classification of space weather

Carl Haines –

In a recent blog post, I discussed the use of the analogue ensemble (AnEn), or “similar-day”, approach to forecasting geomagnetic activity. In this post I look at applying support vector machines (SVMs), a machine-learning approach, to the same problem and compare the performance of the SVM to that of the AnEn. An implementation of the SVM has been developed for this project in Python and is available at

Space weather encompasses a range of impacts on Earth caused by changes in the near-Earth plasma due to variable activity at the Sun. These impacts include damage to satellites, interruptions to radio communications and damage to power grids. For this reason, it is useful to forecast the occurrence, intensity and duration of heightened geomagnetic activity, which we call geomagnetic storms.

As in the previous post, the measure of geomagnetic activity used is the aaH index, developed by Mike Lockwood in Reading. The aaH index gives a global measure of geomagnetic activity at 3-hour resolution. Its minimum value is zero, and larger values represent more intense geomagnetic activity.

The SVM is a commonly used classification algorithm, which we implement here to classify whether or not a storm will occur. Given a sample of inputs and the associated class labels, the SVM finds a function that separates the input features by class. This is simple if the classes are linearly separable, as the separating function is then a hyperplane. The samples lying closest to the hyperplane are called support vectors, and the SVM maximises the distance between these samples and the hyperplane.
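The idea can be illustrated with a minimal sketch, assuming scikit-learn is available; the toy 2-D data and labels below are illustrative, not the aaH inputs used in this work.

```python
# A linear SVM fitted to two linearly separable clusters: the separating
# function is a hyperplane, and the samples nearest to it are the
# support vectors.
import numpy as np
from sklearn.svm import SVC

# Two linearly separable clusters in a 2-D feature space
X = np.array([[0.0, 0.0], [0.2, 0.1], [0.1, 0.3],
              [2.0, 2.0], [2.2, 1.9], [1.9, 2.1]])
y = np.array([0, 0, 0, 1, 1, 1])

clf = SVC(kernel="linear")
clf.fit(X, y)

# The samples closest to the separating hyperplane
print(clf.support_vectors_)
print(clf.predict([[0.1, 0.2], [2.1, 2.0]]))  # -> [0 1]
```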

Figure 1 – A diagram explaining the kernel trick used by SVMs. This figure has been adapted from (Towards data science, n.d.)

Typically, the samples are not linearly separable, so we appeal to Cover’s theorem, which states that a linearly inseparable classification problem is more likely to become linearly separable when cast non-linearly into a higher-dimensional space. We therefore use a kernel trick to map the inputs into a higher-dimensional feature space, as depicted in Figure 1, making them more separable. Further explanation is available in (Towards data science, n.d.)

Based on aaH values in the 24-hour training window, the SVM predicts whether the next 3 hours will be a storm or not. By comparing this dichotomous hindcast with the observed aaH, the outcome will be one of True Positive (TP, where a storm is correctly predicted), True Negative (TN, where no storm is correctly predicted), False Positive (FP, where a storm is predicted but not observed), or False Negative (FN, where a storm is not predicted but is observed). This is shown in the form of a contingency table in Figure 2.
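Counting the four contingency-table outcomes is straightforward; in this sketch the predicted and observed arrays are illustrative stand-ins for the hindcast and the aaH-derived storm labels.

```python
# Tallying TP/TN/FP/FN for a dichotomous (storm / no-storm) hindcast.
import numpy as np

predicted = np.array([1, 0, 1, 0, 1, 0])  # 1 = storm predicted
observed  = np.array([1, 0, 0, 0, 1, 1])  # 1 = storm observed

TP = int(np.sum((predicted == 1) & (observed == 1)))  # hits
TN = int(np.sum((predicted == 0) & (observed == 0)))  # correct rejections
FP = int(np.sum((predicted == 1) & (observed == 0)))  # false alarms
FN = int(np.sum((predicted == 0) & (observed == 1)))  # misses

print(TP, TN, FP, FN)  # -> 2 2 1 1
```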

Figure 2 – Contingency table for the SVM classifying geomagnetic activity into the “no-storm” and “storm” classes. The results have been normalised across the true label.

For development of the SVM, the aaH data have been separated into independent training and test intervals, chosen to be alternate years. A year is longer than the auto-correlation timescale of the data (choosing, e.g., alternate 3-hourly data points would not generate independent training and test sets) but short enough that we assume there will be no significant aliasing with solar-cycle variations.
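An alternate-year split can be sketched as below; the `years` array is an illustrative stand-in for the year of each 3-hourly aaH sample.

```python
# Splitting a timeseries into alternate-year training and test sets,
# so that the two sets are separated by more than the auto-correlation
# timescale of the data.
import numpy as np

years = np.array([2000, 2000, 2001, 2001, 2002, 2002, 2003, 2003])
values = np.arange(len(years))   # stand-in for the aaH samples

train_mask = (years % 2 == 0)    # even years for training
test_mask = ~train_mask          # odd years held out for testing

print(values[train_mask])  # -> [0 1 4 5]
print(values[test_mask])   # -> [2 3 6 7]
```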

Training is an iterative process in which a cost function is minimised. While training, the SVM attempts to classify labelled data, i.e. data belonging to a known category, in this case “storm” or “no storm”, on the basis of the previous 24 hours of aaH. If the SVM makes an incorrect prediction, it is penalised through the cost function, which combines the relative proportions of TP, TN, FP and FN. The cost parameter determines the degree to which the SVM is penalised for a misclassification in training, which allows for noise in the data.

Data with a class imbalance, that is, with many more samples from one class than the other, commonly cause the classifier to be biased towards the majority class. In this case, there are far more non-storm intervals than storm intervals. Following (McGranaghan, Mannucci, Wilson, Mattman, & Chadwick, 2018), we define the cost of misclassifying each class separately, through the weight ratio Wstorm : Wno storm. Increasing Wstorm increases the frequency at which the SVM predicts a storm and, correspondingly, reduces the frequency at which it predicts no storm. In this work we varied Wstorm and kept Wno storm constant at 1. A user of the SVM method for forecasting may wish to tune the class-weight ratio to give an appropriate balance of false-alarm rate and hit rate for their needs.
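In scikit-learn this per-class weighting is exposed through the `class_weight` argument; the sketch below uses illustrative imbalanced clusters rather than the aaH features, with Wstorm raised while Wno storm stays at 1.

```python
# Handling class imbalance by weighting the misclassification cost of
# the minority ("storm") class more heavily than the majority class.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X_quiet = rng.normal(0.0, 1.0, size=(50, 2))   # many "no storm" samples
X_storm = rng.normal(3.0, 1.0, size=(5, 2))    # few "storm" samples
X = np.vstack([X_quiet, X_storm])
y = np.array([0] * 50 + [1] * 5)

W_storm = 10.0  # raising this makes the SVM predict "storm" more often
clf = SVC(kernel="rbf", class_weight={0: 1.0, 1: W_storm})
clf.fit(X, y)

print(clf.predict([[3.0, 3.0]]))  # a point near the storm cluster
```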

Different forecast applications will have different tolerances for false alarms and missed events. To accommodate this, and as a further comparison of the hindcasts, we use a cost/loss analysis. In short, C is the economic cost of taking mitigating action when an event is predicted (whether or not it actually occurs) and L is the economic loss suffered due to damage if no mitigating action is taken when it was needed. For a deterministic method such as the SVM, each time a storm is predicted a cost C is incurred. Each time a storm occurs without being predicted a loss L is incurred. If no storm is predicted and no storm occurs, then no expense is incurred. Over some time interval, the total expense can be computed by summing C and L. Further information, including the formula for computing the potential economic value (PEV), can be found in (Owens & Riley, 2017).
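The total-expense bookkeeping described above can be written as a one-line helper; the counts and C/L values here are illustrative (the PEV formula itself is given in Owens & Riley, 2017).

```python
# Total expense of a deterministic hindcast over an interval: each storm
# prediction (hit or false alarm) costs C, each missed storm loses L,
# and correct "no storm" intervals cost nothing.
def total_expense(n_predicted, n_missed, C, L):
    """Expense = C per mitigating action taken + L per missed storm."""
    return n_predicted * C + n_missed * L

# e.g. 10 storm predictions and 2 missed storms, with C = 1 and L = 5
print(total_expense(n_predicted=10, n_missed=2, C=1.0, L=5.0))  # -> 20.0
```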

A particular forecast application will have a C/L ratio in the domain (0,1). This is because a C/L of 0 would mean it is most cost effective to take constant mitigating action and a C/L of 1 or more means that mitigating action is never cost effective. In either case, no forecast would be helpful. The power of a Cost/Loss analysis is that it allows us to evaluate our methods for the entire range of potential forecast end users without specific knowledge of the forecast application requirements. End users can then easily interpret whether our methods fit their situation.  

Figure 3 – A cost loss analysis comparing the potential economic value of a range of SVMs to the AnEn. 

Figure 3 shows the potential economic value (PEV) of the SVM with a range of class weights (CW), the probabilistic AnEn and 27-day recurrence. The shaded regions indicate which hindcast has the highest PEV for that cost/loss ratio. The probabilistic AnEn has the highest PEV over the majority of the cost/loss domain, although the SVM has higher PEV at lower cost/loss ratios. This highlights that the “best” hindcast depends on the context in which it is to be employed.

In summary, we have implemented an SVM for the classification of geomagnetic storms and compared the performance to that of the AnEn which was discussed in a previous blog post. The SVM and AnEn generally perform similarly in a Cost/Loss analysis and the best method will depend on the requirements of the end user. The code for the SVM is available at and the AnEn at


Burges, C. (1998). A tutorial on support vector machines for pattern recognition. Data Mining and Knowledge Discovery

McGranaghan, R., Mannucci, A., Wilson, B., Mattman, C., & Chadwick, R. (2018). New Capabilities for Prediction of High-Latitude Ionospheric Scintillation: A Novel Approach With Machine Learning. Space Weather

Owens, M., & Riley, P. (2017). Probabilistic Solar Wind Forecasting Using Large Ensembles of Near-Sun Conditions With a Simple One-Dimensional “Upwind” Scheme. Space Weather

Towards data science. (n.d.). Retrieved from 

Forecasting space weather using “similar day” approach

Carl Haines –

Space weather is a natural threat that requires good-quality forecasting with as much lead time as possible. In this post I outline the simple and understandable analogue ensemble (AnEn), or “similar-day”, approach to forecasting. I focus mainly on exploring the method itself and, although this work forecasts space weather through a timeseries of ground-level observations, AnEn can be applied to many prediction tasks, particularly timeseries with strong auto-correlation. AnEn has previously been used to predict wind speed [1], temperature [1] and the solar wind [2]. The code for AnEn is available at should you wish to try out the method for your own application.

The idea behind AnEn is to take a set of recent observations, look back in a historic dataset for analogous periods, then take what happened following those analogous periods as the forecast. If multiple analogous periods are used, then an ensemble of forecasts can be created giving a distribution of possible outcomes with probabilistic information. 
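The core of the method can be sketched in a few lines; the `history` and `recent` arrays below are illustrative, and the similarity measure (unweighted squared error) is a simplification of the weighted matching described later.

```python
# A minimal analogue ensemble: find the k historic windows most similar
# to the latest observations and return what followed each of them as an
# ensemble of forecasts.
import numpy as np

def analogue_ensemble(history, recent, k=3, lead=1):
    """Return k forecast sequences of length `lead`, taken from the
    periods following the k historic windows closest to `recent`."""
    w = len(recent)
    # candidate start indices leaving room for the window plus the forecast
    starts = range(len(history) - w - lead + 1)
    errors = [np.sum((history[s:s + w] - recent) ** 2) for s in starts]
    best = np.argsort(errors, kind="stable")[:k]
    return np.array([history[s + w:s + w + lead] for s in best])

history = np.array([1.0, 2.0, 3.0, 1.0, 2.0, 4.0, 1.0, 2.0, 3.5, 0.5])
recent = np.array([1.0, 2.0])          # the latest observations
ensemble = analogue_ensemble(history, recent, k=2, lead=1)
print(ensemble)  # what followed the two closest analogues
```

Each row of `ensemble` is one member; the median across members (or any percentile) gives a deterministic forecast, while the spread gives probabilistic information.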

Figure 1 – An example of AnEn applied to a space weather event with forecast time t0. The black line shows the observations, the grey line shows the ensemble members, the red line shows the median of the ensemble and the yellow and green lines are reference forecasts. 

Figure 1 is an example of a forecast made using the AnEn method where the forecast is made at t0. The 24 hours of observations (black) prior to t0 are matched to similar periods in the historic dataset (grey). Here I have chosen to give the most recent observations the most weighting, as they hold the most relevant information. The grey analogue lines then flow on after t0, forming the forecast. Combined, these form an ensemble, and the median of these is shown in red. The forecast can be chosen to be the median (or any percentile) of the ensemble, or a probability of an event occurring can be given by counting how many of the ensemble members do/don’t experience the event.

Figure 1 also shows two reference forecasts, namely 27-day recurrence and climatology, as benchmarks to beat. 27-day recurrence uses the observation from 27 days ago as the forecast for today. This is reasonable because, as seen from Earth, the Sun rotates once every 27 days, so broadly speaking the same part of the Sun is emitting the relevant solar wind.

To quantify how well AnEn works as a forecast, I ran it over the entire dataset by repeatedly changing the forecast time t0 and applied two metrics, namely mean absolute error (MAE) and skill, to the median of the ensemble members. MAE is the mean of the absolute differences between the forecast made by AnEn (taken as the median of the ensemble) and what was actually observed, averaged over all forecasts, giving one value for each lead time. Figure 2 shows the MAE for the AnEn median and the reference forecasts. We see that AnEn has the smallest (best) MAE at short lead times and outperforms the reference forecasts at all lead times up to a week.
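Computing MAE per lead time amounts to averaging absolute errors over the forecast axis; the forecast and observation arrays below are illustrative.

```python
# Mean absolute error as a function of lead time, averaged over a set of
# forecasts made at different forecast times t0.
import numpy as np

# shape (n_forecasts, n_lead_times)
forecast = np.array([[10.0, 12.0], [20.0, 18.0], [15.0, 15.0]])
observed = np.array([[11.0, 10.0], [19.0, 20.0], [15.0, 14.0]])

mae_per_lead = np.mean(np.abs(forecast - observed), axis=0)
print(mae_per_lead)  # one MAE value per lead time
```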

Figure 2 – The mean absolute error of the AnEn median and reference forecasts.

An error metric such as MAE cannot take into account that certain conditions, such as storm times, are inherently more difficult to forecast. For this we can use a skill metric defined by

\text{Skill} = 1 - \frac{\text{Forecast error}}{\text{Reference error}}

where in this case we use climatology as the reference forecast. Skill can take any value from -\infty to 1: a perfect forecast receives a value of 1, while an unskilful forecast receives a value of 0. A negative value of skill signifies that the forecast is worse than the reference forecast.
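The definition above translates directly into code; the error values here are illustrative.

```python
# Skill of a forecast relative to a reference forecast (here climatology):
# 1 = perfect, 0 = no better than the reference, negative = worse.
def skill(forecast_error, reference_error):
    return 1.0 - forecast_error / reference_error

print(skill(forecast_error=2.0, reference_error=4.0))  # -> 0.5
print(skill(forecast_error=4.0, reference_error=4.0))  # -> 0.0
print(skill(forecast_error=6.0, reference_error=4.0))  # -> -0.5
```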

Figure 3 shows the skill of AnEn and 27-day recurrence with respect to climatology. We see that AnEn is most skilful for short lead times and outperforms 27-day recurrence for all lead times considered.  

Figure 3 – The skill of the AnEn median and 27-day recurrence with respect to climatology.

In summary, the analogue ensemble forecast method matches current conditions to analogous historical periods and uses the timeseries that followed them as the prediction. AnEn performs well for this application and outperforms the reference forecasts of climatology and 27-day recurrence. The code for AnEn is available at

The work presented here makes up a part of a paper that is under review in the journal Space Weather.

Here, AnEn has been applied to a dataset from the space weather domain. If you would like to find out more about space weather, then take a look at these previous blog posts from Shannon Jones ( and me (

[1] Delle Monache, L., Eckel, F. A., Rife, D. L., Nagarajan, B., & Searight, K. (2013). Probabilistic Weather Prediction with an Analog Ensemble. doi: 10.1175/mwr-d-12-00281.1

[2] Owens, M. J., Riley, P., & Horbury, T. S. (2017a). Probabilistic Solar Wind and Geomagnetic Forecasting Using an Analogue Ensemble or “Similar Day” Approach. doi: 10.1007/s11207-017-1090-7

The Variation of Geomagnetic Storm Duration with Intensity


Haines, C., M. J. Owens, L. Barnard, M. Lockwood, and A. Ruffenach, 2019: The Variation of Geomagnetic Storm Duration with Intensity. Solar Physics, 294,

Variability in the near-Earth solar wind conditions can adversely affect a number of ground- and space-based technologies. Some of these space weather impacts on ground infrastructure are expected to increase primarily with geomagnetic storm intensity, but also with storm duration, through time-integrated effects. Forecasting storm duration is also necessary for scheduling the resumption of safe operation of affected infrastructure. It is therefore important to understand the degree to which storm intensity and duration are related.

In this study, we use the recently re-calibrated aa index, aaH, to analyse the relationship between geomagnetic storm intensity and storm duration over the past 150 years, further adding to our understanding of the climatology of geomagnetic activity. In particular, we construct and test a simple probabilistic forecast of storm duration based on storm intensity.

Using a peak-above-threshold approach to defining storms, we observe that more intense storms do indeed last longer but with a non-linear relationship (Figure 1).

Figure 1: The mean duration (red) and number of storms (blue) plotted as a function of storm intensity.

Next, we analysed the distribution of storm durations in eight different classes of storms dependent on the peak intensity of the storm. We found them to be approximately lognormal with parameters depending on the storm intensity. A lognormal distribution is defined by the mean of the logarithm of the values, μ, and the standard deviation of the logarithm of the values, σ. These parameters were found from the observed durations in each intensity class through Maximum Likelihood Estimation (MLE) and used to create a lognormal distribution, plotted in Figure 2 in dark purple. The light purple distribution shows a histogram of the observed data as an estimate of the probability density function (PDF). By eye, the lognormal distribution provides a reasonable first-order match at all intensity thresholds.
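The MLE for a lognormal distribution reduces to taking the mean and standard deviation of the log-durations within each intensity class; the durations below are illustrative, not the aaH-derived values.

```python
# Maximum Likelihood Estimation of lognormal parameters for storm
# duration within one intensity class: mu and sigma are the mean and
# standard deviation of the logarithm of the durations.
import numpy as np

durations = np.array([6.0, 9.0, 12.0, 18.0, 24.0, 36.0])  # hours
log_d = np.log(durations)

mu = log_d.mean()        # MLE of the lognormal mu
sigma = log_d.std()      # MLE of sigma (ML form, i.e. ddof=0)
print(round(mu, 3), round(sigma, 3))
```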

Figure 2: The distribution of duration for storms with a peak between 150 and 190nT.

On this basis we created a method to probabilistically predict storm duration given peak intensity. For each of the peak intensity classes, we have calculated the values of μ and σ for the lognormal fits to the duration distributions shown as the black points in Figure 3. It is clear from the points in the left panel of Figure 3 that μ increases as intensity increases, agreeing with the previous results in Figure 1 (i.e., duration increases as intensity increases).

The parameter μ can be approximated as a function of storm intensity by:

μ(intensity) = A ln(B · intensity − C)

where A, B and C are free parameters. A least-squares fit gave A, B and C as 0.455, 4.632 and 283.143 respectively, and this curve is plotted, along with uncertainty bars, in Figure 3 (left). Although the fit is based on weighted bin-centres of storm intensity, the equation can be used to interpolate for any given value of intensity. Similarly, σ can be approximated by a linear function of peak intensity. Figure 3 (right) shows the best-fit line, which has a shallow gradient of −5.08×10−4 and y-intercept at 0.659.
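Using the fitted coefficients quoted above, the two parameter functions can be written directly; the example intensity of 170 nT is illustrative, chosen to sit within the 150 to 190 nT class of Figure 2.

```python
# The fitted parameter functions: mu from the log fit with A = 0.455,
# B = 4.632, C = 283.143, and sigma from the linear fit with gradient
# -5.08e-4 and y-intercept 0.659.
import math

def mu_of_intensity(intensity):
    return 0.455 * math.log(4.632 * intensity - 283.143)

def sigma_of_intensity(intensity):
    return -5.08e-4 * intensity + 0.659

i = 170.0  # nT, an illustrative storm peak intensity
print(round(mu_of_intensity(i), 3), round(sigma_of_intensity(i), 3))
```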

Figure 3: (Left) The mean of the log-space as a function of intensity. (Right) The standard deviation of the log-space as a function of intensity.

These equations can be used to find the lognormal parameters as a function of storm peak intensity. From these, a distribution of duration can be created, and hence a probabilistic estimate of the duration of a given storm is available. This can be used to predict the probability that a storm will last at least, e.g., 24 hours. Figure 4 shows the output of the model for a range of storm peak intensities compared against a test set of the aaH index. The model has good agreement with the observations and provides a robust method for estimating geomagnetic storm duration.
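The probability that a storm lasts at least a given time follows from the lognormal survival function; the mu and sigma values below are illustrative round numbers, not the fitted outputs.

```python
# P(D >= t) for a lognormal duration distribution, via the survival
# function 1 - Phi((ln t - mu) / sigma) = 0.5 * erfc(z / sqrt(2)).
import math

def prob_duration_at_least(t, mu, sigma):
    z = (math.log(t) - mu) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# e.g. probability of lasting at least 24 hours for mu = 2.83, sigma = 0.57
print(round(prob_duration_at_least(24.0, 2.83, 0.57), 3))
```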

The results demonstrate significant advancements in not only understanding the properties and structure of storms, but also how we can predict and forecast these dynamic and hazardous events.

For more information, please see the open-access paper.

Figure 4: The probability that a storm will last at least 24 hours plotted as a function of storm intensity. The black line shows the observed probability and the red line shows the model output.