Diagnosing solar wind forecast errors

Harriet Turner – h.turner3@pgr.reading.ac.uk

The solar wind is a continual outflow of charged particles from the Sun, with speeds ranging from 250 to 800 km s⁻¹. During the first six months of my PhD, I have been investigating the errors in a type of solar wind forecast that uses spacecraft observations, known as a corotation forecast. This was the topic of my first paper, where I focussed on extracting the forecast error that arises from a latitudinal separation between the observing spacecraft and the forecast location. I found that up to a latitudinal separation of 6 degrees, the error contribution was approximately constant; above 6 degrees, the error contribution increases with increasing latitudinal separation. In this blog post I will explain the importance of forecasting the solar wind and the principle behind corotation forecasts. I will also explain how this work has wider implications for future space missions and solar wind forecasting.

The term “space weather” refers to the changing conditions in near-Earth space. Extreme space weather events can cause a range of effects on Earth, such as damaging power grids, disrupting communications, knocking out satellites and harming the health of humans in space or on high-altitude flights (Cannon, 2013). These effects are summarised in Figure 1. It is therefore important to forecast space weather accurately to help mitigate these effects. Knowledge of the background solar wind is an important aspect of space weather forecasting as it modulates the severity of extreme events. This can be achieved through three-dimensional computer simulations or through simpler methods, such as the corotation forecasts discussed below.

Figure 1. Cosmic rays, solar energetic particles, solar flare radiation, coronal mass ejections and energetic radiation belt particles cause space weather. Subsequently, this produces a number of effects on Earth. Source: ESA.

Solar wind flow is mostly radial away from the Sun; however, the fast/slow structure of the solar wind rotates round with the Sun. If you were looking down on the ecliptic plane (the plane in which the planets lie, roughly aligned with the Sun’s equator), you would see a spiral pattern of fast and slow solar wind, as in Figure 2. This pattern makes a full rotation in approximately 27 days. As it rotates around, an observation made at one point on this plane can be used as a forecast for a point further on in the rotation, assuming a steady-state solar wind (i.e., one that does not evolve in time). For example, in Figure 2, an observation from the spacecraft represented by the red square could be used as a forecast at Earth (blue circle) some time later. This lead time depends on the longitudinal separation between the two points, as this determines the time it takes for the Sun to rotate through that angle.
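The lead-time calculation described above can be sketched in a few lines of Python. This is my own illustrative code, not from the paper; the function name is made up, and the ~27.27-day synodic (Carrington) rotation period is the standard value for the Sun's rotation as seen from Earth.

```python
# Minimal sketch: the lead time of a corotation forecast is simply the time
# the Sun takes to rotate through the longitudinal separation between the
# observing spacecraft and the forecast location.

SOLAR_ROTATION_DAYS = 27.27  # synodic (Carrington) rotation period, ~27 days


def corotation_lead_time(lon_sep_deg):
    """Days for the Sun to rotate through a given longitudinal separation."""
    return SOLAR_ROTATION_DAYS * (lon_sep_deg % 360) / 360.0


# A spacecraft 60 degrees behind Earth in longitude gives ~4.5-day forecasts:
print(corotation_lead_time(60))
```

For a spacecraft 60 degrees behind Earth (the Lagrange configuration discussed later in this post), this works out to a lead time of roughly four and a half days.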

Figure 2. The spiral structure of the solar wind, which rotates anticlockwise. Here, STA and STB are the STEREO-A and STEREO-B spacecraft respectively. The solar wind shown here is the radial component. Source: HUXt model (Owens et al, 2020).

In my recent paper I have been investigating how the corotation forecast error varies with the latitudinal separation of the observation and forecast points. Latitudinal separation varies throughout the year, and it was theorised that it should have a significant impact on the accuracy of corotation forecasts. I used the two spacecraft from the STEREO mission, which are in nearly the same orbital plane as Earth, together with a near-Earth dataset. This allowed for six different configurations in which to compute corotation forecasts, with a maximum latitudinal separation of 14 degrees. I analysed the 18-month period from August 2009 to February 2011 to help eliminate other confounding variables. Figure 3 shows the relationship between forecast error and latitudinal separation. Up to approximately 6 degrees, there is no significant relationship between error and latitudinal separation. Above this, however, the error increases approximately linearly with the latitudinal separation.

Figure 3. Variation of forecast error with the latitudinal separation between the spacecraft making the observation and the forecast location. Error bars span one standard error on the mean.

This work has implications for the future Lagrange space weather monitoring mission, due for launch in 2027. The Lagrange spacecraft will be stationed in a gravitational null, 60 degrees in longitude behind Earth on the ecliptic plane. Gravitational nulls occur where the gravitational pulls of two massive bodies, together with the centrifugal force of the co-rotating frame, balance out. There are five of these nulls, called the Lagrange points, and locating a spacecraft at one reduces the amount of fuel needed to stay in position. The goal of the Lagrange mission is to provide a side-on view of the Sun–Earth line, but it also presents an opportunity for consistent corotation forecasts to be generated at Earth. However, the Lagrange spacecraft will oscillate in latitude relative to Earth, up to a maximum of about 5 degrees. My results indicate that, at such separations, the error contribution from latitudinal separation would be approximately constant.

The next steps are to use this information to help improve the performance of solar wind data assimilation. Data assimilation (DA) has led to large improvements in terrestrial weather forecasting and is beginning to be used in space weather forecasting. DA combines observations and model output to find an optimum estimation of reality. The latitudinal information found here can be used to inform the DA scheme how to better handle the observations and to, hopefully, produce an improved solar wind representation.

The work I have discussed here has been accepted by AGU’s journal Space Weather and is available at https://agupubs.onlinelibrary.wiley.com/doi/epdf/10.1029/2021SW002802.

References

Cannon, P.S., 2013. Extreme space weather – A report published by the UK royal academy of engineering. Space Weather, 11(4), 138-139.  https://agupubs.onlinelibrary.wiley.com/doi/full/10.1002/swe.20032

ESA, 2018. https://www.esa.int/ESA_Multimedia/Images/2018/01/Space_weather_effects 

Owens, M.J., Lang, M.S., Barnard, L., Riley, P., Ben-Nun, M., Scott, C.J., Lockwood, M., Reiss, M.A., Arge, C.N. & Gonzi, S., 2020. A Computationally Efficient, Time-Dependent Model of the Solar Wind for use as a Surrogate to Three-Dimensional Numerical Magnetohydrodynamic Simulations. Solar Physics, 295(3), https://doi.org/10.1007/s11207-020-01605-3

Connecting Global to Local Hydrological Modelling Forecasting – Virtual Workshop

Gwyneth Matthews g.r.matthews@pgr.reading.ac.uk
Helen Hooker h.hooker@pgr.reading.ac.uk 

ECMWF – CEMS – C3S – HEPEX – GFP 

What was it? 

The workshop was organised under the umbrella of ECMWF, the Copernicus services CEMS and C3S, the Hydrological Ensemble Prediction EXperiment (HEPEX) and the Global Flood Partnership (GFP). The workshop lasted three days, with a keynote speaker followed by a Q&A at the start of each of the six sessions. Each keynote talk focused on a different part of the forecast chain, from hybrid hydrological forecasting to the use of forecasts for anticipatory humanitarian action, and how the global and local hydrological scales could be linked. Each keynote was followed by speedy poster pitches from presenters around the world, with poster presentations and discussion in a virtual ECMWF (Gather.town).  

Figure 1: Gather.town was used for the poster sessions and was set up to look like the ECMWF site in Reading, complete with a Weather Room and rubber ducks. 

What was your poster about? 

Gwyneth – I presented Evaluating the post-processing of the European Flood Awareness System’s medium-range streamflow forecasts in Session 2 – Catchment-scale hydrometeorological forecasting: from short-range to medium-range. My poster showed the results of the recent evaluation of the post-processing method used in the European Flood Awareness System. Post-processing is used to correct errors and account for uncertainties in the forecasts and is a vital component of a flood forecasting system. By comparing the post-processed forecasts with observations, I was able to identify where the forecasts were most improved.  

Helen – I presented An evaluation of ensemble forecast flood map spatial skill in Session 3 – Monitoring, modelling and forecasting for flood risk, flash floods, inundation and impact assessments. The ensemble approach to forecasting flooding extent and depth is ideal due to the highly uncertain nature of extreme flooding events. The flood maps are linked directly to probabilistic population impacts to enable timely, targeted release of funding. The Flood Foresight System’s forecast flood inundation maps are evaluated by comparison with satellite-based SAR-derived flood maps so that the spatial skill of the ensemble can be determined.  

Figure 2: Gwyneth (left) and Helen (right) presenting their posters shown below in the 2-minute pitches. 

What did you find most interesting at the workshop? 

Gwyneth – All the posters! Every session had a wide range of topics being presented and I really enjoyed talking to people about their work. The keynote talks at the beginning of each session were really interesting and thought-provoking. I especially liked the talk by Dr Wendy Parker about a fitness-for-purpose approach to evaluation, which incorporates into the evaluation how the forecasts are used and who is using them.  

Helen – Lots! All of the keynote talks were excellent and inspiring. The latest developments in detecting flooding from satellites include processing the data using machine learning algorithms directly onboard, before beaming the flood map back to Earth! If openly available and accessible (this came up quite a bit), this will potentially rapidly decrease the time it takes for flood maps to reach both flood risk managers dealing with the incident and for use in improving flood forecasting models. 

How was your virtual poster presentation/discussion session? 

Gwyneth – It was nerve-racking to give the mini-pitch to 200+ people, but the poster session in Gather.town was great! The questions and comments I got were helpful, but it was nice to have conversations on non-research-based topics and to meet some of the EC-HEPEXers (early career members of the Hydrological Ensemble Prediction Experiment). The sessions felt more natural than a lot of the virtual conferences I have been to.  

Helen – I really enjoyed choosing my hairdo and outfit for my mini self. I’ve not actually experienced a ‘real’ conference/workshop, but compared to other virtual events this felt quite realistic. I really enjoyed the Gather.town setting, especially the duck pond (although the ducks couldn’t swim or quack!). It was great to have the chance to talk about my work and meet a few people; some thought-provoking questions are always useful.  

Helicopter Underwater Escape Training for Arctic Field Campaign

Hannah Croad h.croad@pgr.reading.ac.uk

The focus of my PhD project is investigating the physical mechanisms behind the growth and evolution of summer-time Arctic cyclones, including the interaction between cyclones and sea ice. The rapid decline of Arctic sea ice extent is allowing human activity (e.g. shipping) to expand into the summer-time Arctic, where it will be exposed to the risks of Arctic weather. Arctic cyclones produce some of the most impactful Arctic weather, associated with strong winds and atmospheric forcings that have large impacts on the sea ice. Hence, there is a demand for improved forecasts, which can be achieved through a better understanding of Arctic cyclone mechanisms. 

My PhD project is closely linked with a NERC project (Arctic Summer-time Cyclones: Dynamics and Sea-ice Interaction), with an associated field campaign. Whereas my PhD project is focused on Arctic cyclone mechanisms, the primary aims of the NERC project are to understand the influence of sea ice conditions on summer-time Arctic cyclone development, and the interaction of cyclones with the summer-time Arctic environment. The field campaign, originally planned for August 2021 based in Svalbard in the Norwegian Arctic, has now been postponed to August 2022 (due to ongoing restrictions on international travel and associated risks for research operations due to the evolving Covid pandemic). The field campaign will use the British Antarctic Survey’s low-flying Twin Otter aircraft, equipped with infrared and lidar instruments, to take measurements of near-surface fluxes of momentum, heat and moisture associated with cyclones over sea ice and the neighbouring ocean. These simultaneous observations of turbulent fluxes in the atmospheric boundary layer and sea ice characteristics, in the vicinity of Arctic cyclones, are needed to improve the representation of turbulent exchange over sea ice in numerical weather prediction models. 

Those wishing to fly onboard the Twin Otter research aircraft are required to do Helicopter Underwater Escape Training (HUET). Most of the participants on the course travel to and from offshore facilities, as the course is compulsory for all passengers on helicopter flights to rigs. In the unlikely event that a helicopter must ditch on the ocean, although the aircraft has buoyancy aids, capsize is likely because the engine and rotors make the aircraft top heavy. I was apprehensive about doing the training, as having to escape from a submerged aircraft is not exactly my idea of fun. However, I realised that being able to fly on the research aircraft in the Arctic is a unique opportunity, so I was willing to take on the challenge! 

The HUET course is provided by the Petans training facility in Norwich. John Methven, Ben Harvey, and I drove to Norwich the night before, in preparation for an early start the next day. We spent the morning in the classroom, covering helicopter escape procedures and what we should expect for the practical session in the afternoon. We would have to escape from a simulator recreating a crash landing on water. The simulator replicates a helicopter fuselage, with seats and windows, attached to the end of a mechanical arm for controlled submersion and rotation. The procedure is (i) prepare for emergency landing: check seatbelt is pulled tight, headgear is on, and that all loose objects are tucked away, (ii) assume the brace position on impact, and (iii) keep one hand on the window exit and the other on your seatbelt buckle. Once submerged, undo your seatbelt and escape through the window. After a nervy lunch, it was time to put this into practice. 

The aircraft simulator being submerged in the pool (Source: Petans promotional video)

The practical part of the course took place in a pool (the temperature resembled lukewarm bath water, much warmer than the North Atlantic!). We were kitted up with two sets of overalls over our swimming costumes, shoes, helmets, and jackets containing a buoyancy aid. We then began the training in the aircraft simulator. Climb into the aircraft and strap yourself into a seat. The seatbelt had to be pulled tight, and was released by rotating the central buckle. On the pilot’s command, prepare for emergency landing. Assume the brace position, and the aircraft drops into the water. Hold on to the window and your seatbelt buckle, and as the water reaches your chest, take a deep breath. Wait for the cabin to completely fill with water and stop moving – only then undo your seatbelt and get out! 

The practical session consisted of three parts. In the first exercise, the aircraft was submerged, and you had to escape through the window. The second exercise was similar, except that panes were fitted on the windows, which you had to push out before escaping. In the final exercise, the aircraft was submerged and rotated 180 degrees, so you ended up upside down (and with plenty of water up your nose), which was very disorientating! Each exercise required you to hold your breath for roughly 10 seconds at a time. Once we had escaped and reached the surface, we deployed our buoyancy aids, and climbed to safety onto the life raft. 

Going for a spin! The aircraft simulator being rotated with me strapped in
Ben and I happy to have survived the training!

The experience was nerve-wracking, and really forced me to push myself out of my comfort zone. I didn’t need to be too worried though, even after struggling with undoing the seatbelt a couple of times, I was assisted by the diving team and encouraged to go again. I was glad to get through the exercises, and pass the course along with the others. This was an amazing experience (definitely not something I expected to do when applying for a PhD!), and I’m now looking forward to the field campaign next year. 

Forecast Verification Summer School

Lily Greig – l.greig@pgr.reading.ac.uk

A week-long summer school on forecast verification was held jointly at the end of June by the MPECDT (Mathematics of Planet Earth Centre for Doctoral Training) and the JWGFVR (Joint Working Group on Forecast Verification Research). The school featured lectures from scientists and academics from countries around the world, including Brazil, the USA and Canada, each specialising in a different topic within forecast verification. Participants gained a broad overview of the field and of how its subfields interact.

Structure of school

The virtual school consisted of lectures from individual members of the JWGFVR on their own subjects, along with drop-in sessions for asking questions and dedicated time to work on group projects. Four groups of 4–5 students were each given their own forecast verification challenge. The themes of the projects were precipitation forecasts, comparing high-resolution global model and local area model wind speed forecasts, and ensemble seasonal forecasts. The latter was the topic of our project.

Content

The first lecture was given by Barbara Brown, who provided a broad summary of verification and gave examples of questions that verifiers may ask themselves as they attempt to assess the “goodness” of a forecast. The next day, a lecture by Barbara Casati covered continuous scores (verification of continuous variables, e.g., temperature), such as linear bias, mean-squared error (MSE) and the Pearson coefficient. She also outlined the deficiencies of different scores and why it is best to use a variety of them when assessing the quality of a forecast. Marion Mittermaier then spoke about categorical scores (yes/no events or multi-category events such as precipitation type). She gave examples such as contingency tables, which portray how well a model is able to predict a given event, based on hit rates (how often the model predicted an event when the event happened) and false alarm rates (how often the model predicted the event when it didn’t happen). Further lectures were given by Ian Jolliffe on methods of determining the significance of your forecast scores, Nachiketa Acharya on probabilistic scores and ensembles, Caio Coelho on sub-seasonal to seasonal timescales, and then Raghavendra Ashrit, Eric Gilleland and Caren Marzban on severe weather, spatial verification and experimental design. The lectures have been made available online and you can find them here.
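As an illustration of those categorical scores (a sketch with my own naming, not code from the lectures), the hit rate and false alarm rate follow directly from the four cells of a two-by-two contingency table:

```python
# Hit rate and false alarm rate from a 2x2 contingency table of
# yes/no event forecasts. Function and variable names are illustrative.

def contingency_scores(hits, misses, false_alarms, correct_negatives):
    hit_rate = hits / (hits + misses)  # fraction of events correctly forecast
    # fraction of non-events wrongly forecast as events
    false_alarm_rate = false_alarms / (false_alarms + correct_negatives)
    return hit_rate, false_alarm_rate


# Example: 80 hits, 20 misses, 30 false alarms, 870 correct negatives
hit_rate, far = contingency_scores(80, 20, 30, 870)
print(hit_rate, far)
```

These are the same two rates that, computed over a range of probability thresholds and plotted against each other, trace out a ROC curve.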

Forecast Verification

So, forecast verification is as it sounds: one part of assessing the ‘goodness’ of a forecast, as opposed to its value. Verification is helpful for economic purposes (e.g. decision making), as well as administrative and scientific ones (e.g. identifying model flaws). The other aspect of measuring how well a forecast is performing is knowing the user’s needs, and therefore how the forecast will be applied. It is important to consider the goal of your verification process beforehand, as it will determine your choice of metrics and your assessment of them. An example of how forecast goodness hinges on the user was given by Barbara in her talk: a precipitation forecast may have a spatial offset in where a rain patch falls, but if both observation and forecast fall along the flight path, this may be all the aviation traffic strategic planner needs to know. For a watershed manager on the ground, however, this would not be a helpful forecast. The lecturers also emphasised the importance of applying many different measures to a forecast and then understanding the significance of those measures in order to judge its overall goodness. Identifying standards of comparison for your forecast is also important, such as persistence or climatology. Then there are further challenges such as spatial verification, which requires methods of ‘matching’ the location of your observations with the model predictions on the model grid.

Figure 1: Problem statement for group presentation on 2m temperature ensemble seasonal forecasts, presented by Ryo Kurashina

Group Project

Our project was on verification of 2-metre temperature ensemble seasonal forecasts (see Figure 1). We were looking at seasonal forecast data with a 1-month lead time for the summer months for three different models, investigating ways of validating the forecasts, and finally deciding which one was better. We decided to focus on the models’ ability to predict hot and cold events as a simple metric for El Niño. We looked at scatter plots and rank histograms to investigate the biases in our data, Brier scores for assessing model accuracy (level of agreement between forecast and truth) and Receiver Operating Characteristic (ROC) curves to look at model skill (the relative accuracy of the forecast over some reference forecast). The ROC curve (see Fig. 2) refers to the curve formed by plotting hit rates against false alarm rates based on probability thresholds. The further above the diagonal line your curve lies, the better your forecast is at discriminating events compared to a random coin toss. The combination of these verification methods was used to assess which model we thought was best.
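A minimal, self-contained sketch of the two probabilistic measures (synthetic data and my own function names, not our project code): the Brier score is the mean squared difference between forecast probability and the 0/1 outcome, and each point of a ROC curve comes from thresholding the forecast probabilities.

```python
import numpy as np

def brier_score(p, o):
    """Mean squared difference between forecast probabilities p and 0/1 outcomes o."""
    p, o = np.asarray(p, float), np.asarray(o, float)
    return float(np.mean((p - o) ** 2))


def roc_points(p, o, thresholds):
    """(false alarm rate, hit rate) for each probability threshold."""
    p, o = np.asarray(p, float), np.asarray(o, bool)
    points = []
    for t in thresholds:
        forecast_yes = p >= t  # issue a "yes" forecast above this threshold
        hit_rate = (forecast_yes & o).sum() / o.sum()
        false_alarm_rate = (forecast_yes & ~o).sum() / (~o).sum()
        points.append((false_alarm_rate, hit_rate))
    return points


probs = [0.9, 0.7, 0.4, 0.2, 0.1]  # forecast probabilities of the event
obs = [1, 1, 0, 1, 0]              # whether the event actually occurred
print(brier_score(probs, obs))     # 0 is a perfect score, 1 is the worst
print(roc_points(probs, obs, [0.5]))
```

Sweeping the threshold from 1 to 0 moves the ROC point from (0, 0) to (1, 1); a curve bowed above the diagonal means the forecast discriminates events better than a coin toss.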

Of course, virtual summer schools are less than ideal compared to the real (in-person) deal, but with Teams meetings, shared code and a chat channel we made the most of it. It was fun to work with everyone, even (or especially?) if the topic was new for all of us.

Figure 2: Presenting our project during group project presentations on Friday

Conclusions

The summer school was incredibly smoothly run, very engaging to people both new and experienced in the topic, and provided plenty of opportunity to ask questions of the enthusiastic lecturers. Would recommend to PhD students working with forecasts and wanting to assess them!

The effect of surface heat fluxes on the evolution of storms in the North Atlantic storm track

Andrea Marcheggiani – a.marcheggiani@pgr.reading.ac.uk

Diabatic processes are typically considered as a source of energy for weather systems and as a primary contributing factor to the maintenance of mid-latitude storm tracks (see Hoskins and Valdes, 1990 for some classical reading, but also more recent reviews, e.g. Chang et al., 2002). However, surface heat exchanges do not necessarily act as a fuel for the evolution of weather systems: the effects of surface heat fluxes and their coupling with lower-tropospheric flow can be detrimental to the potential energy available for systems to grow. Indeed, the magnitude and sign of their effects depend on the time scales (e.g., synoptic, seasonal) and length scales (e.g., global, zonal, local) at which these effects unfold.


Figure 1: Composites for strong (a-c) and weak (d-f) values of the covariance between heat flux and temperature time anomalies.

Heat fluxes arise in response to thermal imbalances, which they attempt to neutralise. In the atmosphere, the primary observed thermal imbalances are the meridional temperature gradient caused by the differential radiative heating from the Sun between equator and poles, and the temperature contrasts at the air—sea interface, which essentially derive from the different heat capacities of the oceans and the atmosphere.

In the context of the energetic scheme of the atmosphere, first formulated by Lorenz (1955) and commonly known as the Lorenz energy cycle, the meridional transport of heat (or dry static energy) is associated with the conversion of zonal available potential energy to eddy available potential energy, while diabatic processes at the surface coincide with the generation of eddy available potential energy.

Figure 2: Phase portrait of FT covariance and mean baroclinicity. Streamlines indicate average circulation in the phase space (line thickness proportional to phase speed). The black shaded dot in the top left corner indicates the size of the Gaussian kernel used in the smoothing process. Colour shading indicates the number of data points contributing to the kernel average

The sign of the contribution from surface heat exchanges to the evolution of weather systems is not clear-cut, as it depends on the specific framework used to evaluate their effects. Globally, these exchanges have been estimated to make a positive contribution to the potential energy budget (Peixoto and Oort, 1992), while locally the picture is less clear: heating where it is cold and cooling where it is warm would lead to a reduction in temperature variance, which is essentially available potential energy.
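That variance argument can be written schematically. This is my own sketch of the standard eddy temperature-variance (available potential energy) budget, not an equation from the paper: with T′ the local temperature anomaly and Q′ the anomalous diabatic heating, and neglecting advection and other sources,

```latex
\frac{\partial}{\partial t}\,\overline{\left(\tfrac{1}{2}T'^2\right)} \;\sim\; \overline{Q'T'},
```

so heating where it is cold (Q′ > 0 where T′ < 0) gives Q′T′ < 0 and destroys temperature variance, i.e. eddy available potential energy.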

The first part of my PhD focussed on assessing the role of local air—sea heat exchanges in the evolution of synoptic systems. To that end, we built a hybrid framework where the spatial covariance between time anomalies of sensible heat flux F and lower-tropospheric air temperature T is taken as a measure of the intensity of the air—sea thermal coupling. The time anomalies, denoted by a prime, are defined as departures from a 10-day running mean so that we can concentrate on synoptic variability (Athanasiadis and Ambaum, 2009). The spatial domain where we compute the spatial covariance extends from 30°N to 60°N and from 30°W to 79.5°W, which corresponds with the Gulf Stream extension region, and to focus on air—sea interaction, we excluded grid points covered by land or ice.

This leaves us with a time series for F’—T’ spatial covariance, which we also refer to as FT index.
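As a rough sketch of how such an index could be computed (synthetic arrays, an all-ocean mask and my own function names; the real calculation uses observed flux and temperature fields over the domain described above, with land and ice points excluded):

```python
import numpy as np

def running_mean_anomaly(field, window):
    """Departure from a centred running mean along the time axis (axis 0)."""
    kernel = np.ones(window) / window
    smooth = np.apply_along_axis(
        lambda ts: np.convolve(ts, kernel, mode="same"), 0, field)
    return field - smooth


def ft_index(F, T, mask, window=40):  # 40 six-hourly steps = 10 days (assumed cadence)
    """Spatial covariance of F' and T' over the masked domain at each time step."""
    Fp = running_mean_anomaly(F, window)
    Tp = running_mean_anomaly(T, window)
    cov = []
    for Fp_t, Tp_t in zip(Fp, Tp):
        f, t = Fp_t[mask], Tp_t[mask]
        cov.append(np.mean((f - f.mean()) * (t - t.mean())))
    return np.array(cov)


# Synthetic, correlated flux and temperature fields: (time, lat, lon)
rng = np.random.default_rng(0)
F = rng.standard_normal((200, 10, 20))
T = 0.5 * F + rng.standard_normal((200, 10, 20))
mask = np.ones((10, 20), bool)  # stand-in for the ocean-only mask
print(ft_index(F, T, mask).shape)
```

The result is one covariance value per time step, i.e. exactly the kind of FT index time series analysed in the paper.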

The FT index is found to be always positive and characterised by frequent bursts of intense activity (or peaks). Composite analysis, shown in Figure 1 for mean sea level pressure (a,d), temperature at 850hPa (b,e) and surface sensible heat flux (c,f), indicates that peaks of the FT index (panels a—c) correspond with intense weather activity in the spatial domain considered (dashed box in Figure 1) while a more settled weather pattern is observed to be typical when the FT index is weak (panels d—f).


Figure 3: Phase portraits for spatial-mean T (a) and cold sector area fraction (b). Shading in (a) represents the difference between phase tendency and the mean value of T, as reported next to the colour bar. Arrows highlight the direction of the circulation, kernel-averaged using the Gaussian kernel shown in the top-left corner of each panel.

We examine the dynamical relationship between the FT index and the area-mean baroclinicity, which is a measure of available potential energy in the spatial domain. To do that, we construct a phase space of FT index and baroclinicity and study the average circulation traced by the time series for the two dynamical variables. The resulting phase portrait is shown in Figure 2. For technical details on phase space analysis refer to Novak et al. (2017), while for more examples of its use see Marcheggiani and Ambaum (2020) or Yano et al. (2020). We observe that, on average, baroclinicity is strongly depleted during events of strong F’—T’ covariance and it recovers primarily when covariance is weak. This points to the idea that events of strong thermal coupling between the surface and the lower troposphere are on average associated with a reduction in baroclinicity, thus acting as a sink of energy in the evolution of storms and, more generally, storm tracks.
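The phase-portrait construction itself can be sketched as follows. This is my own minimal implementation of the kernel-averaging idea in Novak et al. (2017), with a synthetic noisy cycle standing in for the FT index and baroclinicity time series:

```python
import numpy as np

def phase_portrait(x, y, nbins=20, sigma=0.1):
    """Kernel-averaged tendencies (dx/dt, dy/dt) on a grid in (x, y) phase space."""
    dx, dy = np.gradient(x), np.gradient(y)  # tendencies along the trajectory
    xi = np.linspace(x.min(), x.max(), nbins)
    yi = np.linspace(y.min(), y.max(), nbins)
    U = np.zeros((nbins, nbins))
    V = np.zeros((nbins, nbins))
    for i, xc in enumerate(xi):
        for j, yc in enumerate(yi):
            # Gaussian kernel weights centred on this phase-space grid point
            w = np.exp(-((x - xc) ** 2 + (y - yc) ** 2) / (2 * sigma ** 2))
            U[j, i] = np.sum(w * dx) / np.sum(w)
            V[j, i] = np.sum(w * dy) / np.sum(w)
    return xi, yi, U, V


# Synthetic trajectory: a noisy cycle in phase space
rng = np.random.default_rng(1)
t = np.linspace(0, 20 * np.pi, 2000)
x = np.cos(t) + 0.05 * rng.standard_normal(t.size)
y = np.sin(t) + 0.05 * rng.standard_normal(t.size)
xi, yi, U, V = phase_portrait(x, y)
```

The (U, V) field can then be drawn as streamlines (e.g. with matplotlib's streamplot), giving the kind of average phase-space circulation shown in Figure 2.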

Upon investigating the driving mechanisms behind a strong F’—T’ spatial covariance, we find that increases in the variances and in the correlation are equally important; this appears to be a more general feature of heat fluxes in the atmosphere, as more recent results indicate (and is the focus of the second part of my PhD).

In the case of surface heat fluxes, cold sector dynamics play a fundamental role in driving the increase of correlation: when cold air is advected over the ocean surface, flux variance amplifies in response to the stark temperature contrasts at the air—sea interface as the ocean surface temperature field features a higher degree of spatial variability linked to the presence of both the Gulf Stream on the large scale and oceanic eddies on the mesoscale (up to 100 km).

The growing relative importance of the cold sector in the intensification phase of the F’—T’ spatial covariance can also be revealed by looking at the phase portraits for air temperature and cold sector area fraction, which is shown in Figure 3. These phase portraits tell us how these fields vary at different points in the phase space of surface heat flux and air temperature spatial standard deviations (which correspond to the horizontal and vertical axes, respectively). Lower temperatures and larger cold sector area fraction characterise the increase in covariance, while the opposite trend is observed in the decaying stage.

Surface heat fluxes eventually trigger an increase in temperature variance, which within the atmospheric boundary layer follows the almost adiabatic vertical profile characteristic of the mixed layer (Stull, 1988).

Figure 4: Diagram of the effect of the atmospheric boundary layer height on modulating surface heat flux—temperature correlation.

Stronger surface heat fluxes are associated with a deeper boundary layer reaching higher levels into the troposphere: this could explain the observed increase in correlation, as the lower-tropospheric air temperatures become more strongly coupled with the surface, while a lower correlation with the surface ensues when the boundary layer is shallow and surface heat fluxes are weak. Figure 4 shows a simple diagram summarising the mechanisms described above.

In conclusion, we showed that surface heat fluxes locally can have a damping effect on the evolution of mid-latitude weather systems, as the covariation of surface heat flux and air temperature in the lower troposphere corresponds with a decrease in the available potential energy.

Results indicate that most of this thermodynamically active heat exchange is realised within the cold sector of weather systems, specifically as the atmospheric boundary layer deepens and exerts a deeper influence upon the tropospheric circulation.

References

  • Athanasiadis, P. J. and Ambaum, M. H. P.: Linear Contributions of Different Time Scales to Teleconnectivity, J. Climate, 22, 3720– 3728, 2009.
  • Chang, E. K., Lee, S., and Swanson, K. L.: Storm track dynamics, J. Climate, 15, 2163–2183, 2002.
  • Hoskins, B. J. and Valdes, P. J.: On the existence of storm-tracks, J. Atmos. Sci., 47, 1854–1864, 1990.
  • Lorenz, E. N.: Available potential energy and the maintenance of the general circulation, Tellus, 7, 157–167, 1955.
  • Marcheggiani, A. and Ambaum, M. H. P.: The role of heat-flux–temperature covariance in the evolution of weather systems, Weather and Climate Dynamics, 1, 701–713, 2020.
  • Novak, L., Ambaum, M. H. P., and Tailleux, R.: Marginal stability and predator–prey behaviour within storm tracks, Q. J. Roy. Meteorol. Soc., 143, 1421–1433, 2017.
  • Peixoto, J. P. and Oort, A. H.: Physics of climate, American Institute of Physics, New York, NY, USA, 1992.
  • Stull, R. B.: Mean boundary layer characteristics, In: An Introduction to Boundary Layer Meteorology, Springer, Dordrecht, Germany, 1–27, 1988.
  • Yano, J., Ambaum, M. H. P., Dacre, H., and Manzato, A.: A dynamical—system description of precipitation over the tropics and the midlatitudes, Tellus A: Dynamic Meteorology and Oceanography, 72, 1–17, 2020.

CMIP6 Data Hackathon

Brian Lo – brian.lo@pgr.reading.ac.uk 

Chloe Brimicombe – c.r.brimicombe@pgr.reading.ac.uk 

What is it?

A hackathon, from the words hack (meaning exploratory programming, not the alternate meaning of breaching computer security) and marathon, is usually a sprint-like event where programmers collaborate intensively with the goal of creating functioning software by the end of the event. From 2 to 4 June 2021, more than a hundred early career climate scientists and enthusiasts (mostly PhDs and postdocs) from UK universities took part in a climate hackathon organised jointly by the Universities of Bristol, Exeter and Leeds, and the Met Office. The common goal was to quickly analyse certain aspects of Coupled Model Intercomparison Project Phase 6 (CMIP6) data to produce cutting-edge research that could be worked into published material and presented at this year’s COP26. 

Before the event, attendees signed up to their preferred project from a choice of ten. Topics ranged from how climate change will affect the migration of Arctic terns to the effects of geoengineering by stratospheric sulfate injections, and more… Senior academics from a range of disciplines and institutions led each project. 

Group photo of participants at the CMIP6 Data Hackathon

How is this virtual hackathon different to a usual hackathon? 

Like many other events this year, the hackathon took place virtually, using a combination of video conferencing (Zoom) for seminars and teamwork, and discussion forums (Slack). 

Brian: 

Compared to the two 24-hour non-climate hackathons I previously attended, this one was spread over three days, so I managed not to disrupt my usual sleep schedule! The experience of pair programming with one or two other team members was made easy, since I shared one of my screens in Zoom breakout rooms throughout the event. What I really missed were the free meals and the plentiful snacks and drinks usually on offer at in-person hackathons to keep me energised while I programmed. 

Chloe:

I’ve been to a climate campaign hackathon before, and during the computer science part of my undergraduate degree we ended a group project with a hackathon-style event, building the board game Buccaneer in Java. But this was set out completely differently, and it was not as time intensive, which was nice. I missed being in a room with the people on my project, and I’m still missing out on free food – hopefully not for too much longer. We made use of Zoom and Slack for communication, and of JASMIN and the version control that git offers, with individuals working on branches that were merged at the end of the hackathon. 

What did we do? 

Brian: 

Project 2: How well do the CMIP6 models represent the tropical rainfall belt over Africa? 

Using Gaussian parameters from Nikulin & Hewitson (2019) to describe the intensity, mean meridional position and width of the tropical rainfall belt (TRB), my team investigated three aspects of how CMIP6 models capture the African TRB: the model biases, the projections, and whether there was any useful forecast information in CMIP6 decadal hindcasts. These retrospective forecasts were generated under the Decadal Climate Prediction Project (DCPP), which aims to investigate the skill of CMIP models in predicting climate variations from a year to a decade ahead. Our larger group of around ten split ourselves amongst these three key aspects. I focused on the decadal hindcasts, comparing different decadal models at different lead times against three observation sources. 
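To give a flavour of what a Gaussian description of the rainfall belt means in practice, here is a toy sketch (not the actual hackathon code, and using an idealised made-up profile rather than CMIP6 data): the intensity, position and width of the belt are estimated from a meridional precipitation profile.

```python
import numpy as np

def trb_gaussian_params(lat, precip):
    """Estimate tropical rainfall belt (TRB) parameters from a meridional
    precipitation profile, in the spirit of the Gaussian description of
    Nikulin & Hewitson (2019): intensity (peak value), mean meridional
    position (precipitation-weighted latitude) and width (weighted
    standard deviation)."""
    w = precip / precip.sum()                      # treat profile as a distribution
    position = np.sum(w * lat)                     # weighted mean latitude
    width = np.sqrt(np.sum(w * (lat - position) ** 2))
    intensity = precip.max()
    return intensity, position, width

# Idealised zonal-mean profile: a belt centred at 5N, roughly 8 degrees wide
lat = np.linspace(-30, 30, 121)
precip = 9.0 * np.exp(-0.5 * ((lat - 5.0) / 8.0) ** 2)
intensity, position, width = trb_gaussian_params(lat, precip)
```

The recovered parameters can then be compared between models, observations and hindcasts, rather than comparing noisy rainfall fields grid point by grid point.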

Chloe: 

Project 10: Human heat stress in a warming world 

Our team leader Chris had calculated the universal thermal climate index (UTCI) – a heat stress index – for a number of the CMIP6 climate models. He was looking into bias correction against the ERA5-HEAT reanalysis dataset whilst we split into smaller groups. We looked at a range of different things, from how the intensity of heat stress changed to how the UTCI compared to mortality. I ended up coding with one of my (five!) PhD supervisors, Claudia Di Napoli, and amongst other things we made the gif below.  

https://twitter.com/ChloBrim/status/1400780543193649153
Annual means of the UTCI for RCP4.5 (medium emissions) projection from 2020 to 2099.

Would we recommend a meteorology/climate-related hackathon? 

Brian: 

Yes! The three days were a nice break from my own radar research. The event was good training for thinking quickly and creatively about research questions other than those in my own PhD project. The experience also sharpened my coding and data exploration skills, and I got the chance to quickly learn advanced features of certain software packages (such as xarray and iris). I was amazed at the amount of scientific output achieved in only three short days! 

Chloe: 

Yes, but if it’s online, make sure you block out the time and dedicate all your focus to the hackathon – don’t be like me! The hackathon taught me more about handling netCDFs in Python, but I am not yet a Python plotting convert; there are some things R is just nicer for. And I still love researching heat stress and heatwaves, so that’s good!  

We hope that the CMIP hackathon runs again next year to give more people the opportunity to get involved. 

How to write a PhD thesis during a global pandemic

Kaja Milczewska – k.m.milczewska@pgr.reading.ac.uk

Completing a PhD is a momentous task at the best of times, let alone in combination with a year-long global pandemic. Every PhD researcher is different, and everyone has had different circumstantial struggles throughout Covid-19. The lack of human interaction that comes with working in a vibrant academic environment such as the Meteorology Department can make working from home a real struggle. Sometimes it is difficult to find the motivation to get anything useful done; at other times you could squeeze five hours’ worth of work into one. Staying organised is key to getting it done, so the following are some of the things that helped me get to the end of my PhD thesis – and it has not been easy. If you are still out there writing and finishing up experiments, read on! Maybe you will feel a little less alone. The PhD experience can be truly isolating at the best of times, so literally being instructed to isolate from the world is not ideal. The points are numbered for convenience of structuring this post, rather than in any order of importance. 

  1. Communicate with your supervisor(s) 

It is tempting to “disappear off the radar” when things are not going well. You could wake up in the morning of the day of your regular weekly meeting, filled with dread that you have not prepared anything for it. Your brain recoils into the depths of your skull as your body recoils back under the safety of the duvet. What are your options? Some of them might be: take a deep gulp and force yourself out of bed with the prospect of coffee before the meeting (where you steer the conversation onto the things you did manage to do); or to postpone the meeting because you need to finish XYZ and thus a later meeting may be more productive; or ignore the meeting altogether. The first one is probably the best option, but it requires mental strength where there might be none. The second one is OK, but you still need to do the work. The last one is a big no. Don’t do it. 

Anxiety will make you believe that ignoring the world and all responsibilities is the most comfortable option in the moment, but the consequences of acting on it could be worse. Supervisors value honesty, and they know well that it is not always possible to complete all the scheduled tasks. Of course, if this happens every week then you might need to introspectively address the reasons for this, and – again, talking with your supervisor is usually a useful thing to do. You might not want them to know your entire life story, but it is helpful for everybody involved if they are aware that you struggle with anxiety / depression / ADHD / *insert any condition here*, which could affect your capacity to complete even the simplest, daily tasks. Being on the same page and having matching expectations is key to any student – supervisor partnership. 

  2. Reward yourself for the things you have already accomplished 

Whether that’s mid-week, mid-to-do-list, weekend — whenever. List all the things you have done regularly (either work- or life-related) and recognise that you are trying to survive a pandemic. And trying to complete the monstrous task of writing a PhD thesis. Those are big asks, and the only way to get through them is to break them down into smaller chunks. Putting down “Write thesis” on your to-do list is more likely to intimidate than motivate you. How about breaking it down further: “Re-create plot 4.21”, or “Consolidate supervisor comments on pages 21 – 25” — these are achievable things in a specified length of time. It also means you could tick them off more easily, hopefully resulting in feeling accomplished. Each time this happens, reward yourself in whatever way makes you feel nice. Even just giving yourself a literal pat on the shoulder could feel great – try it! 

  3. Compile supervisor feedback / comments into a spreadsheet  

An Excel spreadsheet – or any other suitable system – will enable you to keep track of what still needs addressing and what has been completed. The beauty of using a colour-coded spreadsheet for feedback comments is that once the required corrections are completed, you have concrete evidence of how much you have already achieved – something to consult if you start feeling inadequate at any point (see previous section!). I found this a much easier system than writing it down in my workbook, although of course that does work for some people, too. Anytime you receive feedback on your work – written or otherwise – note it down. I used brief reminders, such as “See supervisor’s comment on page X”, but it was useful to have them all compiled together. I also found it useful to classify the comments into ‘writing-type’ corrections and ‘more work required’ corrections. The first one is self-explanatory: these were typos, wrong terminology, mistakes in equations and minor structural changes. The ‘more work required’ type was anything that required me to find citations or literature, major structural changes, issues with my scientific arguments, or anything else that required more thought. This meant that if my motivation was lacking, I could turn to the ‘writing-type’ comments and work on them without needing too much brain power. It also meant that I could prioritise the major comments first, which made working to a deadline a little bit easier. 

  4. Break down how long specific things will take 

This is most useful when you are a few weeks away from submission date. With only 5 weeks left, my colour-coded charts were full of outstanding comments; neither my ‘Conclusions’ chapter nor my Abstract had been written; plots needed re-plotting and I still did not know the title of my thesis. Naturally, I was panicking. I knew that the only way I could get through this was to set a schedule — and stick to it. At the time, there were 5 major things to do: complete a final version of each of my 5 thesis chapters. A natural split was to allow each chapter only one week for completion. If I was near to running over my self-prescribed deadline, I would prioritise only the major corrections. If still not done by the end of the allowed week: that’s it! Move on. This can be difficult for any perfectionists out there, but by this point the PhD has definitely taught me that “done” is better than perfect. I also found that some chapters took less time to finish than others, so I had time to return to the things I left not quite finished. Trust yourself, and give it your best. By all means, push through the hardest bit to the end, but remember that there (probably) does not exist a single PhD thesis without any mistakes. 

5. Follow useful Twitter threads 

There exist two groups of people: those who turn off or deactivate all social media when they need to focus on a deadline, and those who get even more absorbed by its ability to divert attention away from the discomfort of the dreaded task at hand. Some might call it “productive procrastination”. I actually found that social media helped me a little – but only when my state of mind was such that I could resist the urge to fall down a scrolling rabbit hole. If you are on Twitter, you might find hashtags like #phdchat and accounts such as @AcademicChatter, @phdforum and @phdvoice useful. 

6. Join a virtual “writing room” 

On the back of the last tip, I have found a virtual writing room helpful for focus. The idea is that you join an organised Zoom meeting full of other PhDs, all of whom are writing at the same time. All microphones are muted, but the chat is active so it is nice to say ‘hello!’ to someone else writing at the same time, anywhere else in the world. The meetings have scheduled breaks, with the organiser announcing when they occur. I found that because I actively chose to be up and start writing at the very early hour of 6am by attending the virtual writing room, I was not going to allow myself to procrastinate. The commitment to being up so early and being in a room full of people also doing the same thing (but virtually, obviously) meant that those were the times that I was probably the most focused. These kinds of rooms are often hosted by @PhDForum on Twitter; there could also be others. An alternative idea could be to set up a “writing meeting” with your group of peers and agree to keep chatter to a minimum (although this is not something I tried myself). 

7. Don’t look at the news 

Or at least, minimise your exposure to it. It is generally a good thing to stay on top of current events, but the final stages of writing a PhD thesis are probably unlike any other time in your life: you need the space and energy to think deeply about your own work right now. Unfortunately, I learnt this the hard way and found that there were days where I could do very little work because my brain was preoccupied with awful events happening around the world. It made me feel pathetic, routinely resulting in staying up late to try and finish whatever I had failed to finish during the day. This only deteriorated my wellbeing further, with shortened sleep and a constant sense of playing “catch-up”. If this sounds like you, then try switching off the news notifications on your phone or computer, or limit yourself to checking the news homepage once a day at a designated time.  

8. Be honest when asked about how you are feeling 

Many of us tend to downplay or dismiss our emotions. It can be appealing to keep your feelings to yourself, saving yourself the energy involved in explaining the situation to whomever asked. You might also think that you are saving someone else the hassle of worrying about you. The trouble is that if we continuously paper over the cracks in our mental wellbeing within the handful of conversations we are having (which are especially limited during the pandemic), we could stop acknowledging how we truly feel. This does not necessarily mean spilling all the beans to whomever asked the innocent question, “How are you?”. But the catharsis from opening up to someone and acknowledging that things are not quite right could really offload some weight off your shoulders. If the person on the other end is your PhD supervisor, it can also be helpful for them to know that you are having a terrible time and are therefore unable to complete tasks to your best ability. Submission anxiety can be crippling for some people in the final few weeks, and your supervisor just won’t be able to (and shouldn’t) blindly assume how your mental health is being affected by it, because everyone experiences things differently. This goes back to bullet no.1. 

Hopefully it goes without saying that the above are simply some things that helped me through to the end of the thesis, but everybody is different. I am no counsellor or wellbeing guru; just a recently-finished PhD! Hopefully the above points might offer a little bit of light for anyone else struggling through the storm of that final write-up. Keep your chin up and, as Dory says: just keep swimming. Good luck! 

Better Data… with MetaData!

James Fallon – j.fallon@pgr.reading.ac.uk

As researchers, we familiarise ourselves with many different datasets. Depending on who put together a dataset, the variable names and definitions that we are familiar with from one dataset may be different in another. These differences range from subtle annoyances to large structural differences, and it’s not always immediately obvious how best to handle them.

One dataset might be on an hourly time-index, and the other daily. The grid points which tell us the geographic location of data points may be spaced at different intervals, or use entirely different co-ordinate systems!

However, most modern datasets come with hidden help in the form of metadata – this information tells us how the data is to be used, and with the right choice of Python modules we can use the metadata to work with different datasets automatically, avoiding conversion headaches.

First attempt…

Starting my PhD, my favourite (naïve, inefficient, bug-prone, …) method of reading data with Python was the built-in function open() or numpy functions like genfromtxt(). These are quick to set up and can be good enough. But as soon as we are using data with more than one field, complex coordinates and calendar indexes, or more than one dataset, this style of programming becomes unwieldy and disorderly!

>>> header = np.genfromtxt(fname, delimiter=',', dtype='str', max_rows=1)
>>> print(header)
['Year' 'Month' 'Day' 'Electricity_Demand']
>>> data = np.genfromtxt(fname, delimiter=',', skip_header=1)
>>> data
array([[2.010e+03, 1.000e+00, 1.000e+00, 0.000e+00],
       [2.010e+03, 1.000e+00, 2.000e+00, 0.000e+00],
       [2.010e+03, 1.000e+00, 3.000e+00, 0.000e+00],
       ...,
       [2.015e+03, 1.200e+01, 2.900e+01, 5.850e+00],
       [2.015e+03, 1.200e+01, 3.000e+01, 6.090e+00],
       [2.015e+03, 1.200e+01, 3.100e+01, 6.040e+00]])

The above code reads in year, month, day data in the first 3 columns, and Electricity_Demand in the last column.

You might be familiar with such a workflow – perhaps you have refined it down to a fine art!

In many cases this is sufficient for what we need, but making use of already available metadata can make the data more readable, and easier to operate on when it comes to complicated collocation and statistics.

Enter pandas!

Pandas

In the previous example, we read our data into numpy arrays. Numpy arrays are very useful: they store data more efficiently than a regular Python list, they are easier to index, and they have many built-in operations, from simple addition to niche linear algebra techniques.

We stored the column labels in an array called header, but this means our metadata has to be handled separately from our data. The dates are stored in three different columns alongside the data – but what if we want to perform an operation on just the data (for example, add 5 to every value)? It is technically possible, but awkward and dangerous – if the column order changes in future, our code might break! We would probably be better off splitting the dates into another separate array, but that means more work to record the column headers, and an increasing number of Python variables to keep track of.
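To make the hazard concrete, here is a small sketch (with made-up numbers, mimicking the year/month/day/demand table above) of what that positional indexing looks like in plain numpy – note how operating on the demand column means hard-coding column index 3:

```python
import numpy as np

# Toy version of the year / month / day / demand table
data = np.array([
    [2010,  1,  1, 0.00],
    [2010,  1,  2, 0.00],
    [2015, 12, 31, 6.04],
])

# Adding 5 to "just the data" requires knowing the column position:
data[:, 3] += 5  # silently wrong if a column is ever added or reordered
```

Nothing here documents that column 3 is electricity demand – that knowledge lives only in our heads (or in a separate header array).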

Using pandas, we can store all of this information in a single object, and using relevant datatypes:

>>> data = pd.read_csv(fname, parse_dates=[['Year', 'Month', 'Day']], index_col=0)
>>> data
Electricity_Demand
Year_Month_Day      
2010-01-01      0.00
2010-01-02      0.00
2010-01-03      0.00
2010-01-04      0.00
2010-01-05      0.00
...              ...
2015-12-27      5.70
2015-12-28      5.65
2015-12-29      5.85
2015-12-30      6.09
2015-12-31      6.04

[2191 rows x 1 columns]

This may not immediately appear very different from what we had earlier, but notice the dates are now stored in datetime format and tied to the Electricity_Demand data. If we index the data, the time index is indexed simultaneously without any further code (and without the possible mistakes that lead to errors).

Pandas also makes it really simple to perform some complicated operations. In this example, I am only dealing with one field (Electricity_Demand), but this works with 10, 100, 1000 or more columns!

  • Transpose rows and columns with data.T
  • Calculate quantiles with data.quantile
  • Cut to between dates, eg. data.loc['2010-02-03':'2011-01-05']
  • Calculate 7-day rolling mean: data.rolling(7).mean()

We can insert new columns, remove old ones, change the index, perform complex slices, and all the metadata stays stuck to our data!
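As a small illustration (building a hypothetical frame from scratch rather than reading from file), label-based operations keep the index attached throughout:

```python
import pandas as pd

# A small frame like the one read in above, with a datetime index
idx = pd.date_range('2010-01-01', periods=5, freq='D')
data = pd.DataFrame({'Electricity_Demand': [0.0, 0.0, 1.2, 5.7, 6.0]},
                    index=idx)

# Insert a derived column; the index stays attached automatically
data['Demand_Anomaly'] = (data['Electricity_Demand']
                          - data['Electricity_Demand'].mean())

# Label-based slicing by date - no integer positions needed
subset = data.loc['2010-01-02':'2010-01-04']
```

Because every operation is by label, reordering or adding columns cannot silently change which data we operate on.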

Whilst pandas does have many maths functions built in, if need-be we can also export directly to numpy using numpy.array(data['Electricity_Demand']) or data.to_numpy().

Pandas can also simplify plotting – particularly convenient when you just want to quickly visualise data without writing import matplotlib.pyplot as plt and other boilerplate code. In this example, I plot my data alongside its 7-day rolling mean:

ax = data.loc['2010'].plot(label='Demand', ylabel='Demand (GW)')
data.loc['2010'].rolling(7).mean().plot(ax=ax, label='Demand rolling mean')
ax.legend()

Now I can visualise the anomalous values at the start of the dataset, a consistent annual trend, a diurnal cycle, and fairly consistent behaviour week to week.

Big datasets

Pandas can read from and write to many different data formats – CSV, HTML, Excel, … but some file types, like the netCDF4 format that meteorologists like working with, aren’t built in.

xarray is an extremely versatile tool that can read many formats, including netCDF and GRIB. As well as having built-in functions to export to pandas, xarray is perfectly capable of handling metadata on its own, and many researchers work directly with xarray DataArray and Dataset objects.

There are more xarray features than stars in the universe[citation needed], but some that I find invaluable include:

  • open_mfdataset – automatically merge multiple files (e.g. for different dates or locations)
  • assign_coords – replace one co-ordinate system with another
  • where – replace xarray values depending on a condition

Yes, you can do all of this with pandas or numpy. But with xarray you can pass metadata attributes as arguments; for example, we can get the latitude average with my_data.mean('latitude'). No need to work with integer indexes and hardcoded values – xarray can do all the heavy lifting for you!
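As a minimal sketch of that named-dimension style (using a tiny made-up temperature field rather than a real netCDF file):

```python
import numpy as np
import xarray as xr

# A small made-up field with named dimensions and coordinates
temperature = xr.DataArray(
    np.array([[280.0, 282.0, 284.0],
              [290.0, 292.0, 294.0]]),
    dims=('latitude', 'longitude'),
    coords={'latitude': [0.0, 10.0], 'longitude': [0.0, 120.0, 240.0]},
    name='temperature',
)

# Average over latitude by name - no axis numbers to remember
lat_mean = temperature.mean('latitude')
```

The same call works unchanged however the underlying array happens to be ordered, because the reduction is tied to the dimension name rather than an axis position.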

Have more useful tips for working effectively with meteorological data? Leave a comment here or send me an email j.fallon@pgr.reading.ac.uk 🙂