Better Data… with MetaData!

James Fallon –

As researchers, we familiarise ourselves with many different datasets. Depending on who put a dataset together, the variable names and definitions we already know from one dataset may differ in another. These differences can range from subtle annoyances to large structural differences, and it’s not always immediately obvious how best to handle them.

One dataset might be on an hourly time-index, and the other daily. The grid points which tell us the geographic location of data points may be spaced at different intervals, or use entirely different co-ordinate systems!

However, most modern datasets come with hidden help in the form of metadata – this information should tell us how the data is to be used, and with the right choice of Python modules we can use the metadata to work with different datasets automatically whilst avoiding conversion headaches.

First attempt…

Starting my PhD, my favourite (naïve, inefficient, bug-prone, …) method of reading data with Python was the built-in function open() or numpy functions like genfromtxt(). These are quick to set up, and can be good enough. But as soon as we are using data with more than one field, complex coordinates and calendar indexes, or more than one dataset, this style of programming becomes unwieldy and disorderly!

>>> import numpy as np
>>> header = np.genfromtxt(fname, delimiter=',', dtype='str', max_rows=1)
>>> print(header)
['Year' 'Month' 'Day' 'Electricity_Demand']
>>> data = np.genfromtxt(fname, delimiter=',', skip_header=1)
>>> data
array([[2.010e+03, 1.000e+00, 1.000e+00, 0.000e+00],
       [2.010e+03, 1.000e+00, 2.000e+00, 0.000e+00],
       [2.010e+03, 1.000e+00, 3.000e+00, 0.000e+00],
       [2.015e+03, 1.200e+01, 2.900e+01, 5.850e+00],
       [2.015e+03, 1.200e+01, 3.000e+01, 6.090e+00],
       [2.015e+03, 1.200e+01, 3.100e+01, 6.040e+00]])

The above code reads in year, month, day data in the first 3 columns, and Electricity_Demand in the last column.

You might be familiar with such a workflow – perhaps you have refined it down to a fine art!

In many cases this is sufficient for what we need, but making use of already available metadata can make the data more readable, and easier to operate on when it comes to complicated collocation and statistics.

Enter pandas!


In the previous example, we read our data into numpy arrays. Numpy arrays are very useful, because they store data more efficiently than a regular Python list, they are easier to index, and they have many built-in operations, from simple addition to niche linear algebra techniques.

We stored column labels in an array called header, but this means our metadata has to be handled separately from our data. The dates are stored in three different columns alongside the data – but what if we want to perform an operation on just the data (for example, add 5 to every value)? It is technically possible but awkward and dangerous – if the column index changes in future, our code might break! We are probably better off splitting the dates into a separate array, but that means more work to record the column headers, and an increasing number of Python variables to keep track of.
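To make that fragility concrete, here is a minimal sketch using a small made-up array in place of the CSV file. The name DEMAND_COL is hypothetical, standing in for whatever hardcoded column position the code ends up relying on.

```python
import numpy as np

# Made-up stand-in for the file contents: Year, Month, Day, Electricity_Demand
data = np.array([
    [2015, 12, 29, 5.85],
    [2015, 12, 30, 6.09],
    [2015, 12, 31, 6.04],
])

# Hardcoded position of the demand column (hypothetical name).
# If the column order in the file ever changes, this silently points at the wrong data.
DEMAND_COL = 3

shifted = data.copy()
shifted[:, DEMAND_COL] += 5  # add 5 to the demand values only, leaving the dates alone

print(shifted[:, DEMAND_COL])
```

Nothing in the array itself records that column 3 is the demand, which is exactly the information the header array was holding separately.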

Using pandas, we can store all of this information in a single object, and using relevant datatypes:

>>> import pandas as pd
>>> data = pd.read_csv(fname, parse_dates=[['Year', 'Month', 'Day']], index_col=0)
>>> data
2010-01-01      0.00
2010-01-02      0.00
2010-01-03      0.00
2010-01-04      0.00
2010-01-05      0.00
...              ...
2015-12-27      5.70
2015-12-28      5.65
2015-12-29      5.85
2015-12-30      6.09
2015-12-31      6.04

[2191 rows x 1 columns]

This may not immediately appear a whole lot different to what we had earlier, but notice the dates are now saved in datetime format, whilst being tied to the data Electricity_Demand. If we want to index the data, we can simultaneously index the time-index without any further code (and possible mistakes leading to errors).
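Recent pandas releases deprecate the nested-list form of parse_dates, so if the call above raises a warning or error on your installation, one alternative is to build the datetime index after reading. This is a sketch under assumptions: the CSV text is made up to mirror the column names in the post, and the index name "Date" is my own choice.

```python
import io

import pandas as pd

# Made-up CSV text with the same column names as the file in the post
csv_text = """Year,Month,Day,Electricity_Demand
2015,12,29,5.85
2015,12,30,6.09
2015,12,31,6.04
"""

raw = pd.read_csv(io.StringIO(csv_text))

# pd.to_datetime can assemble dates from columns named year, month and day
dates = pd.to_datetime(raw[["Year", "Month", "Day"]].rename(columns=str.lower))

# Keep only the demand column and attach the dates as the index
data = raw[["Electricity_Demand"]].copy()
data.index = pd.DatetimeIndex(dates, name="Date")

# Slicing by date strings selects rows and keeps the dates attached
print(data.loc["2015-12-30":"2015-12-31"])
```

Slicing the rows and slicing the dates are now the same operation, so the two cannot drift out of step.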

Pandas also makes it really simple to perform some complicated operations. In this example, I am only dealing with one field (Electricity_Demand), but this works with 10, 100, 1000 or more columns!

  • Swap rows and columns (transpose) with data.T
  • Calculate quantiles with data.quantile
  • Cut to between dates, e.g. data.loc['2010-02-03':'2011-01-05']
  • Calculate 7-day rolling mean: data.rolling(7).mean()
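The operations listed above can be tried on a synthetic series. The values below are made up (a straight line from 5 to 6), and only the column name is borrowed from the post.

```python
import numpy as np
import pandas as pd

# Made-up daily series standing in for Electricity_Demand
index = pd.date_range("2010-01-01", "2011-12-31", freq="D")
data = pd.DataFrame(
    {"Electricity_Demand": np.linspace(5.0, 6.0, len(index))},
    index=index,
)

median = data.quantile(0.5)                   # one value per column
subset = data.loc["2010-02-03":"2011-01-05"]  # both end dates are included
weekly_mean = data.rolling(7).mean()          # first 6 rows are NaN (window not yet full)

print(median)
print(subset.index[0], subset.index[-1])
print(weekly_mean.head(8))
```

Note that label-based date slices include the end date, unlike ordinary Python slices, and that the rolling mean leaves the first six rows empty rather than averaging over a partial window.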

We can insert new columns, remove old ones, change the index, perform complex slices, and all the metadata stays stuck to our data!

Whilst pandas does have many maths functions built in, if need be we can also export directly to numpy using numpy.array(data['Electricity_Demand']) or data.to_numpy().

Pandas can also simplify plotting – particularly convenient when you just want to quickly visualise data without writing import matplotlib.pyplot as plt and other boilerplate code. In this example, I plot my data alongside its 7-day rolling mean:

ax = data.loc['2010'].plot(label='Demand', ylabel='Demand (GW)')
data.loc['2010'].rolling(7).mean().plot(ax=ax, label='Demand rolling mean')

Now I can visualise the anomalous values at the start of the dataset, a consistent annual trend, a diurnal cycle, and fairly consistent behaviour week to week.

Big datasets

Pandas can read from and write to many different data formats – CSV, HTML, Excel, … – but some file types that meteorologists like working with, such as netCDF4, aren’t built in.

xarray is an extremely versatile tool that can read many formats, including netCDF and GRIB. As well as having built-in functions to export to pandas, xarray is fully capable of handling metadata on its own, and many researchers work directly with xarray objects such as DataArray.

There are more xarray features than stars in the universe[citation needed], but some that I find invaluable include:

  • open_mfdataset – automatically merge multiple files (e.g. for different dates or locations)
  • assign_coords – replace one co-ordinate system with another
  • where – replace xarray values depending on a condition

Yes, you can do all of this with pandas or numpy. But with xarray you can pass metadata such as dimension names as arguments: for example, we can get the latitude average with my_data.mean('latitude'). No need to work with indexes and hardcoded values – xarray can do all the heavy lifting for you!
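As a small illustration of working by name rather than by index, the sketch below builds a synthetic DataArray and applies three of the features mentioned above. The variable name temperature, the grid and the values are all made up for the example.

```python
import numpy as np
import xarray as xr

# Made-up field on a 0-360 longitude grid
temperature = xr.DataArray(
    np.arange(24, dtype=float).reshape(2, 3, 4),
    dims=("time", "latitude", "longitude"),
    coords={
        "time": np.array(["2010-01-01", "2010-01-02"], dtype="datetime64[ns]"),
        "latitude": [50.0, 52.5, 55.0],
        "longitude": [0.0, 90.0, 180.0, 270.0],
    },
    attrs={"units": "degC"},
)

# Average over a named dimension: no axis numbers needed
lat_mean = temperature.mean("latitude")

# Keep values that satisfy a condition; the rest become NaN
warm_only = temperature.where(temperature > 10)

# Relabel longitudes from 0..360 to -180..180, then sort by the new labels
shifted = temperature.assign_coords(
    longitude=((temperature.longitude + 180) % 360) - 180
).sortby("longitude")

print(lat_mean.dims)
print(shifted.longitude.values)
```

If the dimension order in a file changes, mean("latitude") still averages over latitude, whereas a numpy call with a hardcoded axis number would quietly average over something else.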

Have more useful tips for working effectively with meteorological data? Leave a comment here or send me an email 🙂

Weather Variability and its Energy Impacts

James Fallon & Brian Lo – 

One in five people still do not have access to modern electricity supplies, and almost half the global population rely on burning wood, charcoal or animal waste for cooking and heating (Energy Progress Report). Having a reliable and affordable source of energy is crucial to human wellbeing, underpinning healthcare, education, cooking, transport and heating.

Our worldwide transition to renewable energy faces the combined challenge of connecting neglected regions and vulnerable communities to reliable power supplies, and also decarbonising all energy. An assessment on supporting the world’s 7 billion humans to live a high quality of life within planetary boundaries calculated that resource provisioning across sectors including energy must be restructured to enable basic needs to be met at a much lower level of resource use [O’Neill et al. 2018]. 

Adriaan Hilbers recently wrote for the Social Metwork about the renewable energy transition (Why renewables are difficult), and challenges and solutions for modern electricity grids under increased weather exposure. (Make sure to read that first, as it provides important background for problems associated with meso to synoptic scale variability that we won’t cover here!)

In this blog post, we highlight the role of climate and weather variability in understanding the risks future electricity networks face. 

Climate & weather variability 

Figure 1 – Stommel diagram of the Earth’s atmosphere 

A Stommel diagram [Stommel, 1963] is used to categorise climate and weather events of different temporal and spatial scales. In the original, logarithmic axes describe time period and size, and contours (coloured areas) depict the spectral intensity of variation in sea level. It allows us to identify a variety of dynamical features in the oceans that span orders of magnitude in space and time. Figure 1 is a Stommel diagram adapted to describe the variability of our atmosphere.

  • Microscale: the smallest scales, describing features generally of the order of 2 km or smaller
  • Mesoscale: the scale for describing atmospheric phenomena with horizontal scales ranging from a few to several hundred kilometres
  • Synoptic: the largest scale used to describe meteorological phenomena, typically high hundreds of kilometres to 1000 km or more

Micro Impacts on Energy 

Microscale weather processes include more predictable phenomena such as heat and moisture flux events, and unpredictable turbulence events. These generally occur at scales much smaller than the grid scale represented in numerical weather prediction models, and instead are represented through parametrisation. The most important microscale weather impacts are for isolated power grids (for example a community reliant on solar power and batteries, off-grid). Microscale weather events can also make reliable supply difficult for grids reliant on a few geographically concentrated renewable energy supplies. 

Extended Range Weather Impacts on Energy 

Across the Stommel diagram, above the synoptic scale are seasonal and intraseasonal cycles, decadal and climate variations. 

Subseasonal-to-Seasonal (S2S) forecasts are an exciting development for decision-makers across a diverse range of sectors, including agriculture, hydrology and the humanitarian sector [White et al. 2017]. In the energy sector, skilful subseasonal energy forecasts are now production-ready (S2S4E DST). Using S2S forecasts can help energy users anticipate electricity demand peaks and troughs, levels of renewable production, and their combined impacts several weeks in advance. Such forecasts will have an increasingly important role as more countries reach higher renewable energy penetration (increasing their electricity grid’s weather exposure).

Decadal Weather Cycle and Climate Impacts on Energy 

Energy system planners and operators are increasingly trying to address risks posed by climate variability, climate change, and climate uncertainty.  

Figure 2 was constructed from the record of Central England temperatures spanning the years of 1659 to 1995 and highlights the modes of variability in our atmosphere on the order of 5 to 50 years. Even without the role of climate change, constraining the boundary conditions of our weather and climate is no small task. The presence of meteorologically impactful climate variability at many different frequencies increases the workload for energy modellers, requiring many decades of climate data in order to understand the true system boundaries. 

Figure 2 – Power spectrum of Central England temperature from the mid-17th century, with variability explained by physical phenomena [Ghil and Lucarini 2020]

When making models of regional, national or continental energy networks, it is now increasingly common for energy modellers to consider several decades of climate data, instead of sampling a small selection of years. Figure 2 shows the different frequencies of climate variability – relying on only a few years of data cannot explore the extent of this variability. However, significant challenges remain in sampling long-term variability and change in models [Hilbers et al. 2019], and it is the role of weather and climate scientists to communicate the importance of addressing this.
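A toy calculation can illustrate why a short sample misses slow variability. The sketch below is not the method behind Figure 2: it uses a made-up annual series with two invented oscillations (periods of 8 and 24 years) plus noise, and a plain FFT periodogram. The record length of 336 years is an arbitrary choice that makes both periods fit a whole number of times.

```python
import numpy as np

rng = np.random.default_rng(0)
years = np.arange(336)  # arbitrary record length, one value per year

# Made-up series: two slow oscillations plus random year-to-year noise
series = (
    0.5 * np.sin(2 * np.pi * years / 8)
    + 0.5 * np.sin(2 * np.pi * years / 24)
    + 0.3 * rng.standard_normal(years.size)
)


def periodogram(x):
    """Return (frequency in cycles per year, power) for an annually sampled series."""
    x = x - x.mean()
    power = np.abs(np.fft.rfft(x)) ** 2 / x.size
    freq = np.fft.rfftfreq(x.size, d=1.0)
    return freq[1:], power[1:]  # drop the zero-frequency term


# With the full record, the two strongest peaks sit at the two invented periods
freq, power = periodogram(series)
top_two_periods = np.sort(1 / freq[np.argsort(power)[-2:]])
print(top_two_periods)

# With a 10-year sample, the longest period the periodogram can represent is 10 years
short_freq, _ = periodogram(series[:10])
longest_short_period = 1 / short_freq.min()
print(longest_short_period)
```

The 24-year oscillation is simply absent from the frequencies available to the 10-year sample, which is the same limitation an energy model faces when it is driven by only a handful of years.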

Important contributions to uncertainty in energy system planning don’t just come from weather and climate. Variability in future energy systems will depend on technological, socioeconomic and political outcomes. Predictions of which future technologies and approaches will be most sustainable and economical are not always clear cut and easy to anticipate. A virtual workshop hosted by Reading’s energy-met group last summer [Bloomfield et al. 2020] facilitated discussions between energy and climate researchers. The workshop identified the need to better understand how contributions of all these different uncertainties propagate through complex modelling chains. 

An Energy-Meteorologist’s Journey through Time and Space 

Research is underway into tackling the uncertainties and understanding energy risks and impacts across the spectra of spatial and temporal scales. But understanding energy systems, and planning successfully for the future, requires decision-making that involves a broad (and perhaps not fully identified) group of technological and other factors, as well as weather and climate impacts. It is not enough to consider any one of these alone! It is vital that experts across different fields collaborate towards what will be best for our future energy grids.

Tracking SDG7 – The Energy Progress Report 

Why renewables are difficult – Adriaan Hilbers Social Metwork 2021

O’Neill, D.W., Fanning, A.L., Lamb, W.F. et al. A good life for all within planetary boundaries 

Stommel, H., 1963. Varieties of oceanographic experience. Science, 139(3555), pp.572-576.

White et al (2017) Potential applications of subseasonal‐to‐seasonal (S2S) predictions 

M Ghil, V Lucarini (2020) The physics of climate variability and climate change

AP Hilbers, DJ Brayshaw, A Gandy (2019) Importance subsampling: improving power system planning under climate-based uncertainty 

Bloomfield, H. et al. (2020) The importance of weather and climate to energy systems: a workshop on next generation challenges in energy-climate modelling 

The Social Metwork in 2020

James Fallon –
Brian Lo – 

Hello dear readers! Reviewing submissions and discovering the fascinating research that takes place in Reading Meteorology has been an amazing experience, and a personal highlight of the year!

Thank you to everyone who has contributed to the social metwork this year, and especially to those who have been patient whilst Brian and I have been getting used to our new roles as co-editors. The quality of submissions has been very high, but don’t let that deter you if you haven’t written for the blog before! Writing for the social metwork is not as tricky as you might think – we promise!

At the time of writing, the blog has had over 5550 visitors, and is on track for an all-time high by the end of the year. We hope that the social metwork has contributed to lifting spirits and continuing the met department social atmosphere throughout the year. In case you missed any posts, or want a second look at some, here is a list of all the posts from this year:

North American weather regimes and the stratospheric polar vortex – Simon Lee
Evaluating ocean eddies in coupled climate simulations on a global scale – Sophia Moreton
The (real) butterfly effect: the impact of resolving the mesoscale range – Tsz Yan Leung

Life on Industrial Placement – Holly Turner
An inter-comparison of Arctic synoptic scale storms between four global reanalysis datasets – Alec Vessey
A new, explicit thunderstorm electrification scheme for the Met Office Unified Model – Ben Courtier

Relationships in errors between meteorological forecasts and air quality forecasts – Kaja Milczewska
Tips for working from home as a PhD student – Simon Lee

Air pollution and COVID-19: is ozone an undercover criminal? – Kaja Milczewska
The philosophy of climate science – Mark Prosser
Explaining complicated things with simple words: Simple writer challenge – Linda Toča

Methane’s Shortwave Radiative Forcing – Rachael Byrom

How do ocean and atmospheric heat transports affect sea-ice extent? – Jake Aylmer

A Journey through Hot British Summers – Simon Lee
Exploring the impact of variable floe size on the Arctic sea ice – Adam Bateson

How Important are Post-Tropical Cyclones to European Windstorm Risk? – Elliott Sainsbury
The Scandinavia-Greenland Pattern: something to look out for this winter – Simon Lee

My journey to Reading: Going from application to newly minted SCENARIO PhD student – George Gunn
The visual complexity of coronal mass ejections follows the solar cycle – Shannon Jones
Organising a virtual conference – Gwyneth Matthews
Visiting Scientist Week Preview: Laure Zanna – Kaja Milczewska

Demonstrating as a PhD student in unprecedented times – Brian Lo
ECMWF/EUMETSAT NWP SAF Workshop on the treatment of random and systematic errors in satellite data assimilation for NWP – Devon Francis
Extra conference funding: how to apply and where to look – Shannon Jones
Youth voices pick up the slack: MOCK COP 26 – James Fallon

Enjoy the panto, have a very merry Christmas, and here’s to 2021!
From your metwork co-editors James & Brian!

Youth voices pick up the slack: MOCK COP 26

James Fallon –

This year’s Conference of the Parties (COP) should have taken place earlier in November, hosted by the UK in Glasgow and in partnership with Italy. Despite many global events successfully moving online this year, from film festivals to large conferences such as the EGU general assembly, the international climate talks were postponed until November 2021.

But young people around the world are more engaged than ever before with the urgent need for international cooperation in the face of the climate emergency. The Fridays for Future (FFF) movement has recorded participation since late 2018 of more than 13,000,000 young people, in 7500 cities from all continents. FFF has adapted to the COVID-19 crisis, and on 25th September this year participants from over 150 countries took part both online and in the streets, highlighting the Most Affected People and Areas (MAPA).

Unimpressed by the delay of important climate talks and negotiations, students and youth activists from FFF and a multitude of groups and movements have initiated the MOCK COP26, a 2-week online global conference on climate change that mirrors the real COP.

“My country, the Philippines, is struggling. We don’t want more floods that rise up to 15 feet, winds that peel off roofs in seconds, the rain that drowns our pets and livestock, and storm surges that ravage coastal communities. We don’t want more people to die. We’re still a developing country that contributes so little to global carbon emissions yet we face the worst of its consequences. This is absurd!”

Angelo, Philippines


Organisers have chosen five themes to focus on:

  1. Climate education
  2. Climate justice
  3. Climate resilient livelihoods
  4. Health and wellbeing
  5. Nationally Determined Contributions

Full programme here:

Over a dozen academic support videos break down complicated topics such as “The Kyoto Protocol”, “Agriculture and Agribusiness”, and the “History of Climate Negotiation”. These videos are helping youth delegates and all participants to understand what happens at a COP summit.

Panel sessions have featured United Nations Youth Envoy Jayathma Wickramanayake, 9-year-old Climate & Environmental Activist Licypriya Kangujam, and (actual) COP26 president Alok Sharma.

High Level Country Statements

A unique aspect of MOCK COP that I have been excitedly anticipating is the high level country statements: each is a 3-minute speech given by youth climate activists representing their nation.

“Mock COP26 is not dominated by big polluters as COP26 is. We believe that we need to amplify the people on the frontlines of climate change, which is why we will be aiming to, throughout Mock COP, uplift the voices of those from MAPA (Most Affected People and Areas) countries above those from the Global North. This is why Mock COP26 is special.”

Jamie Burrell, UK

Youth delegates have been encouraged to give speeches in whichever language they are most comfortable talking. At the time of writing, subtitles don’t appear to be fully functioning. However a large number of talks are given in English, and transcripts of all talks have been made available here:

I highly recommend setting some time aside to give these speeches a listen. Although the total number might put you off, it is very easy to jump in and out of talks. You can find videos embedded below, or on the official YouTube channel.


Pick: Two youth delegates represent Morocco. Whilst Morocco has been ranked a role model for climate action, the reality of the country’s future is alarming. Globally the most affected are the least protected. It’s time for world leaders to protect everyone.


Pick: The delegate for Suriname explains risks faced as a Small Island Developing State (SIDS) with infrastructure near the coast. Suriname must implement climate adaptation whilst enhancing its legislation in forestry, mining, and agriculture.


Pick: Indonesia’s delegate opens with the stark warning that the country will lose 1500 of its islands due to rising sea levels by 2050. The high level statement includes calls to incorporate climate education into the national curriculum, and find ways to protect natural habitat. Indonesia has the 2nd biggest rainforest in the world, but currently has no agreed emissions reductions pathway.


Pick: Ireland’s youth delegates present a necessarily progressive 5 year plan to stick to the EU target of reducing emissions by at least 65% by 2030. The need for much stronger climate education, and providing access to affordable and sustainable energy, are among many other commitments.


Pick: The year started with forest fires devastating large swathes of Australia’s natural habitats. Youth delegates want their nation to lead the world as a renewable energy exporter, and an overhaul of media rules to foster new diverse media outlets and prevent monopolies that currently stall climate action.

What is the hoped-for outcome?

With so many connected issues relating to the climate and ecological emergency, previous COPs have often seen negotiations stall and agreements postponed. The complexity of tackling this crisis is compounded by the vested interests of powerful governments and coal, oil, and gas profiteers.

But youth messages can be heard loud and clear at MOCK COP 26, reflecting the 5 themes of the conference.

We demand concrete action, not mere promises. It’s time for our leaders to wake up, prioritize the realization of the Green Deal, and cut carbon emissions. 

We won’t have more time to alter the effects of the climate crisis if we let this opportunity pass. The clock is ticking. The time for action is NOW. 

In the wake of COVID-19 induced economic shocks, policy makers must ensure a genuine green recovery that engages with ideas of global climate justice.

Youth delegate panels will continue over the weekend, working towards the creation of a final statement outlining their demands for world leaders. This will be presented to the High Level Climate Action Champion for COP26, Nigel Topping, at the closing ceremony (12:00 GMT Tuesday 1st December).