## Dialogue Concerning the Obvious and Obscurity of Scientific Programming in Python: A Lost Script

Disclaimer: The characters and events depicted in this blog post are entirely fictitious. Any similarities to names, incidents or source code are entirely coincidental.

Antonio (Professor): Welcome to the new Masters module and PhD training course, MTMX101: Introduction to your worst fears of scientific programming in Python. In this course, you will be put to the test: to spot and resolve the most irritating ‘bugs’ that will itch, sorry, gl-itch your software and intended data analysis.

Berenice (A PhD student): Before we start, would you have any tips on how to do well for this module?

Antonio: Always read the code documentation. Otherwise, there’s no reason for code developers to host their documentation on sites like rtfd.org (Read the Docs).

Cecilio (An MSc student): But… didn’t we already do an introductory course on scientific computing last term? Why this compulsory module?

Antonio: The usual expectation is for you to have completed last term’s introductory computing module, but you may also find that this course completely changes what you used to consider your “best practices”… In other words, it’s a bad thing that you took that module last term, but also a good thing that you took that module last term. There may be moments where you think, “Why wasn’t I taught that?” I guess you’ll all understand soon enough! As for the logistics of the course, you will be assessed in the form of quiz questions such as the following:

Example #0: The Deviations in Standard Deviation

Will the following print statements produce the same numeric values?

import numpy as np
import pandas as pd

p = pd.Series(np.array([1,1,2,2,3,3,4,4]))
n = np.array([1,1,2,2,3,3,4,4])

print(p.std())
print(n.std())
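For readers playing along at home, here is a sketch of why the two values differ: pandas and numpy disagree on the default `ddof` (the “delta degrees of freedom” used in the divisor), so identical data produce different standard deviations.

```python
import numpy as np
import pandas as pd

data = [1, 1, 2, 2, 3, 3, 4, 4]
p = pd.Series(np.array(data))
n = np.array(data)

# pandas defaults to the sample standard deviation (ddof=1, divisor N-1),
# numpy to the population standard deviation (ddof=0, divisor N),
# which is why the two quiz prints disagree.
print(p.std())   # sample standard deviation
print(n.std())   # population standard deviation

# Asking both for the same ddof reconciles them:
print(float(p.std(ddof=0)), float(n.std(ddof=0)))
```

Neither default is wrong; they simply answer different statistical questions, so it pays to pass `ddof` explicitly when mixing the two libraries.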

Example #1: The Sum of All Fears

Antonio: As we all know, numpy is an important tool in many calculations and analyses of meteorological data. Summing and averaging are common operations. Let’s import numpy as np and consider the following line of code. Can anyone tell me what it does?

>>> hard_sum = np.sum(np.arange(1,100001))

Cecilio: Easy! This was taught in the introductory course… Doesn’t this line of code sum all integers from 1 to 100 000?

Antonio: Good. Without using your calculators, what is the expected value of hard_sum?

Berenice: Wouldn’t it just be 5 000 050 000?

Antonio: Right! Just as quick as Gauss. Let’s now try it on Python. Tell me what you get.

Cecilio: Why am I getting this?

>>> hard_sum = np.sum(np.arange(1,100001))
>>> print(hard_sum)
705082704


Berenice: But I’m getting the right answer instead with the same code! Would it be because I’m using a Mac computer and my MSc course mate is using a Windows system?

>>> hard_sum = np.sum(np.arange(1,100001))
>>> print(hard_sum)
5000050000

Antonio: Well, did any of you get a RuntimeError, ValueError or warning from Python, despite the bug? No? Welcome to MTMX101!

Berenice: I recall learning something about the computer’s representation of real numbers in one of my other modules. Would this be the problem?

Antonio: Yes, I like your thinking! But that still doesn’t explain why you both got different values in Python. Any deductions…? At this point, I would usually set this as a homework assignment, but since it’s your first MTMX101 lecture, here is the explanation from Section 2.2.5 of your notes.

If we consider the case of a 4-bit unsigned integer and start counting from 0, the maximum number we can possibly represent is 15. Adding 1 to 15 leads to 0 being represented, as shown in Figure 1. This is called integer overflow, just like how an old car’s analogue 4-digit odometer “resets” to zero after recording 9999 km. As for the problem of running the same code and getting different results on Windows and Mac machines: a numpy integer array on Windows defaults to a 32-bit integer, whereas it defaults to 64-bit on Mac/Linux, and, as expected, the 64-bit integer has more bits and can thus represent our expected value of 5000050000. So, how do we mitigate this problem when writing future code? Simply specify the dtype argument and force Python to use 64-bit integers when needed.

>>> hard_sum = np.sum(np.arange(1,100001), dtype=np.int64)
>>> print(type(hard_sum))
<class 'numpy.int64'>
>>> print(hard_sum)
5000050000

As to why we got the spurious value of 705082704 from using 32-bit integers, I will leave it to you to understand it from the second edition of my book, Python Puzzlers!
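If you would like a head start on that homework, the spurious value can be reproduced in one line of plain Python: a 32-bit integer keeps only the low 32 bits of a result, i.e. the value modulo 2**32.

```python
# The true sum, 5000050000, does not fit in 32 bits; an overflowing
# 32-bit integer wraps around, keeping the value modulo 2**32:
true_sum = 5000050000
wrapped = true_sum % 2**32
print(wrapped)  # 705082704, which is below 2**31 and so reads the same
                # when reinterpreted as a *signed* 32-bit integer
```

The wrapped value lands below 2**31, so the signed and unsigned interpretations coincide; had it landed above, the printed result would have appeared negative instead.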

Figure 1: Illustration of overflow in a 4-bit unsigned integer

Example #2: An Important Pointer for Copying Things

Antonio: On to another simple example, numpy arrays! Consider the following two-dimensional array of temperatures in degree Celsius.

>>> t_degrees_original = np.array([[2,1,0,1], [-1,0,-1,-1], [-3,-5,-2,-3], [-5,-7,-6,-7]])

Antonio: Let’s say we only want the first three rows of data, and in this selection would like to set all values on the zeroth row to zero, while retaining the values in the original array. Any ideas how we could do that?

Cecilio: Hey! I learnt this last term: we do array slicing.

>>> t_degrees_slice = t_degrees_original[0:3,:]
>>> t_degrees_slice[0,:] = 0

Antonio: I did say to retain the values in the original array…

>>> print(t_degrees_original)
[[ 0  0  0  0]
[-1  0 -1 -1]
[-3 -5 -2 -3]
[-5 -7 -6 -7]]

Cecilio: Oh oops.

Berenice: Let me suggest a better solution.

>>> t_degrees_original = np.array([[2,1,0,1], [-1,0,-1,-1], [-3,-5,-2,-3], [-5,-7,-6,-7]])
>>> t_degrees_slice = t_degrees_original[[0,1,2],:]
>>> t_degrees_slice[0,:] = 0
>>> print(t_degrees_original)
[[ 2  1  0  1]
[-1  0 -1 -1]
[-3 -5 -2 -3]
[-5 -7 -6 -7]]

Antonio: Well done!

Cecilio: What? I thought the array indices 0:3 and 0,1,2 would give you the same slice of the numpy array.

Antonio: Let’s clarify this. The former method of using 0:3 is standard indexing and only copies the information on where the original array is stored (i.e. a “view” of the original array, or “shallow copy”), while the latter 0,1,2 is fancy indexing and actually makes a new, separate array with the corresponding values from the original array (i.e. a “deep copy”). This is illustrated in Figure 2, showing variables and their respective pointers for both shallow and deep copying. As you now understand, numpy is really not as easy as pie…

Figure 2: Simplified diagram showing differences in variable pointers and computer memory for shallow and deep copying
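When in doubt, numpy can tell you directly whether two arrays share memory; a small check along these lines (reusing the temperature array from above) distinguishes a view from a copy:

```python
import numpy as np

t = np.array([[2, 1, 0, 1], [-1, 0, -1, -1],
              [-3, -5, -2, -3], [-5, -7, -6, -7]])

view = t[0:3, :]         # basic slicing: a view onto t's memory
fancy = t[[0, 1, 2], :]  # fancy indexing: a brand-new array

print(np.shares_memory(t, view))   # True: writing to view writes to t
print(np.shares_memory(t, fancy))  # False: fancy is safe to modify
```

Running `np.shares_memory` before modifying a slice is a cheap way to avoid the trap Cecilio just fell into.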

Cecilio: That was… deep.

Berenice: Is there a better way to deep copy numpy arrays rather than having to type in each index like I did e.g. [0,1,2]?

Antonio: There is definitely a better way! If we replace the first line of your code with the line below, you should be able to do a deep copy of the original array. Editing the copied array will not affect the original array.

>>> t_degrees_slice = np.copy(t_degrees_original[0:3,:])

I would suggest this method of np.copy to be your one– and preferably only one –obvious way to do a deep copy of a numpy array, since it’s the most intuitive and human-readable! But remember, deep copy only if you have to, since deep copying a whole array of values takes computation time and space! It’s now time for a 5-minute break.

Cecilio: More like time for me to eat some humble (num)py.

Consider the following Python code:

short_a = "galileo galilei"
short_b = "galileo galilei"
long_a = "galileo galilei " + "linceo"
long_b = "galileo galilei " + "linceo"

print(short_a == short_b)
print(short_a is short_b)
print(long_a == long_b)
print(long_a is long_b)


Which is the correct sequence of booleans that will be printed out?
1. True, True, True, True
2. True, False, True, False
3. True, True, True, False

Antonio: In fact, they are all correct answers. It depends on whether you are running Python 3.6.0 or Python 3.8.5, and whether you ran the code in a script or in the console! Although there is much more to learn about “string interning”, the quick lesson here is to always compare the value of strings using double equal signs (==) instead of using is.
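For the curious, the identity comparison can be made deterministic by interning the strings yourself; a small sketch using `sys.intern` (the value comparison, of course, needs no such help):

```python
import sys

long_a = "galileo galilei " + "linceo"
long_b = "galileo galilei " + "linceo"

print(long_a == long_b)  # True: value comparison is always safe

# Whether long_a is long_b depends on the Python version and on
# script-vs-console context. sys.intern sidesteps that by returning
# one canonical object per distinct string value:
a = sys.intern(long_a)
b = sys.intern(long_b)
print(a is b)            # True: both names point to the interned object
```

In practice you should rarely need `sys.intern`; the point is that `is` asks about object identity, which the interpreter is free to arrange however it likes, while `==` asks the question you almost always mean.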

Example #3: Array manipulation – A Sleight of Hand?

Antonio: Let’s say you are asked to calculate the centered difference of some quantity (e.g. temperature) in one dimension, $\frac{\partial T}{\partial x}$, with grid points uniformly separated by $\Delta x$ of 1 metre. What is some code that we could use to do this?

Berenice: I remember this from one of the modelling courses. We could use a for loop to calculate most elements of $\frac{\partial T}{\partial x} \approx \frac{T_{i+1} - T_{i-1}}{2\Delta x}$ then deal with the boundary conditions. The code may look something like this:

delta_x = 1.0
temp_x = np.random.rand(1000)
dtemp_dx = np.empty_like(temp_x)
for i in range(1, len(temp_x)-1):
    dtemp_dx[i] = (temp_x[i+1] - temp_x[i-1]) / (2*delta_x)

# Boundary conditions
dtemp_dx[0] = dtemp_dx[1]
dtemp_dx[-1] = dtemp_dx[-2]

Antonio: Right! How about we replace your for loop with this line?

dtemp_dx[1:-1] = (temp_x[2:] - temp_x[0:-2]) / (2*delta_x)

Cecilio: Don’t they just do the same thing?

Antonio: Yes, but would you like to have a guess which one might be the “better” way?

Berenice: In last term’s modules, we were only taught the method I proposed just now. I would have thought both methods were equally good.

Antonio: On my computer, running your version of code 10000 times takes 6.5 seconds, whereas running my version 10000 times takes 0.1 seconds.

Cecilio: That was… fast!

Berenice: Only if my research code could run with that kind of speed…

Antonio: And that is what we call vectorisation, the act of taking advantage of numpy’s optimised loop implementation instead of having to write your own!
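As a sanity check that vectorisation is a pure speed-up and not a change of answer, the two versions can be compared directly; `np.allclose` confirms they agree to floating-point tolerance.

```python
import numpy as np

delta_x = 1.0
temp_x = np.random.rand(1000)

# Loop version, as Berenice wrote it
dtemp_loop = np.empty_like(temp_x)
for i in range(1, len(temp_x) - 1):
    dtemp_loop[i] = (temp_x[i+1] - temp_x[i-1]) / (2*delta_x)
dtemp_loop[0] = dtemp_loop[1]
dtemp_loop[-1] = dtemp_loop[-2]

# Vectorised version: one slice expression replaces the whole loop
dtemp_vec = np.empty_like(temp_x)
dtemp_vec[1:-1] = (temp_x[2:] - temp_x[:-2]) / (2*delta_x)
dtemp_vec[0] = dtemp_vec[1]
dtemp_vec[-1] = dtemp_vec[-2]

print(np.allclose(dtemp_loop, dtemp_vec))  # True
```

(Worth knowing, too: `np.gradient(temp_x, delta_x)` computes centred differences in the interior as well, though it uses one-sided differences at the boundaries rather than the copy-the-neighbour condition above.)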

Cecilio: I wish we knew all this earlier on! Can you tell us more?

Antonio: Glad your interest is piqued! Anyway, that’s all the time we have today. For this week’s homework, please familiarise yourself so you know how to

import this

module or import that package. In the next few lectures, we will look at more bewildering behaviours such as the “Mesmerising Mutation” and the “Out of Scope” problem that is, nonetheless, in the scope of this module. As we move to more advanced aspects of this course, we may even come across the “Dynamic Duck” and the “Mischievous Monkey”. Bye for now!

## The Social Metwork in 2020

Hello dear readers! Reviewing submissions and discovering the fascinating research that takes place in Reading Meteorology has been an amazing experience, and a personal highlight of the year!

Thank you to everyone who has contributed to the social metwork this year, and especially to those who have been patient whilst myself and Brian have been getting used to our new roles as co-editors. The quality of submissions has been very high, but don’t let that deter you if you haven’t written for the blog before! Writing for the social metwork is not as tricky as you might think – we promise!

At the time of writing, the blog has had over 5550 visitors, and is on track for an all-time high by the end of the year. We hope that the social metwork has contributed to lifting spirits and continuing the met department social atmosphere throughout the year. In case you missed any posts, or want a second look at some, here is a list of all the posts from this year:

March
Relationships in errors between meteorological forecasts and air quality forecasts – Kaja Milczewska
Tips for working from home as a PhD student – Simon Lee

May
Air pollution and COVID-19: is ozone an undercover criminal? – Kaja Milczewska
The philosophy of climate science – Mark Prosser
Explaining complicated things with simple words: Simple writer challenge – Linda Toča

June
Methane’s Shortwave Radiative Forcing – Rachael Byrom

July
How do ocean and atmospheric heat transports affect sea-ice extent? – Jake Aylmer

August
A Journey through Hot British Summers – Simon Lee
Exploring the impact of variable floe size on the Arctic sea ice – Adam Bateson

September
How Important are Post-Tropical Cyclones to European Windstorm Risk? – Elliott Sainsbury
The Scandinavia-Greenland Pattern: something to look out for this winter – Simon Lee

October
My journey to Reading: Going from application to newly minted SCENARIO PhD student – George Gunn
The visual complexity of coronal mass ejections follows the solar cycle – Shannon Jones
Organising a virtual conference – Gwyneth Matthews
Visiting Scientist Week Preview: Laure Zanna – Kaja Milczewska

Enjoy the panto, have a very merry Christmas, and here’s to 2021!
From your metwork co-editors James & Brian!

## Sea Ice-Ocean Feedbacks in the Antarctic Shelf Seas

Over the past forty years a small increasing trend in Antarctic sea ice extent has been observed. This is poorly understood, and currently not captured by global climate models which typically simulate a net decrease in Antarctic sea ice extent (Turner et al. 2013). The length of our observational time series in combination with our lack of confidence in global climate model results makes it difficult to assess whether the recent decline of Antarctic sea ice observed in 2016 and 2017 is the start of a new declining trend or just part of natural variability.

The net increase in Antarctic sea ice extent is the sum of stronger, but opposing, regional and highly seasonal trends as shown in Figure 2 (Holland, 2014). The trends grow throughout the spring resulting in the maximum trends in the summer, decaying away throughout the autumn to give negligible trends in the winter. This seasonality implies the role of feedbacks in modulating the observed trends.

We have used a highly simplified coupled sea ice and mixed layer model (a schematic is shown in Figure 3) as a tool to help quantify and compare the importance of different feedbacks in two contrasting regions of the Southern Ocean: the Amundsen Sea, which has warm shelf waters, relatively warm atmospheric conditions with a high snowfall rate, and a diminishing sea ice cover; and the Weddell Sea, which has cold, saline shelf waters, cold and dry atmospheric conditions, and an expanding sea ice cover.

We have carried out simulations where we denied different feedbacks in combination with perturbing the surface air temperatures, and compared the results with simulations where the feedback is enabled and can respond to the surface air temperature perturbation. We found that in the Weddell Sea the feedback responses were generally smaller than the response of the ice cover to the surface air temperature. However, in the Amundsen Sea we found that the ice cover was very sensitive to the depth of the ocean mixed layer, which determines the size of the ocean heat flux under the ice. Whenever the atmosphere warmed, we found that the ocean heat flux to the ice decreased (due to a shallower mixed layer), and this acted against the atmospheric changes, buffering changes in the ice volume.

Using a simple model has made it easier to understand the different processes at play in the two regions. However, to better understand how these feedbacks link back to the regional trends, we will also need to consider spatial variability, which may act to change the importance of some of the feedbacks. Incorporating what we have learnt using the 1D model, we are now working on investigating some of the same processes using the CICE sea ice model, to explore the importance and impact of spatial variability on the feedbacks.

References

Turner et al. (2013), An initial assessment of Antarctic sea ice extent in the CMIP5 models, J. Climate, 26, 1473-1484, doi:10.1175/JCLI-D-12-00068.1

Holland, P. R. (2014), The seasonality of Antarctic sea ice trends, Geophys. Res. Lett., 41, 4230–4237, doi:10.1002/2014GL060172.

Petty et al. (2013), Impact of Atmospheric Forcing on Antarctic Continental Shelf Waters, J. Phys. Ocean., 43, 920-940, doi: 10.1175/JPO-D-12-0172.1

## Atmospheric blocking: why is it so hard to predict?

Atmospheric blocks are nearly stationary large-scale flow features that effectively block the prevailing westerly winds and redirect mobile cyclones. They are typically characterised by a synoptic-scale, quasi-stationary high pressure system in the midlatitudes that can remain over a region for several weeks. Blocking events can cause extreme weather: heat waves in summer and cold spells in winter, and the impacts associated with these events can escalate due to a block’s persistence. Because of this, it is important that we can forecast blocking accurately. However, atmospheric blocking has been shown to be the cause of some of the poorest forecasts in recent years. Looking at all occasions when the ECMWF model experienced a period of very low forecast skill, Rodwell et al. (2013) found that the average flow pattern for which these forecasts verified was an easily-distinguishable atmospheric blocking pattern (Figure 1). But why are blocks so hard to forecast?

There are several reasons why forecasting blocking is a challenge. Firstly, there is no universally accepted definition of what constitutes a block. Several different flow configurations that could be referred to as blocks are shown in Figure 2. The variety in flow patterns used to define blocking brings with it a variety of mechanisms that are dynamically important for blocks developing in a forecast (Woollings et al. 2018), so many phenomena must be well represented in a model for it to forecast all blocking events accurately. Secondly, there is no complete dynamical theory for block onset and maintenance: we do not know if a process key to blocking dynamics is missing from the equation set solved by numerical weather prediction models and is contributing to the forecast error. Finally, many of the known mechanisms associated with block onset and maintenance are also known sources of model uncertainty. For example, diabatic processes within extratropical cyclones have been shown to contribute substantially to blocking events (Pfahl et al. 2015), the parameterisation of which has been shown to affect medium-range forecasts of ridge building events (Martínez-Alvarado et al. 2015).

We do, however, know some ways to improve the representation of blocking: increase the horizontal resolution of the model (Schiemann et al. 2017); improve the parameterisation of subgrid physical processes (Jung et al. 2010); remove underlying model biases (Scaife et al. 2010); and, in my PhD, we found that improvements to a model’s dynamical core (the part of the model used to solve the governing equations) can also improve the medium-range forecast of blocking. In Figure 3, the frequency of blocking that occurred during two northern hemisphere winters is shown for the ERA-Interim reanalysis and three operational weather forecast centres (ECMWF, the Met Office (UKMO) and the Korea Meteorological Administration (KMA)). Both KMA and UKMO use the Met Office Unified Model; however, before the winter of 2014/15 the UKMO updated the model to use a new dynamical core whilst KMA continued to use the original. This means that for the 2013/14 winter the UKMO and KMA forecasts are from the same model with the same dynamical core, whilst for the 2014/15 winter they are from the same model but with different dynamical cores. The clear improvement in the UKMO forecast in 2014/15 can hence be attributed to the new dynamical core. For a full analysis of this improvement see Martínez-Alvarado et al. (2018).

In the remainder of my PhD I aim to investigate the link between errors in forecasts of blocking with the representation of upstream cyclones. I am particularly interested to see if the parameterisation of diabatic processes (a known source of model uncertainty) could be causing the downstream error in Rossby wave amplification and blocking.

References:

Rodwell, M. J., and Coauthors, 2013: Characteristics of occasional poor medium-range weather forecasts for Europe. Bulletin of the American Meteorological Society, 94 (9), 1393–1405.

Woollings, T., and Coauthors, 2018: Blocking and its response to climate change. Current Climate Change Reports, 4 (3), 287–300.

Pfahl, S., C. Schwierz, M. Croci-Maspoli, C. Grams, and H. Wernli, 2015: Importance of latent  heat release in ascending air streams for atmospheric blocking. Nature Geoscience, 8 (8), 610– 614.

Martínez-Alvarado, O., E. Madonna, S. Gray, and H. Joos, 2015: A route to systematic error in forecasts of Rossby waves. Quart. J. Roy. Meteor. Soc., 142, 196–210.

Martínez-Alvarado, O., and R. Plant, 2014: Parametrized diabatic processes in numerical simulations of an extratropical cyclone. Quart. J. Roy. Meteor. Soc., 140 (682), 1742–1755.

Scaife, A. A., T. Woollings, J. Knight, G. Martin, and T. Hinton, 2010: Atmospheric blocking and mean biases in climate models. Journal of Climate, 23 (23), 6143–6152.

Schiemann, R., and Coauthors, 2017: The resolution sensitivity of northern hemisphere blocking in four 25-km atmospheric global circulation models. Journal of Climate, 30 (1), 337–358.

Jung, T., and Coauthors, 2010: The ECMWF model climate: Recent progress through improved physical parametrizations. Quart. J. Roy. Meteor. Soc., 136 (650), 1145–1160.

## International Conferences on Subseasonal to Decadal Prediction

I was recently fortunate enough to attend the International Conferences on Subseasonal to Decadal Prediction in Boulder, Colorado. This was a week-long event organised by the World Climate Research Programme (WCRP) and was a joint meeting with two conferences taking place simultaneously: the Second International Conference on Subseasonal to Seasonal Prediction (S2S) and the Second International Conference on Seasonal to Decadal Prediction (S2D). There were also joint sessions addressing common issues surrounding prediction on these timescales.

Weather and climate variations on subseasonal to seasonal (from around 2 weeks to a season) to decadal timescales can have enormous social, economic, and environmental impacts, making skillful predictions on these timescales a valuable tool for policymakers. As a result, there is an increasingly large interest within the scientific and operational forecasting communities in developing forecasts to improve our ability to predict severe weather events. On S2S timescales, these include high-impact meteorological events such as tropical cyclones, floods, droughts, and heat and cold waves. On S2D timescales, while the focus broadly remains on similar events (such as precipitation and surface temperatures), deciphering the roles of internal and externally-forced variability in forecasts also becomes important.

The conferences were attended by nearly 350 people, of which 92 were Early Career Scientists (either current PhD students or those who completed their PhD within the last 5-7 years), from 38 different countries. There were both oral and poster presentations on a wide variety of topics, including mechanisms of S2S and S2D predictability (e.g. the stratosphere and tropical-extratropical teleconnections) and current modelling issues in S2S and S2D prediction. I was fortunate to be able to give an oral presentation about some of my recently published work, in which we examine the performance of the ECMWF seasonal forecast model at representing a teleconnection mechanism which links Indian monsoon precipitation to weather and climate variations across the Northern Hemisphere. After my talk I spoke to several other people who are working on similar topics, which was very beneficial and helped give me ideas for analysis that I could carry out as part of my own research.

One of the best things about attending an international conference is the networking opportunities that it presents, both with people you already know and with potential future collaborators from other institutions. This conference was no exception, and as well as lunch and coffee breaks there was an Early Career Scientists evening meal. This gave me a chance to meet scientists from all over the world who are at a similar stage of their career to myself.

Boulder is located at the foot of the Rocky Mountains, so after the conference I took the opportunity to do some hiking on a few of the many trails that lead out from the city. I also took a trip up to NCAR’s Mesa Lab, which is located up the hillside away from the city and has spectacular views across Boulder and the high plains of Colorado, as well as a visitor centre with meteorological exhibits. It was a great experience to attend this conference and I am very grateful to NERC and the SummerTIME project for funding my travel and accommodation.

## Advice for students starting their PhD

The Meteorology Department at Reading has just welcomed its new cohort of PhD students, so we gathered some pearls of wisdom for the years ahead:

“Start good habits from the beginning; decide how you will make notes on papers, and how you will save papers, and where you will write notes, and how you will save files. Create a spreadsheet of where your code is, and what it does, and what figure it creates. It will save you so much time.”

“Write down everything you do in note form; this helps you a) track back your progress if you take time out from research and b) makes writing your thesis a LOT easier…”

“Pick your supervisor carefully. Don’t kid yourself that they will be different as a PhD supervisor; look for someone understanding and supportive.”

“Expect the work to progress slowly at first, things will not all work out simply.”

“Don’t give up! And don’t be afraid to ask for help from other PhDs or other staff members (in addition to your supervisors).”

“Don’t compare yourself to other PhDs, and make sure to take some time off, you’re allowed a holiday!”

“Ask for help all the time.”

“Keep a diary of the work you do each day so you remember exactly what you’ve done 6 months later.”

“Don’t worry if your supervisors/people in years above seem to know everything, or can do things really easily. There hasn’t been an administrative cock-up, you’re not an impostor: everyone’s been there. Also, get into a good routine. It really helps.”

“Talk to your supervisor about what both of you expect and decide how often you want to meet at the beginning. This will make things easier.”

“Don’t compare with other students. A PhD is an individual project with its own aims and targets. Everyone will get to completion on their own journey.”

“You’ll be lost but achieving something. You can’t see it yet.”

## The Many Speak Of Computer

Knowing multiple languages can be hard. As any polyglot will tell you, there are many difficulties that can come from mixing and matching languages; losing vocabulary in both, only being able to think and speak in one at a time, having to remember to apply the correct spelling and pronunciation conventions in the correct contexts.

Humans aren’t the only ones who experience these types of multiple-language issues, however. Computers can also suffer from linguistic problems pertaining to the “programming languages” humans use to communicate with them, as well as the more hidden, arcane languages they use to speak to one another. This can cause untold frustration to their users. Dealing with seemingly arbitrary computing issues while doing science, we humans, especially if we aren’t computing experts, can get stuck in a mire with no conceivable way out.

Problems with programming languages are the easiest problems of this nature to solve. Often the mistake is the human in question lacking the necessary vocabulary, or syntax, and the problem can be solved with a quick peruse of google or stack exchange to find someone with a solution. Humans are much better at communicating and expressing ideas in native human languages than computational ones. They often encounter the same problems as one another and describe them in similar ways. It is not uncommon to overhear a programmer lamenting: “But I know what I mean!” So looking for another human to act as a ‘translator’ can be very effective.

Otherwise, it’s a problem with the programming language itself; the language’s syntax is poorly defined, or doesn’t include provision for certain concepts or ideas to be expressed. Imagine trying to describe the taste of a lemon in a language which doesn’t possess words for ‘bitter’ or ‘sour’. At best these problems can be solved by installing some kind of library, or package, where someone else has written a work-around and you can piggy-back off of that effort. Like learning vocabulary from a new dialect. At worst you have to write these things yourself, and if you’re kind, and write good code, you will share them with the community; you’re telling the people down the pub that you’ve decided that the taste of lemons, fizzy colas, and Flanders red is “sour”.

There is, however, a more insidious and undermining class of problems, pertaining to the aforementioned arcane computer-only languages. These languages, more aptly called “machine code”, are the incomprehensible languages computers and different parts of a computer use to communicate with one another.

For many programming languages known as “compiled languages”, the computer must ‘compile’ the code written by a human into machine code which it then executes, running the program. This is generally a good thing; it helps debug errors before potentially disastrous code is run on a machine, it significantly improves performance as computers don’t need to translate code on the fly line-by-line. But there is a catch.

There is no one single machine code. And unless a computer both knows the language an executable is written in, and is able to speak it, then tough tomatoes, it can’t run that code.

This is fine for code you have written and compiled yourself, but when importing code from elsewhere it can cause tough to diagnose problems. Especially on the large computational infrastructures used in scientific computing, with many computers that might not all speak the same languages. In a discipline like meteorology, with a large legacy codebase, and where use of certain libraries is assumed, not knowing how to execute pre-compiled code will leave the hopeful researcher in a rut. Especially in cases where access to the source code of a library is restricted due to it being a commercial product. You know there’s a dialect that has the words the computer needs to express itself, and you have a set of dictionaries, but you don’t know any of the languages and they’re all completely unintelligible; which dictionary do you use?

So what can you do? Attempt to find alternative codebases. Write them yourself. Often, however, we stand on the shoulders of giants, and having to do so would be prohibitive. Ask your institution’s computing team for help – but they don’t always know the answers.

There are solutions we can employ in our day to day coding practices that can help. Clear documentation when writing code, as well as maintaining clear style guides can make a world of difference in attempting to diagnose problems that are machine-related as opposed to code-related. Keeping a repository of functions and procedures for oneself, even if it is not shared with the community, can also be a boon. You can’t see that a language doesn’t have a word for a concept unless you own a dictionary. Sometimes, pulling apart the ‘black box’-like libraries and packages we acquire from the internet, or our supervisors, or other scientists, is important in verifying that code does what we expect it to.

At the end of the day, you are not expected to be an expert in machine architecture. This is one of the many reasons why it is important to be nice to your academic computing team. If you experience issues of compilers not working on your institution’s computers, or executables of libraries not running, it isn’t your job to fix it and you shouldn’t feel bad if it holds your project up. Read some papers, concentrate on some other work, work on your lit-review if you’re committed to continuing to do work. Personally, I took a holiday.

I have struggled with these problems myself, and my solution was to go to my PhD's partner institution, where we know the code works! Perhaps this is a sign that these problems can be extremely non-trivial, and are not to be underestimated.

Ahh well. It’s better than being a monoglot, at least.

## Evidence week, or why I chatted to politicians about evidence.

On a sunny Tuesday morning at 8.30 am I found myself passing through security to enter the Palace of Westminster. The home of the MPs and peers is not obvious territory for a PhD student. However, I was there as a Voice of Young Science (VoYS) volunteer for the Sense about Science Evidence Week. Sense about Science is an independent charity that aims to scrutinise the use of evidence in the public domain and to challenge misleading or misrepresented science. I have written previously here about attending one of their workshops about peer review, and also here about contributing to a campaign assessing the transparency of evidence used in government policy documents.

The purpose of Evidence Week was to bring together MPs, peers, parliamentary services and people from all walks of life to generate a conversation about why evidence in policy-making matters. The week was held in collaboration with the House of Commons Library, the Parliamentary Office of Science and Technology and the House of Commons Science and Technology Committee, in partnership with SAGE Publishing. Individual events and briefings were contributed to by further organisations including the Royal Statistical Society, the Alliance for Useful Evidence and UCL. Each day had a different theme, including 'questioning quality' and 'wicked problems', i.e. superficially simple problems which turn out to be complex and multifaceted.

Throughout the parliamentary week, which lasts from Monday to Thursday, Sense about Science had a stand in the Upper Waiting Hall of Parliament. This location is right outside the committee rooms where members of the public give evidence to one of the many select committees. These are collections of MPs from multiple parties whose role is to oversee the work of government departments and agencies, though their evidence-gathering and scrutiny can sometimes have significance beyond UK policy-making (for example, this story documenting one committee's role in investigating the relationship between Facebook, Cambridge Analytica and the propagation of 'fake news'). The aim of the stand was to catch the attention of the public, parliamentary staff and MPs, and to engage them in conversations about the importance of evidence. Alongside the stand, a series of events and briefings were held within Parliament on the topic of evidence, with titles including 'making informed decisions about health care' and 'it ain't necessarily so… simple stories can go wrong'.

Each day brought a new set of VoYS volunteers to the campaign, both to attend to the stand and to document and help out with the various events during the week. Hence I found myself abandoning my own research for a day to contribute to Day 2 of the campaign, which focused on navigating data and statistics. I had a busy day; beyond chatting to people at the stand, I took over the VoYS Twitter account to document some of the day's key events, attended a briefing about the 2021 census, and provided a video round-up of the day (which can be viewed here!). For conversations at the stand we were asked to focus particularly on questions in line with the theme of the day, including 'if a statistic is the answer, what was the question?' and 'where does this data come from?'

Trying to engage people at the stand proved to be challenging; its location meant people passing by were often in a rush to committee meetings. Occasionally the division bells, announcing a parliamentary vote, would ring and a rush of MPs would flock by, which was great for trying to spot the more well-known MPs but less good for convincing them to stop and talk about data and statistics. In practice this meant the other VoYS members and I had to adopt a very assertive approach to talking to people, a style that is generally not within the comfort zone of most scientists! However, this did lead to some very interesting conversations, including with a paediatric surgeon who was advocating to the health select committee for increased investment in research to treat tumours in children. He posed a very interesting question: given a finite amount of funding for tumour research, how much should be specifically directed towards improving the survival outcomes of younger patients, and how much towards older patients? We also asked MPs and members of the public to add any evidence questions they had to the stand. A member of the public wondered, 'are there incentives to show what doesn't work?' and Layla Moran, MP for Oxford West and Abingdon, asked 'how can politicians better understand uncertainty in data?'

The week proved to be a success. Over 60 MPs from across the parliamentary parties, including government ministers, interacted with some aspect of Evidence Week, accounting for around 10% of all MPs. The stand also engaged a wider audience of parliamentary staff and members of the public. Sense about Science highlighted two outcomes after the event: one was the opening event, where members of various community groups met with over 40 MPs and peers and had the opportunity to explain why evidence was important to them, whether their interest was in beekeeping, safe standing at football matches or IVF treatment; the second was the concluding round-table event on what people require from evidence gathering. SAGE will publish an overview of this round-table as a discussion paper in the autumn.

On a personal level, I had a very valuable experience. Firstly, it was a great opportunity to visit somewhere as imposing and important as the Houses of Parliament and to contribute to such an exciting and innovative week. I was able to have some very interesting conversations with both MPs and members of the public. I found that generally everybody was enthusiastic about the need for increased use and transparency of evidence in policy-making. The challenge, instead, is to ensure that both policy-makers and the general public have the tools they need to collect, assess and apply evidence.

## How the Earth ‘breathes’ on a daily timescale

As the Earth rotates, each location on its surface is periodically exposed to incoming sunlight. For example, over London at the beginning of September, the intensity of incoming sunlight ranges from zero overnight, when the sun is below the horizon, to almost 1000 W m⁻² at noon, when the sun is highest in the sky (Fig. 1).

Earth’s atmosphere and surface respond to this repeating daily cycle of incoming sunlight in ways that can change the amount of energy that is emitted or reflected back to space. For example, the increased amount of sunlight in the afternoon can heat up the surface and cause more thermal energy to be emitted to space. Meanwhile, the surface heating can also cause the air near the surface to warm up and rise to form clouds that will, in turn, reflect sunlight back to space. The resulting daily cycle of the top-of-atmosphere outgoing energy flows is therefore intricate and represents one of the most fundamental cycles of our weather and climate. It is essential that we can properly represent the physical processes controlling this daily variability to obtain accurate weather and climate forecasts. However, the daily variability in Earth’s outgoing energy flows is not currently well observed across the entire globe, and current weather and climate models can struggle to reproduce realistic daily variability, highlighting a lack of understanding.

To improve understanding, dominant patterns of the daily cycle in outgoing energy flows are extracted from Met Office model output using a mathematical technique known as “principal component analysis”.

The daily cycle of reflected sunlight is found to be dominated by the height of the sun in the sky, expressed through the "solar zenith angle", because the atmosphere and surface are more reflective when the sun is low in the sky. Two cloud effects play a lesser role: low-level clouds over the ocean, known as "marine stratocumulus", burn off during the afternoon and reduce the amount of reflected sunlight, while tall, thick "deep convective" clouds develop later in the afternoon over land and increase it. The daily cycle of emitted thermal energy, on the other hand, is dominated by surface heating, which increases the emitted energy around noon, but also by deep convective clouds, whose very high, cold tops reduce the emitted energy later in the afternoon. These dominant processes controlling the daily cycle of Earth's outgoing energy flows, and their relative importance (summarised in Fig. 2), have not been revealed previously at the global scale.

The physical processes discussed above are consistent with the daily cycle in other relevant model variables such as the surface temperature and cloud amount, further supporting the findings. Interestingly, a time lag is identified in the response of the emitted thermal energy to cloud variations, which is thought to be related to changes in the humidity of the upper atmosphere once the clouds evaporate.

The new results highlight an important gap in the current observing system; filling this gap would provide the observations needed to evaluate and improve deficiencies in weather and climate models.

Gristey, J. J., Chiu, J. C., Gurney, R. J., Morcrette, C. J., Hill, P. G., Russell, J. E., and Brindley, H. E.: Insights into the diurnal cycle of global Earth outgoing radiation using a numerical weather prediction model, Atmos. Chem. Phys., 18, 5129-5145, https://doi.org/10.5194/acp-18-5129-2018, 2018.

## Estimating the Effects of Vertical Wind Shear on Orographic Gravity Wave Drag

Orographic gravity waves occur when air flows over mountains in stably stratified conditions. The flow of air creates a pressure imbalance across the mountain, so a force is exerted on the mountain in the same direction as the flow. An equal and opposite force is exerted back on the atmosphere, and this is gravity wave drag (GWD).

GWD must be parametrized in General Circulation Models (GCMs), as it is important for the large-scale flow. The first parametrization was formulated by Palmer et al. (1986) to reduce a systematic westerly bias. The current parametrization was formulated by Lott and Miller (1997) and partitions the calculation into two parts (see figure 1):

1. The mountain waves. This is calculated by averaging the wind, Brunt-Väisälä frequency and fluid density in a layer between 1 and 2 standard deviations of the subgrid-scale orography above the mean orography.
2. The blocked flow. This is based on an interpretation of the non-dimensional mountain height.

The parametrization does not include the effects of wind shear, i.e. a change in the wind with height, which alters the vertical wavelength of gravity waves and hence the drag. It has been shown (Teixeira et al., 2004; Teixeira and Miranda, 2006) that a uniform shear profile (a change in the magnitude of the wind with height) decreases the drag, whereas a profile in which the wind turns with height increases it. Miranda et al. (2009) found this effect to have the greatest impact over Antarctica, where drag enhancement occurs all year round, with a peak of ~50% during JJA (see figure 2).

The aim of this work is to test the impact of including shear effects in the parametrization. The first stage is to test the sensitivity of the shear correction to the height in the atmosphere at which the necessary derivatives are approximated. We carry out calculations using two different reference heights:

1. The top of the boundary layer (BLH). This allows us to avoid the effects of boundary layer turbulence, which are not important in this case as they are unrelated to the dynamics of mountain waves.
2. The middle of the layer between 1 and 2 standard deviations of the sub-grid scale orography (SDH). This is the nominal height used in previous studies and in the parametrization.

All figures shown below focus on Antarctica and are averaged over all JJA seasons in the decade 2006–2015, for the reasons highlighted above. All calculations are carried out using ERA-Interim reanalysis data.

We first consider the enhancement assuming axisymmetric orography. The advantage is that this considerably simplifies the correction, because the terms related to the anisotropy become constant (see Teixeira et al., 2004). Figure 3 shows this correction calculated using both reference heights; the enhancement is greater when the SDH is used.

We now consider the enhancement using mountains with an elliptical horizontal cross-section. This is how the real orography is represented in the parametrization. Again, we see that the enhancement is greater when the SDH is used (figure 4).

It is interesting to note that at both heights the enhancement is greater when axisymmetric orography is used. This occurs because, in the case of elliptical mountains, the shear vector is predominantly aligned along the orography, resulting in weaker enhancement (see figure 5).

We also investigate the fraction of times at which the terms related to wind-profile curvature (i.e. those containing second derivatives) dominate the drag correction, which tells us how often curvature matters for the drag. We find that second derivatives dominate over much of Antarctica for a high proportion of the time (see figure 6).

In summary, the main findings are as follows:

• The drag is qualitatively robust to changes in calculation height, with the geographical distribution, seasonality and sign essentially the same.
• The drag is considerably enhanced when the SDH is used rather than the BLH.
• Investigation of the relative magnitudes of terms containing first and second derivatives in the drag correction indicates that second derivatives (i.e. curvature terms) dominate in a large proportion of Antarctica for a large fraction of time. This leads to an average enhancement of the drag which is larger over shorter time intervals.
• Use of an axisymmetric orography profile causes considerable overestimation of the shear effects. This is due to the shear vector being predominantly aligned along the mountains in the case of the orography with an elliptical horizontal cross-section.

These results highlight the need to ‘tune’ the calculation by identifying the optimum height in the atmosphere at which to approximate the derivatives. This work is ongoing. We expect the optimum height to be that at which the shear has the greatest impact on the surface drag.

References:

Lott F. and Miller M., 1997, A new subgrid-scale orographic drag parametrization: Its formulation and testing, Quart. J. Roy. Meteor. Soc., 123: 101–127.

Miranda P., Martins J. and Teixeira M., 2009, Assessing wind profile effects on the global atmospheric torque, Quart. J. Roy. Meteor. Soc., 135: 807–814.

Teixeira M. and Miranda P., 2006, A linear model of gravity wave drag for hydrostatic sheared flow over elliptical mountains, Quart. J. Roy. Meteor. Soc., 132: 2439–2458.

Teixeira M., Miranda P. and Valente M., 2004, An analytical model of mountain wave drag for wind profiles with shear and curvature, J. Atmos. Sci., 61: 1040–1054.