Better coding skills and tooling enable faster, more useful results.
Daniel Ayers – firstname.lastname@example.org
This post presents a collection of resources and tips that have been most useful to me in the first 18 months I’ve been coding – when I arrived at Reading, my coding ability amounted to using excel formulas. These days, I spend a lot of time coding experiments that test how well machine learning algorithms can provide information on error growth in low-dimensional dynamical systems. This requires fairly heavy use of Scikit-learn, Tensorflow and Pandas. This post would have been optimally useful at the start of the year, but perhaps even the coding veterans will find something of use – or better, they can tell me about something I am yet to discover!
First Steps: a few useful references
- A byte of python. A useful and concise reference for the fundamentals.
- Python Crash Course, Eric Matthes (2019). Detailed, lots of examples, and covers a wider range of topics (including, for example, using git). There are many intro to Python books around; this one has certainly been useful to me.1 There are many good online resources for python, but it can be helpful initially to have a coherent guide in one place.
How did I do that last time…?
Tip: save snippets.
There are often small bits of code that contain key tricks that we use only occasionally. Sometimes it takes a bit of time reading forums or documentation to figure out these tricks. It’s a pain to have to do the legwork again to find the trick a second or third time. There were numerous occasions when I knew I’d worked out how to do something previously, and then spent precious minutes trawling through various bits of code and coursework to find the line where I’d done it. Then I found a better solution: I started saving snippets with an online note taking tool called Supernotes. Here’s an example:
I often find myself searching through my code snippets to remind myself of things.
Text editors, IDEs and plugins.
If you haven’t already, it might be worth trying some different options when it comes to your text editor or IDE. I’ve met many people who swear by PyCharm. Personally, I’ve been getting on well with Visual Studio Code (VS Code) for a year now.
Either way, I also recommend spending some time installing useful plugins as these can make your life easier. My recommendations for VS Code plugins are: Hungry Delete, Rainbow CSV, LaTeX Workshop, Bracket Pair Colorizer 2, Rewrap and Todo Tree.
Linters & formatters
Linters and formatters check your code for syntax errors or style errors. I use the Black formatter, and have it set to run every time I save my file. This seems to save a lot of time, and not only with formatting: it becomes more obvious when I have used incorrect syntax or made a typo. It also makes my code easier to read and look nicer. Here’s an example of Black in anger:
Some other options for linters and formatters include autopep, yapf and pylint.
Metadata for results
Data needs metadata in order to be understood. Does your workflow enable you to understand your data? I tend to work with toy models, so my current approach is to make a new directory for each version of my experiment code. This way I can make notes for each version of the experiment (usually in a markdown file). In other words, what not to do, is to run the code to generate results and then edit the code (excepting, of course, if your code has a bug). At a later stage you may want to understand how your results were calculated, and this cannot be done if you’ve changed the code file since the data was generated (unless you are a git wizard).
A bigger toolbox makes you a more powerful coder
Knowing about the right tool for the job can make life much easier.2 There are many excellent Python packages, and the more you explore, the more likely you’ll know of something that can help you. A good resource for the modules of the Python 3 standard library is Python Module of The Week. Some favourite packages of mine are Pandas (for processing data) and Seaborn (a wrapper on Matplotlib that enables quick and fancy plotting of data). Both are well worth the time spent learning to use them.
Some thoughts on Matplotlib
Frankly some of the most frustrating experiences in my early days with python was trying to plot things with Matplotlib. At times it seemed inanely tedious, and bizarrely difficult to achieve what I wanted given how capable a tool others made it seem. My tips for the uninitiated would be:
- Be a minimalist, never a perfectionist. I often managed to spend 80% of my time plotting trying to achieve one obscure change. Ask: Do I really need this bit of the plot to get my point across?
- Can you hack it, i.e. can you fix up the plot using something other than Matplotlib? For example, you might spend ages trying to tell Matplotlib to get some spacing right, when for your current purpose you could get the same result by editing the plot in word/pages in a few clicks.
- Be patient. I promise, it gets easier with time.
Object oriented programming
I’m curious to know how many of us in the meteorology department code with classes. In simple projects, it is possible to do without classes. That said, there’s a reason classes are a fundamental of modern programming: they enable more elegant and effective problem solving, code structure and testing. As Hans Petter Langtangen states in A Primer on Scientific Programming with Python, “classes often provide better solutions to programming problems.”
What’s more, if you understand classes and object- oriented programming concepts then understanding others’ code is much easier. For example, it can make Matplotlib’s documentation easier to understand and, in the worse caseworst case scenario, if you had to read the Matplotlib source code to understand what was going on under the hood, it will make much more sense if you know how classes work. As with Pandas, classes are worth the time buy in!
Have any suggestions or other useful resources for wannabe pythonistas? Please comment below or email me at email@example.com.