Review: Bite Size Bash – Wizard Zines

(Post theme: Hall of Fame (feat. will.i.am) by The Script)

As mentioned previously, I purchased the set of Wizard Zines via my annual training budget, and am working through them. Current reading: Bite Size Bash.

The first 20 pages surprised me – I don’t consider myself particularly knowledgeable about bash (or similar shell scripting), but I’ve somehow picked up pretty much everything in the basics over the last couple of decades, despite usually jumping pretty quickly to another tool whenever my needs get beyond simple. It is a great intro, though, and I would have benefited from having this at hand years back.

On page 20, parameter expansion is introduced. Most of this was either new to me or something I think I’ve seen but certainly wouldn’t remember to use. I’m not sure how to make this stick, though – the syntax is so odd. Maybe I should jump back to this page next time I’m writing bash, or maybe just remembering that it can be done will be enough to prompt me to look it up. I do wonder whether LLMs will replace this sort of learned syntax, though.

After that, it was background processes and subshells, both of which were again surprisingly familiar. After that is trapping signals – I knew this could be done, but I’m pretty sure by the time I have something that needs to do this, I’ve already moved to another language, and I don’t see that changing. Errors and debugging were also mostly things I knew, except for ‘trap x DEBUG’, which seems pretty handy.

Overall, this zine surprised me: I expected to learn a lot, and instead recognised a lot. In some ways, this is great: I feel more confident that I do actually know the most useful core of shell scripting (even though certainly not the vast area beyond that). I’d definitely recommend it for shell scripting beginners (especially bash) but likely not for others.

Review: Become a SELECT Star – Wizard Zines

(Post theme: Call Off the Search by Katie Melua)

As part of my training allowance for 2024, I purchased all 14 of Julia Evans’s zines. There are definitely ones I expect to learn from more than others, but having the pack appealed, and one of the things I really enjoy about her posts is finding little tidbits of information that I didn’t know, even though I would have said I knew the topic fairly well in general.

First up: Become a SELECT Star, which, as you’d guess from the title, is all about the SELECT statement in SQL. The first dozen pages are a really good summary of the fundamentals: things that I gradually absorbed over the last 20 or so years, and probably would have been better off with having a summary like this rather than a bunch of more traditional reference material.

I really like how the examples work through the query in the logical order rather than the syntax order. It makes it much simpler to follow what’s happening.

On pages 14–16, it covers window functions, which I didn’t really know at all. I would normally do this by pulling a bunch of the data down into something like Python and calculating there. I’m a big fan of moving computation left where possible, so these seem super useful, and something I hope I remember (although I don’t do a huge amount of SQL these days).

The zine continues on to cover NULL and COALESCE, which we used a lot at SpamExperts, so those were pretty familiar, but again a good summary that I could have used many years back! It then covers CASE, and I don’t remember if I knew that or not, but also don’t think I can think of times I would have wanted it directly in SQL.

After a straightforward page on counting rows, it moves into subqueries. I’ve used these quite often (I’ve often had situations where subquery performance was better than joining, for various reasons). However, common table expressions (named subqueries) are new to me, and look super useful.

It wraps up with a few more useful tips, most of which were things I already knew.

I’d definitely recommend this to anyone that’s just starting with SQL. If you’re a SQL expert, it’s likely not going to be useful, but if you’re a casual querier like myself, then there are good tips to pick up, and given how small and inexpensive the zine is, I’d highly recommend picking it up.

Top 5 Books That Should Be Movies

(Post theme: Storybook Love by Mark Knopfler and Willy DeVille)

More Filmspotting catch-up. Hard to resist a Top 5 list that involves books, obviously!

I’m using roughly the same criteria as Josh & Adam (and Kristen) – in particular: anything that already has a film (even if it’s terrible) is excluded, and it has to be a full-length book. The latter is particularly challenging: in my opinion, short stories & novellas make the best page-to-film adaptations. Full-length books that would translate well to the screen are almost always going to be better as TV series. I’ve also avoided cheating this by suggesting a trilogy or series of films, although some of these could obviously have sequels.

There’s quite a bit of recency bias here. I’ve read a lot of books, but find it difficult to think back over which ones would work particularly well as films. I skimmed over my Goodreads list, but only recent reads have (my) ratings, and it’s tricky relying on average ratings – plus until recently I didn’t add books that I read but didn’t own (library, friends, family, etc).

5. Pastwatch: The Redemption of Christopher Columbus by Orson Scott Card. I’m generally wary of time travel stories (but am a sucker for time loops) although there are some great film examples. Ender’s Game was a terrible adaptation, so there’s a lot of risk here. (I’m aware of Card’s politics and religion, and my disagreement with most of it doesn’t change my enjoyment of his writing). There are some elements to the story that would play very well today: particularly, the environmental aspect, and the reality of Columbus’s impact on the people already living in the Americas. However, the titular redemption might be an issue. Gina Prince-Bythewood to direct (or maybe Roseanne Liang?), introducing new actors for the main cast, with either Justin Baldoni or Chris Evans to star as Columbus.

4. The Lies of Locke Lamora by Scott Lynch. This would need a careful screenwriter, to emphasise the heist/con story and ensure the fantasy element is almost incidental (until it’s critical to the story). It would have to be a long film, but you could probably sacrifice some parts of the story that lead towards the sequels without losing what makes it a great story. Apparently Warner Bros. were working on an adaptation, and after that fell through someone else picked it up, but as far as I know nothing has been announced, so it’s up for grabs in this list. Ajay Friese as Locke, Ian Samuels to direct.

3. The 7½ Deaths of Evelyn Hardcastle by Stuart Turton (first published as The 7 Deaths of Evelyn Hardcastle). Apparently Netflix were going to make a series based on this, but gave up, so that frees it up for the list. I mentioned my love of time loops already, and this would be a great addition, mixed in with the Christie-esque aspects. I didn’t love the ending of the book, so I feel this could be one of the rare examples of a film that’s better than the source material. Maybe Mark Ruffalo as the protagonist, but I don’t have a director in mind.

2. The Loneliest Girl in the Universe by Lauren James. I love the genre twist in the middle of this story, and it would translate well to the screen, although you’d need to ensure that marketing (especially trailers, which have become awful in recent years) didn’t ruin it. There’s a lot of written communication in the story, and not seeing one of the characters is important, but a good screenwriter could solve that. Lauren James fancast this herself, so it’s hard to argue for anyone else to star. To direct: Gary Ross (bringing elements from Pleasantville, The Hunger Games, and Ocean’s 8).

1. Beak of the Moon by Philip Temple. It’s one of my favourite stories and definitely my favourite NZ novel. It would make an amazing animated film, made entirely in Aotearoa with local talent. There are two huge challenges: firstly, it’s a long book – but I think a good screenwriter (keep Walsh and Jackson as far from this as possible!) could solve that, and the imagery would do a lot of the work. More challenging is that all the main characters are male, and this is central to the story (and, in my limited understanding, to kea behaviour). I think you’d want relatively unknown actors to lead (Māori, in their 20s), although I could see cameos by Rhys Darby, Bret McKenzie and Jemaine Clement as the kākā, Taika Waititi as the kākāpō, Sam Neill as the albatross, and maybe Cliff Curtis as Glintamber. I have no idea who could direct – it should be someone local, and someone who can do animation well, which means someone new.

Honourable mentions (in no particular order):

  • The Crystal Cave by Mary Stewart. I’m not sure the world really needs another Arthurian film, even if it is based on one of the best adaptations. It’s also probably too long to make a good film, and The Hollow Hills would really need to be made as well. I’d still love to see it done well, and I think we’re at a point where there are enough people who can do historical fantasy on screen very well. This did have a TV series adaptation, which probably excludes it (I haven’t seen that).
  • Daughter of the Empire by Janny Wurts and Raymond E. Feist. Easily the best trilogy in the Riftwar Cycle, and Mara’s story, particularly in this first book, would easily be as compelling as Katniss, Tris, and Teresa were in the 2010s – and we’re due for a resurgence in great YA films (with less dystopia this time). However, I’m not sure you could do it justice in a film, and it might have to be a TV series.
  • Project Hail Mary by Andy Weir. Better than The Martian, which was a decent film. However, this is apparently going into production in 2024.
  • Gideon the Ninth, by Tamsyn Muir. I think the fantasy/sci-fi mash-up, the LGBTQ+ aspects, the humour, and the action would all combine to make a great film. I wonder if it would be too confusing to tell in under 2 hours, and whether you’d need to make this a TV series instead. It also feels like it would really require making the other books in the series as well (once they’re all finished), and some of those would be even more difficult, which also disqualifies it from this list.
  • Faerie Tale by Raymond E. Feist. Problem #1: I remember loving this story and re-reading it several times, but I haven’t read it in over 10 years, so it’s hard to recollect enough details to evaluate whether it would translate to the screen well. Problem #2: I watch hardly any supernatural thriller films, so don’t have any idea what it would be competing with. My gut feeling is that it would be great, though.
  • Memory Man, by David Baldacci. This would clearly make a great film – it was only bumped off my top 5 because there are already a lot of great crime films and I’m not sure that this would add anything truly new.
  • Hunting Party by Elizabeth Moon. This would be a great action/sci-fi story (much of the best sci-fi on film is space opera). I don’t have any reason for not including this, except that I couldn’t figure out which of my top 5 I would bump to make room for it. It’s a clear #6.

End of an Era

(Post theme: Lost the Breakup by Maisie Peters)

Last Friday (30th June, 2023) was my last day working for N-able after 18 years (a bit over 12 years of that for SpamExperts, then after the acquisition a bit under 4 years for SolarWinds MSP, then after the split the last almost two years for N-able).

I’m more able to write about things now, and have a lot of ideas of topics I’d like to cover, and hope to get to a bunch of those before starting my new job in September. For now, mostly for my own record, a brief summary of how it all ended.

Scaled Agile & the introduction of a Product Owner role

In 2020, SolarWinds decided to adopt the Scaled Agile framework (SAFe), under the name “SolarAgile”. The plan was to introduce this within the MSP side of the business first, and then extend it to “Core” (the traditional, large, part of SolarWinds) and “Cloud” (Pingdom, AppOptics, Loggly, Papertrail). I was part of the pilot group helping to shape this in the initial adoption, which was actually pretty great.

One aspect of this project was the introduction of a Product Owner role. There were a few POs across different products, but no real consistency about what they did, and most teams didn’t have a PO. For Mail Assure & SpamExperts, I handled the PO role in the initial pilot (one of the requirements was someone that was both extremely familiar with the product and also very experienced with Agile), but the intention was that I would move to PM after the pilot period.

By this time, the Mail Assure & SpamExperts engineering team had shrunk in size quite a bit (down to two teams), and wasn’t expected to grow again. Similarly, from a product perspective, the goal was high retention rather than a lot of growth (especially new logo growth). I argued that we did not need both a PO and a PM for a product of Mail’s size (a “practicality beats purity” type argument). However, TPTB went ahead and hired someone for the PO role anyway.

In some ways, the acquisition (August 2017) was the beginning of the end – particularly since SolarWinds and then N-able were both very focused on people in offices (covid disrupted this as it did everywhere, but even now there is strong encouragement to be back at an office at least part of the time). However, I feel like adding in the PO role to the Mail team was the real beginning of the end, because it was always clear to me that we were ‘overprovisioned’ in product people for the nature of the product.

Everything went well enough for almost a year – a lot of that time was our new PO learning the ropes while I learnt more about the parts of PM that I hadn’t done before, and tried to extend out our strategic planning.

Reorganisation and Product Owners transformed to Product Managers

In late 2021, after the split from SolarWinds, N-able had another one of its frequent leadership changes, with a new CTO followed not long after by a new VP of engineering and a new VP of product. This (mostly) spelt the end of SolarAgile, and the decision was made to completely remove the PO position, with the actual PO work being brought under the responsibility of the PMs.

Essentially overnight, all the POs were now either PMs or moved elsewhere within the company (e.g. into an engineering role) – many have since left N-able. This transition was handled very poorly, with the news of the change arriving to at least some of the POs at the same time it arrived to the company as a whole.

Most relevant to my journey, this meant that Mail Assure & SpamExperts now had two PMs, the former PO and me. I already felt like both a PO and a PM was too much ‘product’ for the team, and this obviously made it that much worse.

Again, everything went ok for some time – while we were both PMs, we did still tend to split up the work in similar ways as before, with me focusing on the higher level strategy and prioritisation and my fellow PM working on more of the operational aspects.

Interlude: Promotion and Pay

During 2022, I was promoted to Senior Product Manager (although as a contractor I technically didn’t have any title at all). This had a reasonable pay bump, which was particularly welcome in a year where inflation was suddenly something that was relevant again.

This was the third significant pay increase that I received in my SolarWinds & N-able journey. The first was after SpamExperts was acquired – this was basically adjusting to be closer to ‘market’ rates (SpamExperts, being fairly small, was generally at the lower end, although it had certainly improved, and in the last 5 or so years there I had no complaints about how much I was paid), and also essentially for retention purposes (ensuring as many key people as possible stayed after the acquisition). The second was a couple of years later, after the former SpamExperts CEO & CTO had left, and I was very unhappy in my architecture role and with the direction of the product. This was again basically a retention play (which worked – I also got other changes, which helped, but for the most part I was willing to stick around for a while because of the comparatively high remuneration).

It was never made super explicit, although it came up in some conversations, but I feel that these actually ended up contributing to the end (of course, in the meantime, they contributed plenty to my financial security). If you looked at the cost of PMs for the Mail team, then I was the bulk of that, and my salary was somewhere in the ballpark of 2-3% of the product’s revenue. When I moved product (more on that below) this would have been even more noticeable, because the strong retention motivation was no longer there (or at least no longer visible to the people making the decisions).

This isn’t a complaint about being well paid for a few years, by any means. But I do think that it was one factor in the eventual end of my tenure.

Moving Products

Around August 2022, it was clear that people had looked at the budgets for Mail and decided that having two PMs was not justified. I don’t disagree with this (and indeed had argued it all along), although I think things would have played out pretty differently if we’d never had a PO at all (which is certainly not her fault, and is not to denigrate any of the excellent work she did).

Either I would need to move or the other PM would need to move. It was made clear to me that the preference was for me to move – as a senior PM who was well regarded by many in N-able across several products (although certainly not universally; you can’t please everyone), the expectation was that it would be simpler to put me in a new product and have the other Mail PM continue with Mail Assure and SpamExperts.

I didn’t like this plan. I did have, in my performance planning documentation, some statements around either joining a new product or building something new within the same product & team. However, those were in the 3-5 year range, and I was pretty clear about having some work that I really wanted to finish with Mail first.

(As an aside: I never really got the chance to properly implement a strategy as Mail PM. The first 12-24 months were taken up with (a) work on security as a result of the SolarWinds Hack, (b) work to adjust as a result of the SolarWinds/N-able split, and (c) finishing off and cleaning up after the previous PM, who really didn’t understand the product at all. After that, we were just starting to get underway with some new work, and then I was moved away).

However, it was clear to me that me moving was going to be best – it would hopefully turn out well for me (wrong, as I came to find out), and for the PM who would stay (somewhat correct), and for everyone else as well (unclear). So I accepted that this was going to happen, and was moved to Cloud User Hub, a product born of another acquisition (Spinpanel) that had launched just a month or two earlier. More on how that was sold to me and what the reality was another time.

Another Reorganisation

The first couple of months on Cloud User Hub were rough (mostly because of the state of the product and the disastrous launch), but by the middle of October were improving and things were looking up.

At this point, N-able joined many other tech companies and laid off a large chunk (maybe 15%?) of employees, and did a bunch of restructuring as a result. I was kept on, but my immediate manager was gone, and the product was moved to a separate group as well, under the leadership of someone completely new to N-able.

Looking back, it feels like a foregone conclusion that this would be it. The product was in a huge mess, and although I joined after the mess was made and did make some progress improving it, you couldn’t look at my work on Cloud User Hub and see anything like success. In addition, I was now reporting to someone (my manager’s manager, in particular) who had no history with me at all, so there was no existing goodwill or understanding of the quality work I had done in the past.

Final 4 Months

On February 28th, I was told that I was no longer required and would have 4 months of notice, finishing up at the end of June.

The official reason was the challenges around location and time zone. To be fair, this was much harder in Cloud User Hub than it had been with the Mail team. The majority of the engineering team were in India (good overlap with NZ), some in the Netherlands (good overlap with NZ, long history of working with people in that time zone), and some in the UK (reasonable overlap with NZ, several years of working with people in that time zone). However, N-able has almost all of its senior management in the US, and combining the US (or Canada, where there are other teams) time zones with the Europe/Asia ones leaves no good times for working with NZ.

For all of the 18 years I was with SpamExperts, then SolarWinds, then N-able, I was extremely flexible around working times (anyone who worked with me would attest to this). Until Cloud User Hub, this was the occasional meeting during the night, and working around 4-6 hours in my evening (which suited me well for many years anyway). After moving to Cloud User Hub, I would regularly have meetings at 3 a.m., 4 a.m., and so on – at least weekly, generally multiple times a week. I made this work, but it wasn’t good for me (or for N-able, really).

Ironically, this was much worse in the December-February period (excepting the week I vanished in February thanks to Cyclone Gabrielle) than later, when there was less need to be meeting with senior leadership and more work with the actual teams, where time zones aligned well enough. Travel to do in-person work (to Edinburgh, for example) was expensive for someone in NZ, though (none of the engineers in India would be doing that travel either).

More to say about my adventures over the last 18 years, but that’s essentially how it all came to an end!

Reply: Designing Pythonic library APIs

(Post theme: Code Monkey by Jonathan Coulton)

Ben Hoyt has a great post on designing Pythonic library APIs (itself a written version of a talk he gave). I have some thoughts in response:

Style

I love PEP 20, the Zen of Python (I used to have a set of t-shirts I had made that had one koan on each), and I think it’s actually applicable more widely than just code (Python or otherwise). I certainly agree that following its guidelines is a great start.

Ben suggests following PEP 8 (the style guide); I would go further than that:

  • Assuming that there’s some sort of CI pipeline, that should include enforcement of a style (ideally auto-correcting to one). Black is the obvious choice here, and it’s (for the most part) following PEP 8, but the most important thing is to have a consistent style where a tool does all the work.
  • Shift work ‘left’ of the CI pipeline, and make it easy for contributors, by having your formatter of choice run as part of pre-commit or some similar workflow, with an appropriate configuration file for that in the source repository.
  • Follow PEP 257 for your docstrings (and obviously have good docstrings). IDEs sometimes shove in a bunch of text around return values/types and enforce conventions – I’m less bothered about those, and I think generally they (and the related recommendations in PEP 257) have been supplanted by type annotations in many cases. When other people are using the library, they’ll see these docstrings, and they’re probably also part of your automated reference documentation.
  • While on the topic of docstrings, put an example or two in them anywhere it makes sense, and use doctest to make sure that they stay correct.
  • Have a style for imports (and use isort or something similar) to automate/enforce that as well. I personally prefer sorting by (length, normalised case alphabetical) with groupings for, from top to bottom, the standard library, third-party libraries (with a blank line between each, ordered approximately by how well established they are), internal libraries, and then finally imports from within the same package. But again, it’s the consistency that matters most. (This one isn’t really about API design).
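
To make the doctest suggestion concrete, here is a minimal sketch (the function name and behaviour are illustrative, not from Ben’s post):

```python
def slugify(title):
    """Convert a post title into a URL slug.

    The examples below are checked by doctest, so they can't rot:

    >>> slugify("Bite Size Bash")
    'bite-size-bash'
    >>> slugify("  Become a SELECT Star  ")
    'become-a-select-star'
    """
    # split() with no arguments also strips leading/trailing whitespace.
    return "-".join(title.lower().split())


if __name__ == "__main__":
    import doctest

    doctest.testmod()
```

Running `python -m doctest module.py` (or wiring `doctest.testmod()` into the test suite) keeps the docstring examples honest as the code evolves.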

“Pythonic”

In addition to the items that Ben mentions, I think it’s important to design the API so that it works well with Python idioms. Exactly what this entails depends a lot on the specifics of the API, but for example:

  • Functions & methods should be designed so that they can easily be used with the functools module (Ben has an example of this).
  • Provide generators rather than returning a tuple or list when possible. These should work well with the itertools module, with yield from, etc.
  • Work well with the standard library logging module (but don’t be noisy when someone isn’t using it). The logging module is an example of an API in the standard library that is poorly designed (or perhaps just is not particularly Pythonic), in my opinion, but it’s the default choice for logging and utilised by tools like Sentry.
  • Context managers. For example, if your object connects to something (a file, over a network, etc) then have the cleanup done in a method called close() so that callers can use contextlib.closing (but actually also provide your own __exit__ to handle this).
  • Where appropriate, make it easy to serialise data. This might include supporting pickling objects, but might also be to other formats (JSON, YAML, etc).
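
As a sketch of the context-manager point (the Connection class here is hypothetical – substitute whatever resource your library holds):

```python
import contextlib


class Connection:
    """Hypothetical library object that owns a resource (e.g. a socket)."""

    def __init__(self):
        self.closed = False

    def close(self):
        """All cleanup lives here, so contextlib.closing works."""
        self.closed = True

    # Also provide the context-manager protocol directly, so callers
    # don't need to know about contextlib.closing at all.
    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()


# The native protocol:
with Connection() as conn:
    pass

# contextlib.closing still works, because cleanup is in close():
with contextlib.closing(Connection()) as conn2:
    pass
```

Supporting both costs almost nothing and lets callers pick whichever idiom fits their code.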

Async

The Python releases that I really like are the ones that focus on improving performance (sometimes this is CPython specific) and usability (like the improved tracebacks in 3.11), and the standard library. In my opinion, for the most part, the Python language itself does not need regular changes, and sometimes these can be at the detriment of some of the aspects of Python that make it great (like readability and ease of learning).

I’m not (yet?) a fan of the walrus operator or pattern matching, for example. I have mixed opinions about type annotations. However, one change to the language over the last decade that I feel is definitely worthwhile is the addition of async & await. It was possible to use coroutines in Python previously, and understanding how async works does add complexity to learning the language, but I feel it expands what can be easily done with the language, in the same way that you can use Python in a very object-orientated way, or a very functional way, and so on.

One catch with async & await is that they have a tendency to spread throughout your code. You can ‘collapse’ async code into a synchronous action by adding an event loop and waiting until everything is done, but for the most part if you’re using an async library then you’re probably building your entire app in an async fashion. It’s definitely much simpler to make use of a synchronous call inside of an async method than vice-versa.
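
A minimal sketch of that ‘collapse’, using asyncio.run as the synchronous boundary (the function names are made up):

```python
import asyncio


async def fetch(url):
    # Stand-in for real async I/O (an HTTP request, say).
    await asyncio.sleep(0)
    return f"response from {url}"


async def fetch_all(urls):
    # Fan out concurrently; gather preserves input order.
    return await asyncio.gather(*(fetch(u) for u in urls))


def fetch_all_sync(urls):
    """Synchronous wrapper: runs an event loop to completion.

    This is the pattern a library can offer callers who aren't
    otherwise async -- but it can't be called from inside a
    running event loop, which is why async tends to spread.
    """
    return asyncio.run(fetch_all(urls))


results = fetch_all_sync(["a", "b"])
```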

There are libraries that have added async functionality after originally being synchronous (e.g. Django) but from what I’ve heard that has been complicated to get right and the API is less natural than otherwise (e.g. compare with FastAPI).

Whether a library should have predominantly async methods, or both async and sync versions, or avoid async entirely depends a lot on what it’s doing and how it’s expected to be used. However, it’s definitely something to think a lot about in advance, rather than try to adjust mid-flight.

Exception Chaining

Ben has a good section on errors and exceptions. The only thing I would add is that you can and should explicitly chain exceptions so that it’s clearer to the caller what’s happening. I think Ben is referring to that here, but doesn’t call it out explicitly:

For example, if your library can raise ssl.SSLError when it’s calling an HTTP API, it’s probably best to catch that and re-raise as fishnchips.NetworkError.

Ben Hoyt

Instead of doing this:

try:
    ...
except ssl.SSLError:
    raise fishnchips.NetworkError()

You should do this to show that the SSLError was the “direct cause” of the NetworkError:

try:
    ...
except ssl.SSLError as e:
    raise fishnchips.NetworkError() from e

Related to errors & exceptions, you should make use of the warnings module and the various warning categories (like DeprecationWarning).

In this section, Ben also says:

APIs should be designed so that it’s hard to make mistakes.

He doesn’t have this as one of the takeaways, but I would ‘promote’ it to one.

The standard library unittest package

Ben has unittest as an example of a poor standard library API. I agree with this, and I think it provides a good example of where API design can be challenging. The main problems with the unittest API (such as assertEqual(a, b) not being assert_equal(a, b) or a plain assert a == b) come from the API being an implementation of the xUnit API (originally SUnit and popularised by Java’s JUnit but extremely widely used).

The question here is how closely the Python xUnit implementation should match the implementations in other languages (or, put another way, whether the Python standard library should have an xUnit implementation versus a library that supports unit tests that isn’t necessarily xUnit). If you’re coming to Python’s unittest module from Java (and JUnit) then these are comfortingly familiar:

import static org.junit.jupiter.api.Assertions.assertEquals;

import example.util.Calculator;

import org.junit.jupiter.api.Test;

class MyFirstJUnitJupiterTests {

    private final Calculator calculator = new Calculator();

    @Test
    void addition() {
        assertEquals(2, calculator.add(1, 1));
    }

}
import unittest

# Probably a relative import in practice.
import example.util


class MyFirstUnittestTests(unittest.TestCase):
    def setUp(self):
        self.calculator = example.util.Calculator()

    def test_addition(self):
        # In Python 2, this could have been assertEquals()
        self.assertEqual(2, self.calculator.add(1, 1))

There are lots of other examples of APIs like this, where some underlying functionality is being exposed in many different languages, or where similar functionality is being implemented in many different languages. Sometimes, you get both, like with the python-mysql library:

import contextlib

import MySQLdb

# Fairly Pythonic, and aligns with the DB API (PEP 249)
# The Connection object should have an __exit__ that closes.
with contextlib.closing(MySQLdb.connect(**[connection args])) as db:
    with db.cursor() as c:
        c.execute("SELECT col1, col2 FROM tbl")
        for col1, col2 in c.fetchall():
            pass

# Low level API that essentially exposes the MySQL C API.
import _mysql as mysql

conn = mysql.connect(**[connection args])
conn.query("SELECT col1, col2 FROM tbl")
result = conn.store_result()
for col1, col2 in result.fetch_row(maxrows=0):
    pass
conn.close()

In general, I believe it’s better to design your API to match the language, and copy the intentions and outcomes from the source, rather than try to completely match the API. This is one reason why pytest is superior to the standard library unittest.
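
For comparison, the pytest style of the same test uses a plain assert (pytest rewrites the assertion to produce a rich failure message), sketched here with a stand-in Calculator so the example is self-contained:

```python
# test_calculator.py -- run with `pytest test_calculator.py`


class Calculator:
    """Stand-in for example.util.Calculator from the unittest example."""

    def add(self, a, b):
        return a + b


def test_addition():
    calculator = Calculator()
    # A plain assert, rather than self.assertEqual(...): pytest
    # introspects the expression to report both sides on failure.
    assert calculator.add(1, 1) == 2
```

No base class, no camelCase assertion methods – the test is just a function, which is exactly the “match the language, not the source API” point.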

The standard library csv package

Ben has the csv module as an example of a good API, and I generally agree – particularly given how poorly defined the CSV format is, which makes working with CSV files much more challenging. The one nit I have is:

import csv

with open("some.csv", newline="") as f:
    reader = csv.reader(f)
    for row in reader:
        pass

I don’t love that you have to know/remember to open the file with newline="" (in Python 2, you had to remember to open the file in binary mode). Most of the time it won’t make any difference, but when you have a CSV with a newline in a quoted field it’ll break if you don’t do this (I’d argue that if you have that sort of CSV you perhaps are using the wrong serialisation format, but often that’s out of your control).
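
To make the failure mode concrete, here’s the quoted-newline case, using io.StringIO (which takes the same newline argument as open) so the example needs no file on disk:

```python
import csv
import io

# A quoted field containing a newline -- perfectly legal CSV.
data = 'name,notes\r\nalice,"line one\nline two"\r\n'

# newline="" disables newline translation, exactly as with
# open(..., newline=""), so the csv module sees the raw line
# endings and can parse the quoted field correctly.
reader = csv.reader(io.StringIO(data, newline=""))
rows = list(reader)
# rows[1][1] keeps the embedded newline intact.
```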

It’s more Pythonic to create objects from files than from filenames (more technically: pass something file-like that supports the file protocol rather than something that supports the string protocol). Passing a string to csv.reader is almost always wrong (you’ll get each character of the string back as its own row), and the first argument to csv.reader can already be a list (or similar) or a file (or similar), so perhaps a string could be taken to mean a filename. Neither csv.reader_from_filename nor csv.DictReader.from_filename seems Pythonic, though. Having csv.reader call reconfigure() on the passed object is probably a bit too magic (explicit is better than implicit!).
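As a quick illustration of the string behaviour (standard library only, easily checked in a REPL):

```python
import csv

# csv.reader accepts any iterable of lines; iterating a string yields
# one character at a time, so each character is parsed as its own "row".
rows = list(csv.reader("abc"))
print(rows)  # [['a'], ['b'], ['c']]
```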

In summary, this is a (very small) part of the csv API that I don’t like, but I don’t have a good suggestion for solving it, either.

from library import something

Ben says:

Takeaway: Design your library to be used as import lib ... lib.Thing() rather than from lib import LibThing ... LibThing().

Ben Hoyt

I agree with the majority of his post but a huge 100% to this specifically. Namespaces are one honking great idea!

Global configuration and state

In this example code of Ben’s, he argues against having a module-level DEFAULT_TIMEOUT:

DEFAULT_TIMEOUT = 10

def order(..., timeout=None):
    if timeout is None:
        timeout = DEFAULT_TIMEOUT
    ...

This is an interesting argument, and goes against the common refrain (probably originating from languages like C) that you shouldn’t have ‘magic’ numbers in your code and should define them at the top level instead.

If your intention is that people should be able to change the default, then you should definitely do this differently (e.g. as Ben describes). If your intention is that this default is never to be changed, then you can make this a little better in modern Python (3.8+):

import typing

_DEFAULT_TIMEOUT: typing.Final[int] = 10

def order(..., timeout=_DEFAULT_TIMEOUT):
    ...

However, you’ll need some sort of type checker to validate that it’s actually not changed (which does go along with Python’s “consenting adult” type approach to things like private variables).

Type Annotations

Ben has a good section on type annotations, which pretty much exactly matches my feelings. I don’t love them, and don’t feel that they are themselves really Pythonic (although somehow their use in libraries like Pydantic is). I completely agree with Ben’s takeaway:

On balance, I definitely think it’s the right thing to do in 2023 to ship your library with type annotations.

And of course, don’t just use them, but run Pyright or MyPy over your library’s code on every commit.

Ben Hoyt

Ben’s last example of a positive of type annotations is:

They help your IDE provide better navigation and auto-completion.

I agree that this is the case – it’s one of the more compelling reasons to use type annotations, and also why there’s, to a certain extent, an obligation on library/package developers to provide them. However, I find it generally disappointing. I strongly feel that this functionality should be provided by the IDE without the user needing to put in all the manual work of explicitly typing everything. I wish we had solved this need with better tools rather than by putting a heap of additional work on developers – and especially without adding a heap of boilerplate to Python code. I understand that the dynamic nature of Python makes this hard, but hard problems are good ones to solve.

Overriding Operators

Ben gives a rule of thumb:

Only override math operators like a+b if you’re creating a number type.

Ben Hoyt

I agree with this for almost all math operators, except perhaps + (for example, using + to concatenate strings is more natural than str.join, and I think it was right to make that implementation faster rather than focus all energy on getting people to call join).

I think the standard library has a perfect example of a mistake here, which is pathlib and the division operator. I believe this is a cute hack that is maybe ok in a third-party library, but definitely does not belong in the standard library:

>>> import pathlib
>>> p = pathlib.Path("~")
>>> conf = p / ".mypackage" / "config.ini"
>>> str(conf)
'~/.mypackage/config.ini'

There was a lot of controversy about this back in 2012 (the PEP has a reference to some of it), and maybe the right decision was made, but it’s unPythonic and distasteful in my view.

Keyword arguments

Ben makes an argument for keyword arguments helping with backwards-compatibility, which I agree with. He has this example:

def order(chips=None, fish=None):
    """Place an order.

    Args:
        chips: number of scoops of chips
        fish: number of fish
    """

I would argue that these should be keyword-only arguments. If I’m writing a call to order, I’ll never remember whether fish or chips comes first (I would even argue that they are backwards here, because – at least where I am – people say “fish and chips”, not “chips and fish”: an irreversible binomial, if you’re a linguist). An IDE might help out when writing, but when reading the code you won’t necessarily have that context available. A two-character change, but it prevents easy mistakes:

def order(*, chips=None, fish=None):
    """Place an order.

    Args:
        chips: number of scoops of chips
        fish: number of fish
    """

I also worry a bit about how this would scale. A fish’n’chip shop probably has at least a dozen items on their menu, and that’s a lot of arguments. It could be generalised, something like:

def order(**kwargs):
    """Place an order.

    The arguments should be the name of the item, with the value: either
    the number of the item, a tuple of (quantity, type), or a list of such
    tuples.

    For example:

    >>> fishnchips.order(chips=1, fritters=(4, 'potato'), fish=[(1, 'crumbed'), (1, 'battered')])
    """

The main concerns I have about this are:

  • If you want to have arguments that are not items in the order (like the timeout one Ben has as an example), it feels messy for that to be mixed in with the items. Using **kwargs helps, because you’d make timeout a keyword-only explicit argument and that would distinguish it, but it still feels untidy to mix order items and order configuration into what is essentially one dictionary.
  • The item names are limited to what’s supported by Python names. That means it can’t start with a number, can’t have spaces, can’t have punctuation, and so on. For the most part this is probably fine – you can have onion_rings instead of ‘onion rings’ and the like. It feels like it might get challenging to remember the rules for converting from ‘real name’ to ‘argument name’, though. I also suspect that eventually the method will need the ‘real name’ for logging or display or similar.
  • Related to the issue of remembering the rules is remembering what can be ordered. With Ben’s version, the signature tells you that you can have fish or chips. With my kwargs one, it does not – presumably there is some validation of the arguments in the function, but that doesn’t help the caller prior to runtime.
  • I’m not sure how well this would continue to scale. Say that after adding crumbed fish, we decide to sell both Snapper and Tarakihi. Do we now support an argument like fish=(3, ‘crumbed’, ‘snapper’)? How do I remember that it’s not fish=(3, ‘snapper’, ‘crumbed’)? How are we going to handle burgers, which have lots of ingredients that people might want to customise?

I realise Ben was using this as an example of how keyword args and dynamic typing help backwards compatibility (which they do), and not trying to design the ultimate fish&chips library API. However, I feel like you’d need to move to a single collection of objects (probably dataclasses) before long.
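A rough sketch of what that might look like – to be clear, the Item and Order shapes here are my invention, not something from Ben’s post:

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Item:
    name: str                    # The 'real name', also usable for display/logging.
    quantity: int = 1
    style: Optional[str] = None  # e.g. 'crumbed' or 'battered' for fish.

@dataclass
class Order:
    items: List[Item] = field(default_factory=list)
    timeout: int = 10            # Order configuration stays separate from the items.

order = Order(items=[Item("chips"), Item("fish", 2, "crumbed")], timeout=5)
```

This keeps configuration like timeout out of the item collection, allows item names that aren’t valid Python identifiers, and extending an item (say, adding a species field for the fish) doesn’t change the calling convention.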

Experimenting with LLM coding help

(Post-theme: With a Little Help From My Friends by the Beatles)

LLM & generative AI buzz is everywhere, and assistance writing code (either via something like Copilot or by just querying a model) is supposedly a huge productivity boost, so I was curious enough to try this out. Simon Willison has written a lot of interesting things about LLMs, including how he’s used them for development, and made it seem like something that could be useful.

As a first experiment, I wanted to use the “ask the LLM to draft something for me” approach (rather than an IDE-integrated tool like Copilot), and I wanted to avoid having to go through a ‘conversation’ of prompts (particularly one where I had to guide the model to a solution I already knew or guessed). It’s usually pretty easy to find what I need by looking at the docs (or code), so if this is going to be useful for this sort of work, it needs to be simple and fast.

Details of the two attempts I’ve made so far are below, but in summary: I’m underwhelmed so far. Maybe I’m not trying the right sort of problems (one was real, one was an exercise), or maybe I need to get better at prompt crafting or be more willing to spend time ‘talking’ with the model. I have some investigation/experimentation to do with production use of LLMs, so I may try out some more personal use then.

Redirecting users to a custom landing page in Django

I have a Django site that’s about 13 years old, so has gone through a lot of Django versions. It has the ability to specify a landing page (ie. where you end up after logging in) per-user, if one isn’t specified in the params (ie. in the typical “next” arg). I noticed that at some point this had broken, and wanted to fix/re-implement it for modern Django.

I asked

How do I get Django to redirect a user after logging in to a page specified in a user model field?

The answer I was given is:

You can get Django to redirect a user after logging in to a page specified in a user model field by defining a custom redirect function in your `views.py` file and using it in your `urls.py` file.

In your `views.py` file, define a function that retrieves the desired redirect URL from the user model field and redirects the user to that URL after login:

```python
from django.contrib.auth.decorators import login_required
from django.shortcuts import redirect

@login_required
def custom_redirect(request):
    redirect_url = request.user.redirect_url  # Replace 'redirect_url' with the name of your user model field
    return redirect(redirect_url)
```

In your `urls.py` file, map the custom redirect function to the Django login view:

```python
from django.contrib.auth.views import LoginView
from .views import custom_redirect

urlpatterns = [
    # Other URL patterns...
    path('login/', LoginView.as_view(redirect_authenticated_user=True, template_name='login.html'), name='login'),
    path('redirect/', custom_redirect, name='custom_redirect'),
]
```

Note that in the above code, we're using Django's built-in `LoginView` and setting the `redirect_authenticated_user` parameter to `True` to automatically redirect authenticated users to the `custom_redirect` URL. We're also specifying the path to the login template using the `template_name` parameter.

Finally, update your user model to include the `redirect_url` field:

```python
from django.db import models
from django.contrib.auth.models import AbstractUser

class CustomUser(AbstractUser):
    redirect_url = models.CharField(max_length=255, blank=True, null=True)
```

Now when a user logs in, they will be redirected to the URL specified in the `redirect_url` field of their user model.

The immediately obvious problem here is that this doesn’t work. Nothing gets the user to the /redirect/ path, so although that would work to redirect to the user’s chosen page, it isn’t any part of the login flow. The second obvious problem is that the redirect_authenticated_user doesn’t do what the model described at all – it’s basically about bypassing the login page when someone is already logged in.

(Side note: while looking up links for this post, I came across this SO question about this, which I suspect informs a lot of the LLM’s answer. It’s not that clear either, but I guess going to SO is what people typically did before LLMs? Personally, my go-to is always the docs.)

I could make a solution like this work – if you set LOGIN_REDIRECT_URL to the redirect path and make a few other adjustments. However, it’s messy, and handling a ‘next’ style param won’t work properly without duplicating a bunch of code.

For reference, what I did (which I believe is the ‘correct’ approach, at least as-of the version of Django I’m using) is:

# Add this line to my URLs.
path("accounts/login/", [...].login.LoginUserRedirectView.as_view()),


# Add this class to a login module in my app.
from django.contrib.auth.views import LoginView
from django.utils.http import url_has_allowed_host_and_scheme


class LoginUserRedirectView(LoginView):
    """Like the parent, but with a custom per-user redirect system."""

    def get_redirect_url(self):
        # It would probably be cleaner to just check for the Anonymous User.
        if not self.request.user or not hasattr(self.request.user, "last_login"):
            return super().get_redirect_url()
        # Having "next" in the request parameters is already handled by the
        # normal login view and will bypass this view completely.
        redirect_to = self.request.POST.get(
            self.redirect_field_name, self.request.GET.get(self.redirect_field_name)
        ) or self.request.user.landing_page or ""
        url_is_safe = url_has_allowed_host_and_scheme(
            url=redirect_to,
            allowed_hosts=self.get_success_url_allowed_hosts(),
            require_https=self.request.is_secure(),
        )
        return redirect_to if url_is_safe else ""

The documentation on this (unusually for Django) isn’t great. However, the code is very clear, and makes it pretty obvious how to do the appropriate subclassing (as well as some safety checks).

Evaluating a basic maths expression in Python

The second one I tried was from an interview exercise. When I was talking to the interviewers after completing the test, one of them mentioned that there were a lot of submissions that were clearly from LLMs, which made me curious how one would have done. Using one during the exercise didn’t occur to me, but I doubt it would have appealed anyway, since it would consume a bunch of quite limited time and I wouldn’t have expected it to do particularly well.

The task was to take a string like “3 + 7 / 5 * 2” and give 5.8. No brackets, only integers, only addition, subtraction, multiplication, and division, respecting the precedence of operators.

I asked

Write a Python script that can add, subtract, divide, and multiply given an input string, and respects the order of operators, so multiplication and division are done before addition and subtraction.

The answer I was given is:

Here's an example Python script that can perform basic arithmetic operations while respecting the order of operations:

```
def calculate(expression):
    # Split the expression into a list of tokens
    tokens = expression.split()

    # Initialize the result to the first number in the expression
    result = float(tokens[0])

    # Loop through the remaining tokens and perform the appropriate operation
    for i in range(1, len(tokens), 2):
        operator = tokens[i]
        operand = float(tokens[i+1])

        if operator == '+':
            result += operand
        elif operator == '-':
            result -= operand
        elif operator == '*':
            result *= operand
        elif operator == '/':
            result /= operand

    return result
```

To use this script, simply call the `calculate()` function with a string containing the arithmetic expression you want to evaluate. For example:

```
expression = '5 + 3 * 2 - 4 / 2'
result = calculate(expression)
print(result) # Output: 10.0
```

This is also very clearly wrong. For one thing, the example at the end has the wrong answer – if you ignore precedence it would give 6, and if you do it correctly it would give 9 (I’m not sure how to get 10). For another, there’s nothing in the code that even vaguely attempts to handle operator precedence. Converting everything to floats is also risky with maths unless you really know the context and understand the implications.

For the record, I considered three approaches to this.

Eval()

The first was trivially simple – use eval(). I spent quite a bit of my allowed time internally debating whether the point of the exercise was to check whether I knew eval() existed and how to appropriately use it and that I wouldn’t pointlessly re-implement built-ins – or whether it was not meant to be used, even though that wasn’t in the (otherwise quite detailed) instructions. I put in a bunch of code to handle errors and additional safety restrictions, but at heart, this is just

return eval(input_string, {}, {})
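For illustration, a minimal sketch of the sort of restrictions I mean (not my actual submission; the restricted_eval name and the character whitelist are my own here):

```python
def restricted_eval(expression):
    """Evaluate a basic maths expression, with some safety restrictions."""
    # Whitelist the characters a simple integer expression can contain.
    if not set(expression) <= set("0123456789+-*/. "):
        raise ValueError("unsupported characters in expression")
    # Empty locals, and explicitly no builtins in the globals, so names
    # like __import__ are not available to the evaluated expression.
    return eval(expression, {"__builtins__": {}}, {})
```

Since eval is doing the work, operator precedence comes for free.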

Double-pass Item Replacement

The second was to do two passes of the input string (first for multiplication and division, and second for addition and subtraction), replacing chunks of the expression by the results. This is fairly straightforward, and my main concerns at the time were that this seems like it would get very messy as soon as you try to extend it to do anything else, and that tracking your location in the expression gets messy when you’re changing the length as you iterate through it.

Roughly, again ignoring error handling and so forth, and ignoring that this has a bunch of unneeded conversion to decimals, this is:

import decimal


def evaluate(s):
    tokens = s.split()
    # First pass: multiplication and division.
    i = 1
    while i < len(tokens):
        if tokens[i] == "*" or tokens[i] == "/":
            left = decimal.Decimal(tokens[i - 1])
            op = tokens[i]
            right = decimal.Decimal(tokens[i + 1])
            if op == "*":
                tokens[i - 1:i + 2] = [left * right]
            else:  # op == "/"
                tokens[i - 1:i + 2] = [left / right]
        else:
            i += 2
    # Second pass: addition and subtraction.
    i = 1
    while len(tokens) > 1:
        if tokens[i] == "+" or tokens[i] == "-":
            left = decimal.Decimal(tokens[i - 1])
            op = tokens[i]
            right = decimal.Decimal(tokens[i + 1])
            if op == "+":
                tokens[i - 1:i + 2] = [left + right]
            else:  # op == "-"
                tokens[i - 1:i + 2] = [left - right]
        else:
            i += 2
    return tokens[0]

In terms of an exercise, this shows an understanding of replacing a slice of a list in Python, and the performance isn’t terrible (memory is fine, looping twice is ok). There’s a bunch of tidying up that could be done, but it would probably have sufficed. I don’t like that it would get complicated quite quickly if you expanded it – adding parentheses and brackets, for example. It is better than eval, unless the point is to show that you shouldn’t reimplement something unnecessarily.

Convert to Post-fix Notation, Evaluate

The third approach, which is the one I like most, was to convert the expression to RPN and then evaluate the RPN expression. I remembered that RPN evaluation was trivial (from way, way back in my student days) and that converting from in-fix to post-fix was fairly simple (also something I vaguely remember doing, probably in a Data Structures & Algorithms course ~25 years ago, probably in C++). I remembered that Dijkstra had an algorithm for this and that the name had something to do with railways, though not the exact details (I looked it up for this post: the Shunting Yard algorithm), and that it was basically just using two lists. I would have had to look up the exact details or play around with it, but for the simple case of just addition, subtraction, multiplication, and division, I could remember enough.

Roughly (without extensive testing or any of the error checking, etc), this is:

import decimal


def evaluate(s):
    # Convert from in-fix to post-fix.
    precedence = {"+": 2, "-": 2, "*": 3, "/": 3}
    tokens = s.split()
    op_stack = []
    output_queue = []
    for token in tokens:
        if token in "+-*/":
            while op_stack and precedence[op_stack[-1]] >= precedence[token]:
                output_queue.append(op_stack.pop())
            op_stack.append(token)
        else:  # Assume a number for now, ie. we're ignoring parentheses and other things.
            output_queue.append(decimal.Decimal(token))
    while op_stack:
        output_queue.append(op_stack.pop())
    # Evaluate the post-fix expression.
    funcs = {"+": "__add__", "-": "__sub__", "*": "__mul__", "/": "__truediv__"}
    i = 0  # A for loop would work here, but I hate changing a list while iterating through it.
    while len(output_queue) > 1:
        if output_queue[i] not in funcs:
            i += 1  # Skip this, we'll grab it when we get an operator.
            continue
        i -= 2  # Go backwards to get the two operands.
        left = output_queue.pop(i)
        right = output_queue.pop(i)
        op = output_queue.pop(i)
        # This probably tries to be too clever and a simple 4-clause if statement would be fine.
        output_queue.insert(i, getattr(left, funcs[op])(right))
    return output_queue[0]

In terms of the exercise, this beats eval (again assuming the point isn’t avoiding pointless work) and I feel it beats the previous version, since it’s more readily extendable. Using the magic methods rather than just an if statement is probably unnecessary, but shows an understanding that they exist, that you should call them against the object, and shows an understanding of using functions as first-class objects (having them in the dictionary). It’s more expensive in terms of memory – it could be improved a bit, but generally it’s creating new containers not adjusting the existing one, so will always be worse. The code could do with a bunch of cleanup, especially the RPN evaluation, but it suffices for something done in a quick exercise.

Performance-wise, with a few trivial test statements (not trying to do this seriously at all):

$ python -m timeit -s "import eval_math as m" "m.test_eval()"
10000 loops, best of 5: 21.8 usec per loop
$ python -m timeit -s "import eval_math as m" "m.test_double_loop()"
20000 loops, best of 5: 10.6 usec per loop
$ python -m timeit -s "import eval_math as m" "m.test_rpn()"
10000 loops, best of 5: 24.5 usec per loop

There are, of course, many other ways to do this.

Conclusion

I’m sure that LLMs can assist with coding, and make me more efficient. I don’t feel I have figured out the way to make that happen, yet. More experimenting to do in the future, I suppose.

Top 5 MCU Villains

(More Filmspotting back episode catching up, although not so far back in time now. Post theme music: Escape (The Pina Colada Song) by Rupert Holmes).

Josh and Adam did this as a draft, and I can’t really do that since I don’t know how my choices would have impacted their choices – although I would have got first pick, guessing Killmonger had 1000 kills (low but much closer than they guessed). So, just a regular top 5.

Similar criteria as they used: this isn’t the most powerful villain – it’s the ones that are the most interesting, and that make their film(s) more enjoyable and compelling.

5. Quentin Beck (from Spider-Man: Far From Home) – it’s done regularly in other films, but generally it feels like the MCU doesn’t do films where the villain starts out (appearing) as a hero and then that turns around (although they did it very poorly with Wanda). Good backstory and no real super-powers (although crazy good tech).

4. Ultron – not my favourite Avengers film, but Ultron as a character is really interesting. More humour than a lot of the villains, reasonable backstory.

3. Loki – a tricky choice, since he’s not really a villain any more, but certainly was originally. So much fun, compelling motivation, great dynamics with everyone else. It’s obvious why he kept coming back.

2. Killmonger – for all the reasons that Adam and Josh outlined. Pretty compelling backstory, really flawed.

1. Helmut Zemo (from Captain America Civil War) – compelling motivation, and achieves so much without having any kind of super powers or vast wealth.

Honourable mentions: like Filmspotting, I only included films, but if the TV series were included, then Kilgrave (from Jessica Jones) would definitely be in my top 2 (Salinger and Trish are also great). Fisk (from Daredevil) would probably make my list as well. Ward (from Agents of S.H.I.E.L.D.) would be in the running as well, especially from the earlier seasons. Arthur Harrow (from Moon Knight) would be one I’d have to consider too.

Top 5 Power Chris’s

(Still catching up on back episodes of Filmspotting – from last year, not when they did it in 2018. Post theme: Lady in Red by Chris de Burgh).

5. Chris Pratt – I do like the Guardians movies (less so the second one), but I’m not a big fan of the Jurassic World trilogy. The LEGO Movie and Passengers were good, but nothing else stands out to me.

4. Chris O’Dowd – using the same “5th Chris” as Josh & Adam did. Good in the MCU, but the other things I’ve loved him in are TV, which seems a cheat here (The Big Door Prize, The I.T. Crowd). I did like Juliet, Naked.

3. Chris Hemsworth – really good as Thor (except the latest one). Mixed on Extraction, Men in Black, Huntsman. Disliked Ghostbusters. Wrong era in Home and Away for me to know him from that. A few blind spots here.

2. Chris Pine – the Star Treks were so-so. I liked the first Wonder Woman, but did not like the second. I’ve heard good things about the new D&D movie, but haven’t managed to see it yet. Spider-Verse is one of my favourite movies, but he’s not a big part of that. A lot of other films that are blind spots for me.

1. Chris Evans – easily the best MCU Chris (not just the character, but in the biggest group of the better films). Ghosted was fun, The Gray Man was ok, Lightyear was an odd movie but I liked him in it, hated Don’t Look Up, enjoyed the small cameo in Free Guy a lot, liked Knives Out, liked Gifted, and Fantastic Four was not great.

Review: The Map of Tiny Perfect Things

(A while back The Spinoff asked for suggestions for the best book-to-film adaptations, and this was my suggestion. Post theme song: If You Think This Is Real Life by Blossoms).


The Map of Tiny Perfect Things (2021, streaming on Prime Video) is based on a short story (of the same name) in the anthology book Summer Days & Summer Nights (edited by Stephanie Perkins). In general, I find short stories produce better film adaptations than novels do.

It’s a great story – a modern (and much better) version of Groundhog Day. It’s romantic, deeper than you first think, and has fun playing with time-loop tropes. (Mild spoiler, sorry) it also has a truly wonderful moment when both the narrator and the reader realise that the story isn’t actually about him at all.

The film takes all of this and is true to all the important parts, but also improves on some of the weaker parts of the story. Kathryn Newton (Big Little Lies, Supernatural) and Kyle Allen (West Side Story, The Path) bring their characters to life perfectly. Director Ian Samuels’s style is clearly there, but it’s not as odd as Myrna the Monster, and it’s a stronger story than Sierra Burgess.

The author, Lev Grossman, has had other work adapted (e.g. The Magicians), but neither the source material nor the adaptation is as good as The Map of Tiny Perfect Things.

Anyway, if you know the film I assume you also love it. If you know the story, I strongly recommend the film. If you haven’t read the book, it’s worth it just for this story, but some of the others in the anthology are ok as well, particularly if you’re looking for a light, romantic, read. If you don’t know the film or the book, definitely add it to your watch list.

Too much spin at the spinoff


I try to bring some facts to a very poor post at a site that I typically very much enjoy reading and supporting.

I very strongly believe that nearly all teachers have the best interests of students at heart, and that the same is true for nearly all employees of the Ministry of Education. I also believe that the various Ministers of Education are, at minimum, not opposed to improving outcomes for students.

To the detriment of our children, these groups cannot get along. All of them have made poor decisions in the past, and none of them seem able to get past it. It’s far past time for this to be remedied, so that everyone is able to agreeably work together to improve education in New Zealand. Teachers need to realise that the Ministry isn’t always out to get them, and the Ministry needs to realise that teachers have much to contribute to guiding how education is shaped.

Communication is particularly a problem. When honestly examined, much of what is presented by teacher unions is propaganda, often filled with misunderstanding, half-truths, and paranoia. On the other side, the Ministry’s messaging is so bad that many people believe that this is deliberately so – to distract from some dastardly plan. Personally, I think that Hanlon’s razor applies.

(It’s ironic that Communities of Learning are designed to encourage collaboration between schools, when nearly all schools were already engaged in such collaboration, and it’s collaboration between the policy section of the Ministry and schools that is lacking).

First, some things that this post is not about, to get them out of the way:

It’s incredibly difficult to run a school on the funding that they get from the Ministry; schools are extremely dependent on donations (both time and money) from parents and other fundraising, and doing more than the bare minimum is nearly impossible relying solely on a school’s operations grant.

I do believe that there are improvements that could be made in how the funding that is available is distributed. I don’t pretend to know what the answers are.

There are seven (3 main, 4 supplementary) proposed changes to how education is funded, and the “global budget” is only one of those (and only a supplementary one). I do have concerns about some of the others, and it’s not entirely clear what the impact of them all will be. The Ministry is also pushing a very large amount of change in a very short period of time, which I don’t think is ever wise.

A rough guide to how teaching staffing currently works in NZ (for state, non-Partnership, schools)

Each school is allocated an amount of staffing, based on a formula that’s mostly to do with the number of students expected to be attending the school as of July 1st, which is expressed in full-time teacher equivalents (FTTEs).

Schools are permitted to “bank” staffing. This allows flexibility in when during the year the staffing entitlement is used – for example, if you have an entitlement of 5.5 full-time staff, you could have 5 full-time staff and one half-time staff member for the whole year, or 5 full-time staff for half the year, and 6 full-time staff for the other half.

Schools may also elect to not employ as many (equivalent full time) staff as they are entitled to, and get a refund of actual cash, or (far more commonly) employ more staff than they are entitled to, and pay for those staff from their other funds. There are limits around how much can be “banked”, and you may only overuse or underuse by 10% of your entitlement in a year.

In the school’s accounts, there’s a total amount of money that was spent on the teaching staff (for example, this was $358,570 at Ahuroa School in 2015), and a grant of exactly the same amount of money from the Ministry. Although this shows in the accounts, the school never actually comes into contact with this money, other than with staffing over or under use. For banked staffing, if you have underused you are paid out at $54,500 per FTTE, but if you have overused, you need to pay back at $68,500 per FTTE (2016 rates).

For example, if you’ve been allocated 3.9 staff members, but you decide to top that up to 4.0, then you’ll need to pay from your other funds $6,850. If you’ve been allocated 4.1 staff members, and you only employ 4.0, then you’ll be given $5,450 in additional operational funding.

Schools have no control over how much any teacher is paid, and, for the most part, it makes no difference to a school how much their teachers are paid. You can have all your staff at the top of the pay scale and you consume the same number of FTTE as a school whose staff are all at the bottom of the pay scale.

Where this gets particularly complicated is that Boards can decide (subject to various conditions) to employ staff directly from their operational funding. When this is done, the school still has no control over how much the staff member is paid, but pays the real cost of that staff member’s employment.

The lowest pay for a full-time primary teacher is currently $47,039 per year, and the highest (other than ‘units’ and other additional payments) is $74,460.

Going back to the example where you were entitled to 3.9 FTTE and employed 4.0 – if you have a teacher at the bottom of the pay scale, and employ them (for 0.1 of their time) from your operations grant, this will cost you $4,703 (plus various other incidental costs).

This is around $2,000 less than if you simply overused your staffing entitlement, so principals all around the country employ this technique – as long as you have staff who earn less than approximately $68,500 (which is quite high on the pay scale), you pay them via the operations funding (it makes no difference to the teacher) and balance your staffing entitlement to zero.
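The comparison behind this technique can be made explicit. The following is a hypothetical sketch using the figures quoted above (the $68,500 2016 overuse rate and the $47,039 bottom-of-scale salary); the variable names are mine:

```python
# Hypothetical comparison of two ways to fund a 0.1 FTTE top-up,
# using the rates quoted in the post (2016 figures).
OVERUSE_CHARGE_PER_FTTE = 68_500   # charged when overusing the entitlement
BOTTOM_SCALE_SALARY = 47_039       # lowest full-time primary teacher salary

extra_ftte = 0.1

# Option 1: overuse the staffing entitlement and pay the Ministry's rate
via_overuse = round(extra_ftte * OVERUSE_CHARGE_PER_FTTE, 2)    # 6850.0

# Option 2: pay a bottom-of-scale teacher from the operations grant
via_ops_grant = round(extra_ftte * BOTTOM_SCALE_SALARY, 2)      # 4703.9

saving = round(via_overuse - via_ops_grant, 2)                  # ~2146
```

Any teacher earning under the $68,500 overuse rate makes option 2 cheaper, which is why the loophole is so widely used.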

If you don’t have any staff earning less than that then you can overuse your staffing (up to the 10% limit) and pay less than you would need to otherwise.

This is all rather complicated (and it doesn’t even go into paying for other types of staff, casual staff, units, holiday pay, study grants, or anything else like that). One of the reasons that Novopay was such a mess was that the system is overly complex.

The “global budget” proposal in reality

The intention is to simplify the system, so that instead of having staff paid from both a staffing entitlement and from operational funding, there’s a single staffing credit system.

Schools would be allocated a number of staffing credits each year. There’s no more or less money available, and the formula for credits would be (other than changes from the other 6 proposals) essentially the same formula as for FTTE.

A staffing credit would be valued at the national average teacher salary. For example, if the average was $70,000, then rather than 3.9 FTTE, the school would receive $273,000 in staffing credits. It’s important to note that it’s irrelevant what the teachers at the school are paid – only the national average matters.

Schools would still never touch any of this money. When teachers are paid, schools would be charged at a rate equal to the national average teacher salary. So if your school employed 3.9 equivalent full-time staff, then the school would be charged $273,000.
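The symmetry of the proposed mechanism can be sketched as follows. This is my own illustrative Python sketch, assuming the post’s example national average of $70,000 (the function names are not from the proposal):

```python
# Hypothetical sketch of the proposed staffing-credit mechanism,
# assuming a national average salary of $70,000 (the post's example).
NATIONAL_AVERAGE_SALARY = 70_000


def credits_allocated(entitlement_ftte):
    # Allocation: the entitlement (same formula as FTTE today)
    # valued at the national average salary
    return round(entitlement_ftte * NATIONAL_AVERAGE_SALARY, 2)


def credits_charged(employed_ftte):
    # Charge: actual FTTE employed, also at the national average,
    # regardless of what the school's teachers are actually paid
    return round(employed_ftte * NATIONAL_AVERAGE_SALARY, 2)


credits_allocated(3.9)  # 273000.0 allocated
credits_charged(3.9)    # 273000.0 charged, so it balances exactly
```

Because both sides use the same national average, a school employing exactly its entitlement balances to zero, just as FTTE does today.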

You can see that this balances in exactly the same way that FTTE does. From the Ministry’s point of view, the amount of money that they are paying stays exactly the same. They also don’t care whether any individual school has low-paid teachers or high-paid teachers, because it’s only the national average that makes any difference.

If all the high-paid teachers in the country were replaced with low-paid teachers, that would save the Ministry money (and make no difference to the cost to any individual school), but there’s an admirable separation (both now and in the proposed system) between the people that decide to hire someone and the organisation that bears the cost of that decision.

A “banking” system would still be available, so that you can spread your staffing credit usage throughout the year. There would still be limits on how many credits can be banked, and how much of your credits could be turned into cash if not used for employing teachers. You could “purchase” additional credits to employ extra teachers, just as you can overuse your staffing entitlement or employ staff from the operational funding.

There are two key differences:

  1. Support staff get incorporated into the same staffing credits system. Schools still determine what these staff are paid, and would “purchase” enough staffing credits to cover whatever is required.
  2. The complex workaround – principals sharing the trick of making sure the right staff are paid from operational funding and the right staff from the staffing entitlement – goes away. This is cleaner, but will cost most schools a small amount more, because most schools overuse rather than underuse, and most have enough teachers earning less than the national average to currently save through this loophole.

Realistically, the Ministry could decide at any time to close this loophole by just adjusting the rules around banked staffing. The Ministry could also decide to improve the funding proposal by charging instead at a much lower rate, to encourage employing more teachers (just as they currently discourage underuse of the staffing entitlement by refunding at a low rate).

In the existing system, you’re able to employ the most “expensive” teachers you want, and as long as you have enough “cheap” teachers to cover your employment beyond the staffing entitlement, you “win”. In the new system, you’d be able to employ the most “expensive” teachers you want, without needing to have those “cheap” teachers as a balance. However, this does come at some additional cost.

(For what it’s worth, all the principals I have worked with believe that there are benefits to employing teachers at all positions along the pay scale).

Theoretically, a school could employ many support staff instead of teaching staff, but it’s likely that there would be limits around this (just as there are with everything else).

ERO will still review whether schools are using their staffing entitlement appropriately, principals (and their management teams) will still be the ones deciding how the staffing entitlement is used, and boards (who have a majority of members elected by parents) will still be monitoring and approving what management does in terms of staffing.

There’s a reason that this is only a supplementary proposal, and not one of the main ones. It changes very little. It’s basically just intended to clean things up.

A rebuttal of the Spinoff article

Firstly, this is not bulk funding, and the Ministry is very explicit about that. The fundamental aspects of bulk funding are that it matters how much each of your teachers is paid, and that if you employ fewer teachers you have money available to spend elsewhere. Neither is true under the proposed “global budget” system – you are funded and charged at a national average regardless of the true cost of employing each staff member, and you cannot simply convert your staffing credits into operational grant funds (other than in the limited way that you already can with FTTE).

Eden says that the proposal will result in fewer teachers (and therefore larger class sizes and everything that goes with that). There is no evidence to support this. The same amount of funding will be provided for staffing and schools cannot simply take the cash instead.

There is evidence that counters this. Right now, schools can choose to understaff by 10% and use that additional money for whatever other purposes they wish. Almost no-one regularly does this (you can ask if you want current numbers), and in reality boards are doing exactly the opposite.

Eden says that there will be fewer teacher aide hours. Teacher aides are (generally) not paid from the staffing entitlement, but from operational funding (typically provided by grants from various organisations). The global budget proposal contains no suggestions that the funding amount for teacher aides would change in any way (other proposals could result in changes). The amount spent on teacher aides would not be impacted by this proposal in any way.

Eden claims that it would likely mean untrained teachers in the classroom, because they would be cheaper. Firstly, schools cannot employ untrained teachers – there are strict regulations about teacher registration, and beginning teachers must demonstrate not only that they are trained, but that they are qualified and meet all of the expected standards. Assuming that she is instead disparaging trained, qualified, beginning teachers, this is also untrue. There is no benefit to a school in employing a beginning (‘cheap’) teacher over an experienced (‘expensive’) teacher, because the school pays the same for them either way, just as it does now. The opposite is actually possible, because the need to have ‘cheap’ teachers available to cover staffing overuse will be removed.

The section “So what is the Ministry of Education’s proposal” is flatly wrong. So is the idea that there are “caps to class sizes” (staffing is based on a class size that the Ministry believes is appropriate, but it’s up to each individual school to decide how the staffing is actually utilised).

There will be no decision about whether to pay for power (which is funded separately and adequately in the heat, light, and water component of the operations grant) or a teacher. Her points around property funding changes are to do with one of the other funding proposals (supporting #2), not the global budget. Changing the decile system is also an entirely different proposal (core #2). The rest of the post then devolves into a rant about how education is insufficiently funded, which is true, but nothing to do with the global budget proposal, since that does not change the amount of money available in any way.

It’s great to discuss how we can improve education in New Zealand, including how funding works. But let’s do it with carefully researched facts, not something that you’ve half-heard in a union meeting.


My background: I taught in the tertiary education sector for a while, my father was a teacher, my mother and sister are teachers, and many other family members are teachers or work in some sort of education role. I have been on the Board of Trustees at Ahuroa School for around 7 years, as chair for around half that time (including currently). The facts above are sourced either from publicly available information directly from the Ministry, or directly from the Ministry staff that are overseeing the funding review, at one of the workshops that was held in July 2016.

This post is not a statement from Ahuroa School; please contact chair@ahuroa.school.nz if you would like to have one that is.