Typehinting from day-zero

# July 11, 2024

I started programming with C++ and C. Ignoring C's pointer dereference and typecasting hell, both were compiled languages and had natural support for type definitions. But in the startup sphere they fell out of favor, replaced by full-stack development in Node/JavaScript and data-heavy processing logic in Python. The "move fast and don't compile" philosophy seemed to become the industry default.

At Globality, I spent the majority of my time working on codebases that weren't typehinted. After a while I stopped noticing, or at least thought I had. I could more or less guess what variables were doing based on the keyword arguments, docstrings, and naming conventions. But inevitably a variable would be passed somewhere that had an unintended side effect, or you would change a schema and need to wade through downstream client definitions. Only runtime errors and extensive tests could tell you when you'd done something wrong.

Into this interpreter void came typehinting. Unlike compiled types, typehints are just soft suggestions. They're simple, inline annotations of what we intend the values to mean. They're meant to be human-readable more than to precisely specify the memory storage format.

def my_function(a: int):
    return a * 2

my_function(10) # runs fine
my_function("10") # runs fine, but a static typechecker should flag

Runtime execution succeeds in both of these cases. The first call returns the integer 20 and the second returns the string "1010", because multiplying a string by two repeats it. But specifying these annotations lets static analysis run on the abstract syntax tree. Now that we know the first argument of my_function should be an integer, we can flag invalid calls where the known types violate the function signature.
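To make the "soft suggestion" point concrete, here's a minimal sketch that re-declares my_function and shows that the interpreter only stores the hints as metadata. The return annotation is my own addition for illustration, and mypy is just one example of a static checker.

```python
from typing import get_type_hints


def my_function(a: int) -> int:
    # The "-> int" return annotation is an assumption added for this sketch.
    return a * 2


# The interpreter records annotations as plain metadata on the function object.
hints = get_type_hints(my_function)
print(hints)  # {'a': <class 'int'>, 'return': <class 'int'>}

# Nothing checks the hints at call time, so both calls succeed.
int_result = my_function(10)
str_result = my_function("10")  # a static checker such as mypy would flag this call

print(repr(int_result))  # 20
print(repr(str_result))  # '1010'
```

The only place the mismatch surfaces is in the checker's report, which is why the annotations pay off only once a checker is actually part of the workflow.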

Typehinting in these languages isn't particularly new[1], but it's still not the default when starting new projects. It's contagiously easy to create a new js or py file, write some code, and run it. Before you know it you have 25 of those refactored files and need to spend a Saturday going back and adding types.[2]

Typehinting is a funny problem where it's easy to do in the moment and much harder to do in hindsight. You lack the context of what you wanted the function to do in the first place, and of what callers in the system might be using it for today. That uncertainty is magnified by these dynamic languages' willingness to accept or coerce most input values. Plus there's the organizational problem. Once non-typehinted code is shipped and working, it's hard to justify the effort of going back and adding types. It's much like back-porting tests: it catches future issues but doesn't seem to make a difference to the product today.
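One way to see why annotating in hindsight is hard: an untyped helper will happily run on several unrelated types, so the code alone can't tell you which ones were intended. The sketch below uses a hypothetical double function, not one from any real codebase. Strictly speaking Python isn't casting anything here; the * operator just dispatches to whatever each type defines.

```python
def double(value):
    # No annotation: anything that supports "* 2" is accepted.
    return value * 2


# Four unrelated input types, four successful calls.
results = [double(10), double(2.5), double("10"), double([1, 2])]
print(results)  # [20, 5.0, '1010', [1, 2, 1, 2]]
```

Adding a hint after the fact means deciding whether the parameter was meant to be an int, a float, or something wider, and then checking every existing caller against that decision.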

But typehinting is magical when it works well. I've lost count of the number of downstream attribute errors I've caught by switching to @dataclass definitions[3] and noticing breaking behavior where an untyped dict would have failed silently. The same goes for calling third-party functions. If their signatures change across minor versions, static typehinting is always my first sanity check of library compatibility.
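Here's a small sketch of the dict versus dataclass difference. The User schema and the misspelled "emial" field are hypothetical, made up purely for illustration.

```python
from dataclasses import dataclass, fields


@dataclass
class User:
    # Hypothetical schema used only for this illustration.
    name: str
    email: str


user_dict = {"name": "Ada", "email": "ada@example.com"}
user_obj = User(name="Ada", email="ada@example.com")

# Typo in the key: dict.get() quietly returns None and the bug travels downstream.
missing = user_dict.get("emial")

# The same typo on the dataclass fails loudly at runtime, and a static
# checker would report the attribute access before the code ever runs.
try:
    user_obj.emial
    typo_raised = False
except AttributeError:
    typo_raised = True

# Dataclass definitions also stay inspectable at runtime.
field_names = [f.name for f in fields(User)]

print(missing)      # None
print(typo_raised)  # True
print(field_names)  # ['name', 'email']
```

The dict version only misbehaves later, wherever the None finally gets used, which is usually far from the line with the typo.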

Perhaps as important, though opinions certainly vary on this, I find it makes code far more readable. At best it sounds like prose:

okay, this int variable value gets multiplied by two

Versus without types:

okay, this passed value falls back on Python's default * handling

Even without executing the code, one statement is much more obvious than the other.

Typehinting my code is something I force myself to do from the get-go these days, even on small side projects, since even side projects can far exceed their expected scope. And why not add a little bit of developer joy to your own life?


  1. PEP 484 was first drafted in 2014 and introduced in Python 3.5, and TypeScript has been around since 2012.

  2. Can you tell this is how I've spent a few weekends? 

  3. Or interfaces in TypeScript, which are much more flexible in their definition syntax. But they can't be inspected at runtime. So like everything, there are trade-offs.

Hi, I'm Pierce

I write mostly about engineering, machine learning, and company building.