Hi, I'm Pierce Freeman. I'm an ML engineer and founder.
Get in touch: pierce@freeman.vc
~ ~ ~

Typehinting from day-zero

July 11, 2024

Static typehinted languages can make us lazy about adding types at the right time. We have all the context when we start a new project, but as we increase complexity and focus on other things that context wanes. Rewards compound from typing on day zero.

Legacy code and AI copilots

July 11, 2024

In addition to the core language, code LLMs also have to interplay with an ecosystem of constantly changing dependencies. Each new package version can change features, functions, and syntax. What are some long-term approaches to making coding assistants more aware of the package ecosystem?

Generating database migrations with acyclic graphs

July 3, 2024

Mountaineer 0.5.0 introduced database migration support, so you can now upgrade production databases directly from the CLI. It generates SQL for you automatically, so you don't have to write table migrations by hand, and removes the need for third-party packages to support the same functionality. Let's dive into the details of how we implemented the engine.
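
As a rough illustration of why an acyclic graph is useful here (a minimal sketch, not Mountaineer's actual engine): if each table lists the tables it references, a topological sort yields an order in which every table is created after its dependencies. The table names and the `order_tables` helper below are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def order_tables(dependencies):
    """Return table names so that each table comes after the tables it depends on.

    `dependencies` maps a table name to the set of tables it references.
    Raises graphlib.CycleError if the dependencies are not acyclic.
    """
    return list(TopologicalSorter(dependencies).static_order())


# Hypothetical schema: posts reference users, comments reference both.
deps = {
    "users": set(),
    "posts": {"users"},
    "comments": {"users", "posts"},
}

creation_order = order_tables(deps)
print(creation_order)  # ['users', 'posts', 'comments']
```

Reversing the same order gives a safe sequence for dropping tables.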

Mountaineer v0.1: Webapps in Python and React

February 27, 2024

Today I'm really excited to open source a beta of Mountaineer, an integrated framework to quickly build webapps in Python and React. Its initial goals are quite humble: make it really pleasurable to design systems with these two languages.

Constraining LLM Outputs

February 20, 2024

LLMs are by definition probabilistic; for each new input, they sample from a new distribution. Even the best prompt or finetuning will minimize (but not fully resolve) the chance that they give you output you don't expect. This is unlike a traditional application API, where the surface area is known and the fields have a guaranteed structure.
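
To make the contrast with a traditional API concrete, here is a minimal sketch of checking model output after the fact. This is validation, not constrained generation, and not necessarily the approach the post takes. The `EXPECTED_FIELDS` schema and the `parse_llm_output` helper are hypothetical.

```python
import json

# Hypothetical schema: the fields we asked the model to return, and their types.
EXPECTED_FIELDS = {"name": str, "age": int}


def parse_llm_output(raw):
    """Return the parsed dict if it matches the expected structure, else None."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        # The model returned something that is not valid JSON at all.
        return None
    if not isinstance(data, dict):
        return None
    for field, expected_type in EXPECTED_FIELDS.items():
        # A missing field yields None from .get(), which fails the type check.
        if not isinstance(data.get(field), expected_type):
            return None
    return data


good = parse_llm_output('{"name": "Ada", "age": 36}')
bad = parse_llm_output("Sure! Here is the JSON you asked for.")
```

A caller can retry the request whenever the result is `None`, which lowers the failure rate but still gives no guarantee.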

Passthrough above all

February 15, 2024

In the Vision Pro, there's sometimes a conflict between the window's existence and your own passthrough reality. Try to place one in a room and then walk through a doorway, peeking back at the room from within the door frame. Practically speaking, it's better to keep the reality of what people are actually seeing than to keep the reality of the augmented reality.

How quick we are to adapt

February 6, 2024

Most of the people I know in San Francisco have used a Waymo at least once. Many friends of mine swear by them. The fact they're self driving doesn't really enter into the equation: they just prefer the product they're being offered when they're picked up.

The curious case of LM repetition

January 22, 2024

I was doing some OSS benchmarking over the weekend and was running into an odd issue. Some families of models would respond with near-gibberish, even with straightforward prompt inputs. This is a debugging session for LLM repetition.

Debugging chrome extensions with system-level logging

December 19, 2023

Extensions are basically mini web applications these days, just with access to a `chrome` global variable that can interact with some browser-level functionality. Aside from that - it's all familiar. That extends to the debugging experience. Since extensions run in the regular V8 Chrome runtime, Chrome exposes the same debugging tools that you're used to on the web.

Speeding up runpod

December 18, 2023

One issue I've occasionally observed on Runpod is varying runtime performance box-to-box. My working mental model of VMs is that you have full control of your allocation; if you've been granted 4 CPUs you get the ability to push 4 CPUs to the brink of capacity. Of course, the reality is a bit more murky depending on your underlying kernel and virtual machine manager, but usually this simple model works out fine.

Inline footnotes with html templates

December 17, 2023

I couldn’t write without footnotes. Or at least - I couldn't write enjoyably without them. They let you sneak in anecdotes, additional context, and maybe even a joke or two. They're the love of my writing life. For that reason, I wanted to get them closer to the content itself through inline footnotes.

Parsing Common Crawl in a day for $60

December 14, 2023

In addition to forming the bulk of the foundation of modern language models, there's a ton of other data buried within Common Crawl. Incoming and external links to websites, referral codes, leaked data. If it's public on the Internet, there's a good chance CC has it somewhere within its index. Here we parse all of Common Crawl in a day, on the cheap.

The Next 10 Years

August 24, 2023

Personal notes for where we're headed over the next 10 years. While the future is never written in stone, I'm 90% sure of these outcomes. Past a decade, my confidence diminishes significantly.

Adding wheels to flash-attention

August 20, 2023

flash-attention is a low level implementation of exact attention. Unlike torch, which processes attention multiplications in sequence, `flash-attention` combines the operations into a fused kernel, which can speed up execution by 85%. And since attention is such a core primitive of most modern language models, it makes for much faster training and inference across the board. It now has an install time that's just as fast.

LLMs as interdisciplinary agents

May 26, 2023

The real breakthrough with large language models might not be exceeding human levels of performance in a discrete task. Perhaps it's enough that they can approach human level performance in a variety of tasks. There might be more whitespace in intersectional disciplines than aiming for true expert status in any one.

Representations in autoregressive models

May 11, 2023

One of my more memorable CV lectures in college opened with a declaration: representations in machine learning are everything. With a good enough representation, everything is basically a linear classifier. Focus on the representation, not on the network.

Let's talk about Siri

April 28, 2023

Last weekend I spent some serious time with Siri for the first time in a couple years. A lot has changed since I last took a look. Since iOS 15, all NLU processing is done locally on device. There's a local speech-to-text model, a local natural-language-understanding module, and a local text-to-speech model. All logic appears hard-coded and baked into the current iOS version.

Minimum viable public infrastructure

April 27, 2023

You can't iterate when you're building huge things. You also can't tolerate failure in the same way. You don't want a bridge constructed in a month only to fall down the year after. The bulk of the bureaucracy for infrastructure is making sure projects meet this bar of safety; safe to use, safe to be around, and safe for the environment. There's no such thing as MVP Public Infrastructure.

Reasoning vs. Memorization in LLMs

April 13, 2023

By virtue of their training objective, LLMs are optimized to model language and minimize the perplexity of examples. Memorization of input facts is an expected byproduct of this pipeline. General reasoning skills are the more unexpected emergent property.

Automatically migrate enums in alembic

March 30, 2023

If you're using SQLAlchemy as your database ORM, there's a good chance you're using Alembic to migrate across revisions. Alembic doesn't support enums out of the box. Keep enum values in code synced up with database values.
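
A minimal sketch of the syncing idea, assuming PostgreSQL and assuming the database stores each member's value (not its name). It does not use Alembic's API. It only diffs a Python enum against the values already in the database and builds the `ALTER TYPE ... ADD VALUE` statements a migration would need. The `Status` enum and the `missing_enum_statements` helper are hypothetical.

```python
from enum import Enum


class Status(Enum):
    PENDING = "pending"
    ACTIVE = "active"
    ARCHIVED = "archived"


def missing_enum_statements(enum_cls, type_name, db_values):
    """Build PostgreSQL statements for enum values present in code but not in the database."""
    existing = set(db_values)
    return [
        f"ALTER TYPE {type_name} ADD VALUE '{member.value}'"
        for member in enum_cls
        if member.value not in existing
    ]


# Suppose the database type currently only knows two of the three values.
statements = missing_enum_statements(Status, "status", ["pending", "active"])
print(statements)  # ["ALTER TYPE status ADD VALUE 'archived'"]
```

Removing or renaming values is harder in PostgreSQL and is left out of this sketch.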

Greater sequence lengths will set us free

March 20, 2023

GPT-4 represents the latest leap in LLM sequence length. Doubling down on long-term dependencies might be the advance we need for real business value and machines that operate closer to humans.

On learning to ski

March 8, 2023

Common wisdom says children explore while adults exploit. At some point, we tend to transition from one to the other - perhaps because of risk intolerance, time limitations, or sheer laziness. I learned how to ski this year, which was the first new sport I've picked up in at least a decade. Some thoughts on learning new things and throwing yourself down mountains in the process.

Using grpc with node and typescript

February 16, 2023

Most of the grpc docs use the dynamic approach - I assume for ease of getting started. The main pro to dynamic generation is faster prototyping if the underlying schema changes, since you can hot reload the server/client. But one key downside is not being able to typehint anything during development or compilation. For production use, compiling it down to static code is a must.

Opportunity years

February 15, 2023

The last few months have been tough for a lot of people. Layoffs, down rounds, and bankruptcies jolt the expected progression of life. Decisions that were within grasp are now no longer. Despite the environment, people in tech are more optimistic than the media might lead you to believe.

Buzzword peaks and valleys

February 14, 2023

It's 2023 and once again, we are all in on AI. This is thanks in part to the cultural phenomenon that is ChatGPT. Many companies are racing to deploy AI models (generative where possible) just to put it on their slide deck. Like clockwork, three years later, we've reverted back to AI. It sometimes feels like we're back in 2017.

Network routing interaction on MacOS

January 2, 2023

There are a series of resolution layers governing DNS, IP, and port routing on OSX. Included are notes on the different routing utilities supported locally, specifically using /etc/hosts, ifconfig, pfctl, and /etc/resolver.

The provenance of copy and paste

December 19, 2022

Copy and paste is ubiquitous. A topic that receives less attention, however, is the provenance of data that flows into and out of your clipboard. I often find myself going through documents that I've written or were written by colleagues. I almost inevitably have to wonder where in the world some of the data came from. A thought experiment for a copy and paste implementation that retains a history chain going back to the original source.

Debugging tips for neural network training

December 16, 2022

Practical notes for debugging more complicated training pipelines and architectures, informed by pure research and productionalizing models in industry. This guide has a bias towards debugging large language models.

AWS vs GCP - GPU Availability V2

November 14, 2022

A revised comparison between GPU availability for AWS and GCP. Includes some internal strategies for GCP request allocation. Updated benchmarking numbers.

Independent work: October recap

November 5, 2022

It's been a month since going full time on my own thing. In some ways I'm surprised by how natural the transition has been. This is a short progress update on the first month of going independent. Finished a first launch of GrooveProxy with some progress on Popdown.

Relationship modeling

October 25, 2022

Given the pandemic's isolation of friends and friend groups, I've been thinking a lot about relationships. Which ones fulfill, which ones entertain, and which ones are resilient to strain. Why do we spend so much time talking about the past or trying to predict the future?

The power of status updates

October 19, 2022

There's a reason why dashboards have become increasingly common over the last decade. Hearing from people with more context can immediately dissolve fears. In that way trains have a lot to do with status pages.

A new chapter

October 13, 2022

Last week I said goodbye to my colleagues at Globality after five years on their engineering team. It's hard to believe it's been so long. I still remember my first day perfectly - no laptop, no desk, not even a manager to greet me. I ended up writing my first PR on a personal computer in the kitchenette. I've been putting a lot of thought into what I want to focus on next. Here's my current list.

Give my library a coffee shop

September 28, 2022

Libraries might be one of the greatest assets in modern America. They're free, have an extensive selection, provide technological support, dot cities and rural counties alike, and are often beautifully architected. Their physical spaces are also increasingly underutilized.

AWS vs GCP - GPU Availability V1

September 21, 2022

Cloud compute is backed by physical servers. And with the chip shortage of CPUs and GPUs those resources are more limited than ever. After encountering some reliability issues with on-demand provisioning of GPU resources on Google Cloud, I put together a benchmarking harness to test AWS vs. GCP availability.

Headfull browsers beat headless

September 7, 2022

Twenty years ago a simple curl would open up the world. HTML markup was largely hand designed so id and name attributes were easily interpretable and parsable. Now most sites render dynamic content or use template defined class tags to define the styling of the page. Building a headfull browser container to more easily deploy and debug Chromium in a remote cluster.

Webcrawling tradeoffs

September 6, 2022

A couple of years ago I built our internal crawling platform at Globality, which needed to be capable of scaling to billions of pages each crawl. The two main types of crawlers that are deployed in the wild are typically raw or headless. We ended up implementing a hybrid architecture. Hybrid crawling can make use of the strengths of both while trying to minimize their weaknesses.

Busses can fool me thrice

August 30, 2022

Public transit is often framed as necessary philanthropy for cities. It cuts down on cars and pollution at the expense of convenience. If people can more efficiently get to their destination by other means, they will. This is the wrong way to look at things. For public transit to really work, it needs trust. The main KPI for a transit system has to be adherence to schedule.

Falling for Kubernetes

August 7, 2022

I default to bare metal where I can. But recently I had to adopt a more complicated server management solution. And after a couple months of building for kubernetes, I must admit I'm falling for it more every day.

Content that I'm obsessed with

August 2, 2022

A constantly updating collection of content that I highly recommend to others. Movies, TV, and Books. Updated occasionally if something has staying power of more than 3 months.

Remote work is a better tourism

June 9, 2022

Over the pandemic I've been able to work from a variety of places. I've vacationed in most of them before. In almost all cases, I've vastly preferred working there. It gives you the encouragement to do what locals do. You're way more likely to meet people who live there if you engage them where they're most likely to be doing work and living lives themselves.

The new opportunity in travel

May 29, 2022

There is going to be a new class of travel option: working by day and socializing by night. This model upends traditional tourist activities since it encourages a participation in local cultural life, like the working professionals that live in that city full time.

Labor markets calibrate satisfaction

May 10, 2022

What explains the differing pay between talented people in different careers? Something is clearly lost in our typical conversation about what a salary includes.

Installing FastText on an M1 Mac

May 5, 2022

We rely on FastText in some of our NLP microservices. Since upgrading to an M1 Macbook, these dependencies have failed to build wheels.

Architecting a blog

January 4, 2022

An obligatory post on blog architecture. I started focusing more on writing this year and wanted to rethink my workflow to make it a bit more frictionless. I started with the writing experience that I wanted in my IDE and moved on to the markdown compilation tooling.

Write where you are

January 3, 2022

Publishing has always been my bottleneck. During stints on Wordpress or Medium, I was so focused on how articles looked that it often got in the way of what they said. This year I want to change that trend.

Treat engineers as users

December 30, 2021

It's an underemphasized asset of successful engineering startups: they make developing enjoyable. More companies need to follow their lead and treat their internal teams like users. Give them a UX that they can enjoy.

Scoping an ML feature

April 26, 2021

Most confusion when building ML features comes at the beginning of a project. The goals are vague, the data isn’t in the expected format, or the metrics are ill-defined. This is a key place for product managers to articulate user needs in a way that machine learning researchers can translate into a well-defined research problem.

AI needs a better definition

April 14, 2021

People label AI as anything and everything these days. You have search systems, you have process automation, you have spam filters. If motion activated supermarket doors were invented today, I guarantee they’d be branded AI too.

NFTs are nothing new

April 5, 2021

NFTs have exploded into mainstream conversation over the last few weeks. Like with everything in crypto, you have strong bulls and equally strong bears on the investment thesis. Are the principles behind NFTs anything new? And what can collector culture tell us about the investment opportunities with these new tokens?