
# AI needs a better definition

April 14, 2021

Have you tried to define Artificial Intelligence lately? Go ahead and find a common thread linking the tens of thousands of marketing websites that scream it from the rooftops. I’ll wait.

People label AI as anything and everything these days. You have search systems, you have process automation, you have spam filters. If motion-activated supermarket doors were invented today, I guarantee they’d be branded AI too.

The academic community certainly doesn’t help provide a clear definition. AI encompasses a broad range of research pursuits: search systems, computer vision, chatbots, natural language processing, robotics, and game playing. At most universities it’s a department or subdepartment, the same way you’d find Systems Design or Information Theory. When’s the last time you heard your manager ask to apply “systems design” to solve a problem and leave it at that?

Then again, academia has never been particularly interested in domain clarity. But as Artificial Intelligence and Machine Learning break out of universities and onto brand-strategy whiteboards, this vagueness becomes a pivotal problem in industry. It confuses the definition of objectives, misplaces stakeholder expectations, and leads to the wrong assumptions about what engineering owns and what research groups own.

It’s time for a clearer definition of AI.

## AI in industry is a world apart from academia

I’m going to let you in on a secret of how AI models are built in industry.

You might imagine a researcher analyzing datasets, fine-tuning the perfect model, and then shipping it to clients. After all, that’s what we typically do in the academic world or on sites like Kaggle. Those problems have clear scopes that lend themselves to straight machine learning: you have input data and an expected output, usually a few labels or sometimes a continuous value. If your model does well against the metrics, you’re done. Real problems are not nearly as clean.

Consider self-driving cars. You could train a single model that goes directly from camera input to steering predictions. Some have. But automakers actually selling self-driving vehicles broke this complex challenge into tens of subtasks: lane detection, obstruction detection, speed limit detection, and so on. Each is benchmarked and optimized separately. A control system then combines these outputs through a series of rules that specify how the car’s physical controls should respond. This is strategic.
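
To make that concrete, here’s a rough sketch of what the decomposition might look like in code. Everything here is hypothetical (the stubs stand in for separately trained models), but the shape is the point:

```python
from dataclasses import dataclass

# Hypothetical subtask models. In practice, each would be a separately
# trained, benchmarked, and versioned ML model; stubbed here so it runs.
def detect_lane_offset(frame) -> float:
    return 0.1  # meters from lane center

def detect_obstruction(frame) -> bool:
    return False

def detect_speed_limit(frame) -> float:
    return 50.0  # km/h

@dataclass
class Perception:
    lane_offset_m: float
    obstruction_ahead: bool
    speed_limit_kph: float

def perceive(frame) -> Perception:
    # Each subtask can be evaluated and improved in isolation,
    # so a regression in one model doesn't hide inside the others.
    return Perception(
        lane_offset_m=detect_lane_offset(frame),
        obstruction_ahead=detect_obstruction(frame),
        speed_limit_kph=detect_speed_limit(frame),
    )

print(perceive(frame=None))
```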

With all machine learning, there’s some probability of failure. When lives and livelihoods are at stake, you want your system to be:

- **Interpretable:** Systems should allow for some introspection to uncover why they’re failing. Is there bias involved? Are they good at lane detection but bad at speed limit detection? If an end-to-end system is too opaque to answer those questions, researchers in industry will work to break the problem down into clearer model definitions.

- **Deterministic:** Systems should perform the same way, given the same input. You don’t want there to be some probability that your self-driving car will hit a person in a crosswalk. If a sensor detects a person in your field of view, you want to stop. One hundred percent of the time. This is the realm of regular programming, where you can engineer a fixed, dependable behavior given a clear input (sketched below).
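
Carrying the self-driving example forward, that deterministic layer is just code. A toy controller, with made-up signals and thresholds:

```python
def control(person_in_path: bool, obstruction_ahead: bool,
            speed_kph: float, speed_limit_kph: float) -> dict:
    """Rule-based controller: the same inputs always produce the same command."""
    # Hard safety rule: if any detector reports a person or obstruction
    # in the path, brake. No learned component can override this branch.
    if person_in_path or obstruction_ahead:
        return {"throttle": 0.0, "brake": 1.0}
    # Deterministic speed governance on top of the ML detector's output.
    if speed_kph > speed_limit_kph:
        return {"throttle": 0.0, "brake": 0.2}
    return {"throttle": 0.3, "brake": 0.0}

# One hundred percent of the time.
assert control(True, False, 30.0, 50.0) == {"throttle": 0.0, "brake": 1.0}
```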

## Let’s start with something that is well-scoped

ML has a clear definition. You have some input and you get some output. The internals might be complicated, but at the end of the day it’s still just f(x)=y. The system learns how to build this function to be as accurate as possible, given the data it has access to. These are the key pieces:

  1. The model accepts data to correlate input to output. In supervised learning, this data is provided explicitly. In unsupervised or reinforcement learning, data is provided implicitly as the system generates it for itself. But still — it fulfills the same model contract.
  2. The model uses this data to build the function. In the future, you expect to provide a novel input and get back a reasonable output, similar to how a human would handle that same new input. (A toy sketch of this contract follows below.)
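
Here’s that contract in miniature, using numpy least squares on made-up data. The details are toy-sized on purpose; the point is the shape of f(x) = y:

```python
import numpy as np

# 1. Data correlating input to output (supervised: provided explicitly).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])  # roughly y = 2x

# 2. Build the function f from the data (here, a least-squares line).
A = np.stack([x, np.ones_like(x)], axis=1)
slope, intercept = np.linalg.lstsq(A, y, rcond=None)[0]

def f(x_new: float) -> float:
    return slope * x_new + intercept

# A novel input yields a reasonable output, per the learned pattern.
print(f(6.0))  # ~12
```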

The term ML doesn’t suffer from the same curse as AI. If a system fulfills these conditions, it’s ML. Otherwise it’s not.

## ML fills the gaps in deterministic systems

At the end of the day, you want your product to solve a problem for your users. With enough need-finding, you’ll develop a point of view for what this solution will be. Product leaders and engineers have an intuitive sense for how to program functions that automate this point of view. Software has been doing this for the last four decades.

The missing puzzle piece is when you can’t program the solution. You can clearly define the input / output contract of a system but have no idea how to get from point A to point B programmatically. You can do it as a human, yet you have no hope of articulating how a machine could do the same thing. That’s where machine learning really shines.
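
Sentiment classification is the classic example. The contract fits in one line of types, but nobody can hand-write the body; a model learns it from examples instead. A toy sketch with scikit-learn, on made-up data:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# The contract is clear: text in, label out. The body is not something
# you could hand-code with rules; the patterns get learned instead.
train_texts = ["loved it", "great product", "terrible", "waste of money",
               "works great", "broke immediately"]
train_labels = ["pos", "pos", "neg", "neg", "pos", "neg"]

model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(train_texts, train_labels)

print(model.predict(["really great purchase"]))  # ['pos'], with luck
```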

All else being equal, programming is better than machine learning. With programming you can inspect codepaths. You can find edge cases and test for them. The performance doesn’t change over time. If you can possibly code a deterministic solution to a problem, do that. Don’t leave it up to a pattern recognition system.

## A clearer proposal for defining AI in industry

Okay, so let’s consider these two conditions together. We want our system to be as interpretable as possible, which is where programming is most practical. But our system might have aspects that need to be learned from the data, necessitating machine learning. A system that combines both into a solution for users — that should be the working definition of Artificial Intelligence in industry.

Artificial Intelligence is an integrated system that combines rules and ML to deliver value to an end user.

This definition fits all of the truly intelligent systems that I’ve seen in industry. There’s an interplay between these two tools where machine learning fills in the weaknesses of programming, and vice versa. The “secret sauce” of these systems is typically the machine learning model, but don’t undersell the importance of the surrounding rules.
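
In code, that interplay tends to look like an ML score wrapped in plain rules. A hypothetical spam filter, with the model stubbed out:

```python
def spam_score(email_text: str) -> float:
    """Stand-in for a trained classifier returning P(spam)."""
    return 0.97  # hypothetical model output

ALLOWLIST = {"boss@example.com", "alerts@example.com"}

def route_email(sender: str, text: str) -> str:
    # Rule: trusted senders bypass the model entirely (deterministic).
    if sender in ALLOWLIST:
        return "inbox"
    # ML fills the gap rules can't cover: judging unfamiliar content.
    score = spam_score(text)
    # Rules again: thresholds turn a probability into a product decision.
    if score > 0.95:
        return "spam"
    if score > 0.70:
        return "inbox-with-warning"
    return "inbox"

print(route_email("stranger@example.com", "You won a prize!"))  # spam
```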

AI is the glue that holds everything together. Nothing more — nothing less. If you don’t have an ML portion in your system, don’t call it AI.

## Closing

Some intelligent systems deployed within industry, and some research areas within academia, might be excluded from this scope of AI. That’s the point. To make a definition useful it has to draw a prescriptive line defining what it is and what it is not.

This definition sacrifices some breadth to gain a whole lot of clarity when you’re actually working with these systems. The next time you’re in a meeting that asks for more AI in the product, unpack where ML will fit into the system and where the rules are intended to be. If it only has ML, then just call it ML. If it only has rules, call it programming. Reserve AI for the intersection.

Remember: If everything is AI, nothing is.
