Early-stage startups shouldn't run on Kubernetes yet.

But eventually, growth-stage and large companies should be running on Kubernetes in some form. Kubernetes Maximalism doesn't mean one-size-fits-all.

Infrastructure should progressively grow with your workloads and team. How can you choose the right technology now so that you can maximize growth and minimize pain later when you inevitably outgrow it?

This is a deeper dive into one area of the infrastructure stack: container abstractions. There are tons of ways to run containers in the cloud, so it's especially tough to pick the right abstraction at the right time. I'd roughly classify them into four categories:

What follows is a guide to choosing the right container abstraction, broken down by engineering team size: 1e0, 1e1, 1e2, and 1e3+ engineers.

1e0 ≤ team_size ≤ 1e1

Let's take the example of a small team. The developers might have some DevOps experience, but everyone's essentially an SRE. There might be a simple CI/CD pipeline but a limited focus on reproducibility or air-gapped environments. You can get far with serverless functions and event-driven architectures, but you'll probably need a long-running daemon at some point.

I'd be careful with the all-in-one options like AWS App Runner or any service that promises code-to-container-to-deployment. For any team building anything other than a simple web service, you'll run into a wall quickly with those services.

Be wary of simplicity that is hyper-opinionated optimization in disguise – Optimization is Fragile.

My advice for this team: start with serverless container runtimes. On AWS, that would be Fargate on ECS, or on Google Cloud, Google Cloud Run.

The downside is that you'll have to build and upload container images yourself. While many higher-level services will pack up your code and turn it into a container, I don't suggest using them. Once you hit the configurability cliff (e.g., needing to change something that the builder abstracts away), you take on all of the complexity that you thought you had avoided, all at once.

In my experience, these services can be difficult to work with if you use the UI. I'd suggest provisioning them in code with something like Pulumi or AWS CDK.

You don't need a fully baked CI/CD pipeline. It's OK to build and deploy containers locally or with a simple script on GitHub Actions. In the Spectrum of Reproducibility, you only need weak guarantees. While they aren't reproducible and come with many footguns, Docker images are good enough for small teams.
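To make "a simple script" concrete, here is a minimal sketch of what such a deploy script could look like, assuming Cloud Run as the target and the `docker` and `gcloud` CLIs already installed and authenticated. The service name, image path, and region are hypothetical placeholders, and the script defaults to a dry run that only prints the commands.

```python
import shlex
import subprocess


def deploy_commands(service, image, region, context="."):
    """Return the three commands a minimal Cloud Run deploy needs, in order."""
    return [
        ["docker", "build", "--tag", image, context],  # build the image locally
        ["docker", "push", image],                     # upload it to the registry
        ["gcloud", "run", "deploy", service,           # roll out a new revision
         "--image", image, "--region", region],
    ]


def deploy(service, image, region, context=".", dry_run=True):
    """Print each command; also execute it when dry_run is False."""
    for cmd in deploy_commands(service, image, region, context):
        print("+", shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop at the first failing step


if __name__ == "__main__":
    # Hypothetical names: replace with your own project, repository, and service.
    deploy(
        service="api",
        image="us-central1-docker.pkg.dev/my-project/my-repo/api:latest",
        region="us-central1",
    )
```

The same three steps can run from a laptop or from a single CI job. Nothing here gives you reproducible builds, which is the point: at this team size, weak guarantees are enough.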

1e1 ≤ team_size ≤ 1e2

I'd suggest that teams adopting Kubernetes (even the managed versions) have an SRE team or, at minimum, one dedicated SRE.

Reasons you might outgrow a serverless container runtime

The thing about Kubernetes tooling is that (1) there are a lot of APIs to build upon, (2) which results in a Cambrian explosion of tools, and (3) not all of those tools will be useful.

1e2 ≤ team_size ≤ ??

Large engineering teams may want to run Kubernetes themselves, on bare metal or in the cloud.

You'll probably need a dedicated 1e2 DevOps team if you're going down this route. Or, you might be a company exposing Kubernetes in some way to your customers (e.g., a platform service or IaaS-like provider).

Some reasons why you might want to run Kubernetes yourself.

My advice: be careful with the internal platforms and abstractions you build on Kubernetes. Even the best snowflake infrastructure eventually suffers from diseconomies of scale (see Diseconomies of Scale at Google). You shouldn't be wasting engineering cycles competing with or recreating products already offered by cloud hyperscalers.