It's no coincidence that Docker looked like a package manager if you squinted hard enough (Solomon was the co-founder of Docker). The parallels extend to shared nomenclature: Docker Hub and GitHub, Docker registries. Docker builds made the package management problem evident: slow and clunky containerized installs, difficult caching, and operating-system-level package managers that were never meant to be used that way.
I've written about incremental changes we can make in the package management ecosystem (GitHub's Missing Package Manager) and the importance of package management in general (Package Managers and Developer Productivity).
What open problems might a "universal package manager" solve?
- Satisfiability, Dependency Hell, and NP-Completeness. How do you build a package such that all dependencies are satisfied and each pair of dependencies is compatible? Russ Cox has a series of blog posts (read here) that motivated the need for Go modules. See my Nine Circles of Dependency Hell for a list of what could go wrong.
- Non-standard and idiosyncratic behavior. Yarn's lockfile is almost YAML (but it's not). Pip dependency resolution is not guaranteed to be reproducible.
- Slow installation and suboptimal or bespoke caching. Installing npm packages in development is a very different process from installing them in CI or production. Learning how to write an optimal Dockerfile that includes a language-specific package manager can be difficult and painful (even for experts). Caching methods and infrastructure need to be carefully planned and are different for every language.
- Fragmented ecosystem. If you are serious about your organization's package management, you likely need to host internal registries for pip, npm, cargo, or whatever language package managers your team needs. Some SaaS vendors (e.g., Artifactory) will do this for you, but good cloud services from the hyperscalers don't exist.
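To make the satisfiability problem from the first bullet concrete, here is a minimal sketch of a backtracking resolver. The index format, the package names, and the `resolve` function are all hypothetical and made up for illustration; real resolvers deal with version ranges, optional dependencies, and much larger search spaces.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy index: package -> version -> {dependency: allowed versions}.
Index = Dict[str, Dict[int, Dict[str, List[int]]]]
Requirement = Tuple[str, List[int]]

INDEX: Index = {
    "app": {1: {"web": [1, 2], "db": [2]}},
    "web": {1: {"json": [1]}, 2: {"json": [2]}},
    "db": {2: {"json": [2]}},
    "json": {1: {}, 2: {}},
}


def resolve(
    index: Index,
    requirements: List[Requirement],
    chosen: Optional[Dict[str, int]] = None,
) -> Optional[Dict[str, int]]:
    """Return one package -> version assignment satisfying every constraint, or None."""
    chosen = dict(chosen or {})
    if not requirements:
        return chosen
    (name, allowed), rest = requirements[0], requirements[1:]
    if name in chosen:
        # Already pinned: the existing pin must also satisfy this constraint.
        return resolve(index, rest, chosen) if chosen[name] in allowed else None
    # Try the newest version first and backtrack when a choice leads to a conflict.
    for version in sorted(index.get(name, {}), reverse=True):
        if version not in allowed:
            continue
        deps = list(index[name][version].items())
        result = resolve(index, deps + rest, {**chosen, name: version})
        if result is not None:
            return result
    return None


print(resolve(INDEX, [("app", [1])]))              # {'app': 1, 'web': 2, 'json': 2, 'db': 2}
print(resolve(INDEX, [("web", [1]), ("db", [2])])) # None: web 1 needs json 1, db 2 needs json 2
```

The second call shows the shape of dependency hell: each requirement is satisfiable on its own, but no single version of `json` works for both. The worst case for this search is exponential, which is the practical face of the NP-completeness result.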
What might a universal package manager look like?
- A standard interface akin to the Language Server Protocol. Maybe there's no single implementation that works across all languages, but there are common operations that all package managers could implement (querying dependencies, calculating checksums, etc.).
- Arbitrary DAG execution and caching layer. My enthusiasm for Docker BuildKit is unmatched (here, here, here, and here). BuildKit can do this, but the tricky part is figuring out what UX makes sense (it's not a Dockerfile). For example, you'll probably want content-addressable caching (BuildKit does this already).
- Declarative and reproducible. Most package managers are approaching a declarative model but aren't there yet. Declarative configuration is necessary in a world with ephemeral cloud resources. Reproducibility is essential to avoid drift between development and production (but note that reproducibility is a spectrum).
- Shared libraries for dependency resolution. Even if dependency resolution must be different across languages (due to culture and language quirks), the implementation of the algorithms that power dependency resolution can be shared across languages.
- Simple layer over cloud primitives. Package management should be a simple API over cloud storage. Hosting a package repository should be as simple as "bring your own S3".
- No redundancy in publishing. I touched on this in GitHub's Missing Package Manager, but publishing packages from the source should be simple. Today, you have to rewrite build scripts and provide metadata to every service from which users consume your package. That process is manual and error-prone (not to mention insecure).
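Several of the ideas above (a common interface, content-addressable caching, and "bring your own S3") can be sketched together in a few lines. Everything here is an assumption for illustration: `PackageBackend`, `BlobStore`, `ToyBackend`, `publish`, and the `example-lib` package are hypothetical names, and the in-memory dictionary merely stands in for an object storage bucket.

```python
import hashlib
from typing import Dict, List, Protocol


class PackageBackend(Protocol):
    """Hypothetical common operations a language package manager could expose."""

    def dependencies(self, name: str, version: str) -> List[str]: ...
    def fetch(self, name: str, version: str) -> bytes: ...


class BlobStore:
    """In-memory stand-in for an object store bucket; keys are content digests."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        # Content-addressable: identical bytes always map to the same key.
        digest = hashlib.sha256(data).hexdigest()
        self._blobs.setdefault(digest, data)
        return digest

    def get(self, digest: str) -> bytes:
        data = self._blobs[digest]
        # Verify on read so a corrupted blob is never silently installed.
        if hashlib.sha256(data).hexdigest() != digest:
            raise ValueError(f"blob {digest} failed checksum verification")
        return data

    def __len__(self) -> int:
        return len(self._blobs)


class ToyBackend:
    """Hypothetical backend with one hard-coded package."""

    def __init__(self) -> None:
        self._packages = {
            ("example-lib", "1.0.0"): (b"print('hello')\n", ["example-dep"]),
        }

    def dependencies(self, name: str, version: str) -> List[str]:
        return list(self._packages[(name, version)][1])

    def fetch(self, name: str, version: str) -> bytes:
        return self._packages[(name, version)][0]


def publish(backend: PackageBackend, store: BlobStore, name: str, version: str) -> Dict[str, object]:
    """Upload the artifact once and return a declarative manifest entry for it."""
    digest = store.put(backend.fetch(name, version))
    return {
        "name": name,
        "version": version,
        "sha256": digest,
        "dependencies": backend.dependencies(name, version),
    }


store = BlobStore()
entry = publish(ToyBackend(), store, "example-lib", "1.0.0")
publish(ToyBackend(), store, "example-lib", "1.0.0")  # republishing adds no new blob
print(entry["name"], entry["dependencies"], len(store))
```

Because the store is keyed by digest, publishing is idempotent and the manifest entry doubles as a lockfile line: anyone holding the digest can fetch and verify the artifact from any bucket that has it.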