Sharing a Notebook

The state of the art in generative AI is advancing fast. But, unlike previous AI waves marked by big launches and research papers, generative AI is spreading through a much more grassroots (and unlikely) medium: Google Colab notebooks.

Google Colab notebooks are free Jupyter notebooks that run in the cloud and are easy to share. Many people use them to tinker with models, experiment with code, and share ideas. Interestingly, Colab was launched by Google Research while I was working on Google Cloud AI (we shipped a similar but unbranded Jupyter workflow).

So why are Colab notebooks the medium of exchange?

First, the base infrastructure and models are already open-sourced and mature. During the last wave, TensorFlow and PyTorch were still being incubated as solutions to the problems of deep learning. The biggest models were either closed-source or too complex for the average developer to contribute to.

This time, a lot of the “plumbing” work is being done in forked GitHub repositories and doesn’t require deep knowledge of machine learning or diffusion models. Those changes might be modifying the Stable Diffusion code to run on consumer M1 GPUs, or building web UIs to run text2img or img2img and tune parameters. Or maybe it’s porting the model to a different framework or getting it to run with even fewer resources.
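
As a concrete illustration of that plumbing, here is roughly what text2img on a consumer M1 GPU looks like, assuming Hugging Face's diffusers library and PyTorch's MPS backend (the checkpoint name and prompt are placeholders, not taken from any particular fork):

```python
# A minimal sketch: run a Stable Diffusion text2img pipeline on an Apple M1
# GPU via PyTorch's MPS backend, falling back to CPU if MPS is unavailable.
# The checkpoint and prompt are illustrative placeholders.
import torch
from diffusers import StableDiffusionPipeline

device = "mps" if torch.backends.mps.is_available() else "cpu"

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
pipe = pipe.to(device)
pipe.enable_attention_slicing()  # trade a little speed for lower memory use

# text2img: a prompt in, an image out
image = pipe("a watercolor painting of a lighthouse at dusk").images[0]
image.save("lighthouse.png")
```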

Second, these generative models are more consumer-friendly. Normal users and developers can make sense of them. Inputs (prompts) and outputs (images) are more accessible to the average user than bounding boxes, vector embeddings, or NumPy arrays. Models are smaller and can be run on commodity hardware. Datasets are relatively small, or trained weights are published.

Third, diffusion models are Goldilocks models for Colab: too large to fine-tune or run inference on the average laptop, but small enough to run on spot instances that are given away for free.

There are some interesting implications of Colab being the medium through which ML applications go viral:

  • Security — most of these notebooks download and run code from GitHub. They might ask for permission to access your Google Drive (a typical opening cell is sketched after this list). It isn’t easy to know exactly what’s going on in a notebook, and there are few guarantees that it’s doing what you think it is.
  • Presentation and code — A cardinal programming rule is separating presentation and code. But sometimes it’s helpful to combine the two. I wrote about this in Presentation Next to Code and In Defense of the Jupyter Notebook.
  • Monetization — Colab is unlikely to drive real infrastructure spend for Google Cloud. While some consumers might pay for Colab Pro+ ($50/mo), it doesn't seem like a real business model (is it enterprise SaaS? Does it belong in the same category as Google Workspace Docs/Sheets/Mail?). Google can subsidize Colab through other products, but in the long run, it should be self-sustaining. Maybe it follows a Hugging Face-like playbook (although it's unclear exactly what the end result looks like even in that case).
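
On the security point, the opening cells of a shared notebook typically look something like the sketch below; the repository URL is a hypothetical placeholder, but drive.mount is the actual Colab call that grants the notebook (and whatever code it just cloned) access to your files.

```python
# Typical opening cells of a shared Colab notebook (hypothetical repo URL).
# After drive.mount, anything the cloned code does runs with access to your
# Google Drive; you are trusting code you probably haven't read.
from google.colab import drive

drive.mount("/content/drive")  # prompts for OAuth access to your Drive

!git clone https://github.com/example/some-diffusion-fork.git
%cd some-diffusion-fork
!pip install -r requirements.txt
```
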
Fuzzy Databases

Different trade-offs already give rise to significantly different types of databases – from OLTP to OLAP, relational to non-relational, key-value, graph, document, and object databases (to name a few).

What if you relaxed some key properties that we've come to expect?

What if databases returned extrapolated results?

If you squint, LLMs resemble a kind of vector search database. Items are stored as embeddings, and queries return deterministic yet fuzzy results. What you lose in data loading time (i.e., model training), you make up for in compression (model size) and query time (inference). In the best case, models denoise and clean data automatically. The schema is learned rather than declared.
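
As a toy sketch of that analogy (not a description of any real system), here is a "database" whose rows are stored as embeddings and whose queries return the nearest matches rather than exact ones. The hash-based embed() is a deterministic placeholder for a real embedding model, so only the mechanics, not the semantics, carry over:

```python
# Toy "fuzzy database": rows are stored as embeddings; queries return the
# nearest rows by cosine similarity instead of exact matches. embed() is a
# placeholder that maps each string to a deterministic pseudo-random unit
# vector; a real system would use a learned embedding model.
import hashlib
import numpy as np

def embed(text: str, dim: int = 64) -> np.ndarray:
    seed = int.from_bytes(hashlib.sha256(text.encode()).digest()[:8], "big")
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(dim)
    return v / np.linalg.norm(v)

class FuzzyDB:
    def __init__(self) -> None:
        self.keys: list[str] = []
        self.vectors: list[np.ndarray] = []

    def insert(self, text: str) -> None:
        self.keys.append(text)
        self.vectors.append(embed(text))

    def query(self, text: str, k: int = 3) -> list[tuple[str, float]]:
        q = embed(text)
        scores = np.stack(self.vectors) @ q  # cosine similarity of unit vectors
        top = np.argsort(-scores)[:k]
        return [(self.keys[i], float(scores[i])) for i in top]

db = FuzzyDB()
for row in ["blue whale", "sperm whale", "oak tree", "maple tree"]:
    db.insert(row)

print(db.query("blue whale"))  # the stored row comes back first with score ~1.0
```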

What if anyone could write to the database?

Blockchains are databases as well. They provide a verifiable ledger and peer-to-peer write access in exchange for significant trade-offs in privacy, throughput, latency, and storage costs. Keys are hashes (similar to a distributed hash table). Authorization is done through public-key infrastructure, and a generalized computing model can be built on top of the distributed ledger (e.g., the EVM).
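
Here is a toy sketch of two of those properties (keys as content hashes, and a tamper-evident hash-chained ledger), using only the standard library; a real chain layers consensus, networking, and public-key signatures on top of this:

```python
# Toy append-only ledger: each record's key is the hash of its contents,
# and each record commits to the previous one, so any tampering breaks
# verification. No consensus, signatures, or networking here.
import hashlib
import json

class Ledger:
    def __init__(self) -> None:
        self.blocks: list[dict] = []

    def append(self, payload: dict) -> str:
        prev = self.blocks[-1]["hash"] if self.blocks else "0" * 64
        body = json.dumps({"prev": prev, "payload": payload}, sort_keys=True)
        digest = hashlib.sha256(body.encode()).hexdigest()
        self.blocks.append({"prev": prev, "payload": payload, "hash": digest})
        return digest  # the record's "key" is its content hash

    def verify(self) -> bool:
        prev = "0" * 64
        for block in self.blocks:
            body = json.dumps({"prev": prev, "payload": block["payload"]}, sort_keys=True)
            if block["prev"] != prev or hashlib.sha256(body.encode()).hexdigest() != block["hash"]:
                return False
            prev = block["hash"]
        return True

ledger = Ledger()
key = ledger.append({"from": "alice", "to": "bob", "amount": 5})
print(key, ledger.verify())  # verify() stays True until any block is edited
```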

What if the database could be embedded anywhere?

SQLite and DuckDB answer this question. While neither supports concurrent writes from different processes, and both are limited in terms of horizontal scaling, they can be easier to use and fit into more workflows (e.g., serverless, edge, browser). In many cases, they are operationally much easier to manage than a traditional database.
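
For example, with Python's built-in sqlite3 module the entire "database server" is a library call and a file (or memory), which is why it drops so easily into serverless and edge workflows:

```python
# A minimal embedded database: no server process, just the standard-library
# sqlite3 module and an in-memory (or on-disk) database file.
import sqlite3

conn = sqlite3.connect(":memory:")  # pass a file path instead to persist to disk
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO events (name) VALUES (?)", [("signup",), ("login",)])
conn.commit()

for row in conn.execute("SELECT id, name FROM events"):
    print(row)  # (1, 'signup'), (2, 'login')

conn.close()
```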

You could also look at these databases through the lens of hard-to-compute, easy-to-verify.

Human-in-the-Loop and Other AI Mistakes

The 2016 influx of deep learning startups was marked by human-in-the-loop AI. Chatbots and magic assistants were powered by a human on the other side. Driverless cars had a remote driver handling most interactions.

The general playbook in 2016 went something like this:

The performance of deep neural networks scales with data and compute. Extrapolating 2016 results over the next few years shows that we'll have ubiquitous human-level conversational AI and other sophisticated agents.

To prepare for this, we'll be first to market by selling the same services today, except with humans augmented by the models. While this will initially be a loss leader, it will be extremely powerful once the models are good enough to power the interaction on their own (without humans). By then, we'll have the best distribution.

Of course, we still don't have the level of conversational AI that can power magic assistants or chatbots without a human in the loop. Most of these startups ran out of money. The most successful were those that sold to strategic buyers (Cruise/GM, Otto/Uber, Zoox/Amazon) or those that sold picks and shovels (Scale AI).

Extrapolating performance for ML models is challenging. More data, more compute, or different architectures don't always mean better performance (look at some of the initial results from Stable Diffusion 2.0).

We don't seem to be making the same mistakes as 2016 in the era of generative AI. Some companies are solving for distribution using someone else's proprietary model (e.g., Jasper AI/GPT-3), but these products deliver real value to customers today – with no human in the loop. If LLM performance plateaued, these companies would likely still have some intrinsic value.