There's a curious case of a neural network for object detection called YOLO (You Only Look Once). While many object detection models at the time were two-pass (one pass to propose bounding boxes, another to classify what is inside them), YOLO was single-pass: it predicts boxes and classes together. This made YOLO fast and small.
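To make the two-pass versus single-pass distinction concrete, here is a toy sketch in Python. Nothing in it is a real network: `two_stage_detect`, `single_pass_detect`, `CallCounter`, and the hard-coded boxes are all hypothetical stand-ins, and the only thing it shows is how many times a model has to run per image under each design.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    box: tuple  # (x, y, width, height), made-up values
    label: str


class CallCounter:
    """Counts how many times a stand-in network is run."""

    def __init__(self):
        self.calls = 0


# Hard-coded candidate boxes standing in for real model output.
FAKE_BOXES = [(0, 0, 10, 10), (5, 5, 20, 20), (30, 30, 8, 8)]


def two_stage_detect(image, counter):
    # Pass 1: a proposal step suggests candidate bounding boxes.
    counter.calls += 1
    proposals = list(FAKE_BOXES)

    # Pass 2: a classifier runs once for every proposed box.
    detections = []
    for box in proposals:
        counter.calls += 1
        detections.append(Detection(box, "placeholder"))
    return detections


def single_pass_detect(image, counter):
    # One forward pass returns boxes and labels together.
    counter.calls += 1
    raw_output = [(box, "placeholder") for box in FAKE_BOXES]
    return [Detection(box, label) for box, label in raw_output]


two_stage_counter = CallCounter()
single_pass_counter = CallCounter()
two_stage_result = two_stage_detect(None, two_stage_counter)
single_pass_result = single_pass_detect(None, single_pass_counter)
print(two_stage_counter.calls, single_pass_counter.calls)
```

In this sketch the two-pass version runs a model once per proposed box on top of the proposal step, while the single-pass version runs once regardless of how many objects are in the image. Real two-pass detectors share more work between passes than this toy does, so treat the call counts as an illustration rather than a benchmark.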

I used a modified version of YOLO for my model ScapeNet: Real-time Object Detection in Runescape. Except, my YOLO wasn't really the same YOLO. There are almost a dozen different models, each claiming to be YOLO, written by other authors.

Which is the real YOLO? Does it matter? What makes a model "win"?

YOLO (v1, 2015) was originally written by Joseph Redmon in his own neural network framework, Darknet. He later updated it with v2 (2017) and v3 (2018).

Researchers at Baidu forked YOLOv3 into a model called PP-YOLO.

YOLOv4 (2020) was released by a different author, Alexey Bochkovskiy. His repository is a fork of Redmon's original Darknet repository, and of all the successors it is the closest to the original in architecture.

YOLOv5 was written by Glenn Jocher, and implemented in PyTorch.

Meituan released a model called MT-YOLOv6, which is also known simply as YOLOv6.

Bochkovskiy (author of v4) also released a new model called YOLOv7.

Machine learning model naming is tough. Most users won't dig into how an architecture actually works. Some versions might differ only on "non-research" elements: a better developer experience, a different implementation or framework, or a different end-user API.

The threat of the hard fork might be even greater with open-source model architectures than it is with open-source software.