I remember when TensorFlow was released in 2015. Kubernetes was released around the same time (part of Google's reasoning for open-sourcing both was to avoid repeating the mistakes it made with Hadoop/MapReduce – see Diseconomies of Scale at Google). It was a time when many of the deep learning models (Inception, ResNet, other CNNs, and DNNs) were built with TensorFlow, and the industry rallied around the framework. Facebook released PyTorch a year later.
Since then, PyTorch seems to be growing faster than TensorFlow.
Why did PyTorch seem to win?
- A more collaborative project – TensorFlow accepts the occasional outside contribution, but development is led internally by Google. External contributors were often blocked by failing internal tests that they couldn't debug.
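- An imperative vs. declarative API. PyTorch runs operations eagerly as ordinary Python, while TensorFlow originally had you declare a graph first and execute it afterwards. While declarative APIs can sometimes be more optimized and purer, imperative APIs are usually simpler to use.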
- There's so much more to the model than model design. Arguably, the "hard" part is often all the other things: figuring out training at scale, debugging, and the deployment pipeline.
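
To make the imperative vs. declarative point concrete, here is a toy sketch in plain Python. Nothing in it is a TensorFlow or PyTorch API: `Node`, `placeholder`, `run`, and `imperative` are hypothetical names invented for illustration. The declarative half builds a description of the computation and evaluates it later, and the imperative half simply computes.

```python
# Toy illustration only: none of these names are TensorFlow or PyTorch APIs.

class Node:
    """A deferred computation: records what to do, computes nothing yet."""

    def __init__(self, op, inputs=(), name=None):
        self.op = op
        self.inputs = inputs
        self.name = name

    def __add__(self, other):
        return Node("add", (self, other))

    def __mul__(self, other):
        return Node("mul", (self, other))


def placeholder(name):
    """A named input whose value is supplied only at run time."""
    return Node("placeholder", name=name)


def run(node, feed):
    """Evaluate a graph once concrete inputs are supplied."""
    if node.op == "placeholder":
        return feed[node.name]
    left, right = (run(n, feed) for n in node.inputs)
    return left + right if node.op == "add" else left * right


# Declarative style: describe the graph first, execute it later.
x = placeholder("x")
w = placeholder("w")
y_graph = x * w + x  # still a Node; no arithmetic has happened
declarative_result = run(y_graph, {"x": 3.0, "w": 2.0})


# Imperative style: each line computes a value immediately.
def imperative(x, w):
    y = x * w  # a real number you can print or inspect right here
    return y + x


imperative_result = imperative(3.0, 2.0)
```

Both halves compute the same value, but in the declarative version `y_graph` is only a description until `run` is called. That makes it harder to inspect intermediate values with a debugger or a `print`, and it also gives a framework the whole graph to optimize ahead of time.
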
Why might TensorFlow still win?
- Facebook does not design its own chips. Google has TPUs, which can be optimized for TensorFlow (and vice versa). Facebook has joined companies like Microsoft and AMD in a partnership called ONNX to do something similar.
- TFLite is still leaps and bounds ahead for mobile deployment of models. Google's organizational knowledge of building and operating Android seems to help.