PyTorch

Speaker: Zachary DeVito

How Usability Improves Performance in PyTorch

What made PyTorch successful? Performance? No - it was initially 20% slower than alternative approaches.

An innovative new algorithm? No, it used an autograd approach developed elsewhere.

Answer: "laser focus" on usability for developers

Exponential growth in efficiency of algorithms (faster than Moore's law) means productivity is more important than performance now.

Don't compromise usability for potential performance gains.

Real networks do not always have fixed sizes ... but many libraries do!

E.g. images are not all the same size, but batches have to be rectilinear. The same applies to NLP, where sequences vary in length.

Padding or scaling is often used to force inputs to a common size, but this is not hardware-efficient.
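To make the cost of padding concrete, here is a minimal sketch in plain Python (not from the talk). The helper names `pad_batch` and `padding_overhead` and the sample sequences are illustrative assumptions; the sketch only counts how many slots in a rectilinear batch hold padding rather than real data.

```python
def pad_batch(sequences, pad_value=0):
    """Pad variable-length sequences so every row has the same length."""
    max_len = max(len(seq) for seq in sequences)
    return [list(seq) + [pad_value] * (max_len - len(seq)) for seq in sequences]


def padding_overhead(sequences):
    """Return the fraction of slots in the padded batch that are padding."""
    max_len = max(len(seq) for seq in sequences)
    total_slots = max_len * len(sequences)
    real_items = sum(len(seq) for seq in sequences)
    return (total_slots - real_items) / total_slots


# Hypothetical token-id sequences of lengths 3, 1 and 6.
sequences = [[5, 3, 9], [7], [2, 8, 4, 1, 6, 3]]
batch = pad_batch(sequences)
wasted = padding_overhead(sequences)
print(f"padded batch shape: {len(batch)} x {len(batch[0])}")
print(f"fraction of slots that are padding: {wasted:.2f}")
```

In this toy batch, 8 of the 18 slots are padding, so a large share of any computation over the batch would be spent on values that carry no information.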

A surprising amount of dynamic behaviour and dynamic sizing occurs in real-world models.

So when is it ok to restrict this dynamic behaviour?

Add restrictions when there are already-realised performance gains. But be much more sceptical when the gains are theoretical.

Create self-contained archives of trained PyTorch programs for transfer learning or deployment.

To address these:

Introduced torch.package, which provides self-contained eager-mode models without the harsher restrictions of TorchScript.
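As a rough illustration of the idea (not code from the talk), the sketch below saves an eager-mode model into a package archive and loads it back with `PackageExporter` and `PackageImporter` from `torch.package`. The in-memory buffer, the `nn.Linear` model, and the package and resource names `"model"` and `"model.pkl"` are assumptions chosen for brevity. A model built from your own classes would typically also need `intern` or `extern` calls to declare how its code dependencies are handled.

```python
import io

import torch
from torch.package import PackageExporter, PackageImporter

# A small eager-mode model; no TorchScript compilation is involved.
model = torch.nn.Linear(4, 2)

# Write the package into an in-memory buffer (a file path also works).
buffer = io.BytesIO()
with PackageExporter(buffer) as exporter:
    # "model" and "model.pkl" are arbitrary names chosen for this sketch.
    exporter.save_pickle("model", "model.pkl", model)

# Load the model back from the archive.
buffer.seek(0)
importer = PackageImporter(buffer)
loaded = importer.load_pickle("model", "model.pkl")

# The reloaded model should produce the same output as the original.
x = torch.ones(1, 4)
same_output = torch.equal(model(x), loaded(x))
print("outputs match:", same_output)
```

The loaded object is an ordinary `nn.Module`, so it can be called, fine-tuned, or inspected like any other eager-mode model.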