
Adapted this awesome minGPT to use PyTorch lightning!

williamFalcon opened this issue • 5 comments

Fixes #13 Fixes #11 Fixes #12 Fixes #8 Fixes #7

Huge fan, and amazing code to teach GPT!

I think the code can be made more readable, and it becomes easier to digest and understand what's happening, by removing all the boilerplate via PyTorch Lightning.

In addition to removing the boilerplate (about 100+ lines), users get:

  • Half precision (Apex + native)
  • Single-GPU training
  • Multi-GPU training
  • Multi-TPU-core training
  • Automatic logging (TensorBoard)
  • Automatic checkpointing
  • 40+ other features
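To make the "removing boilerplate" point concrete, here is a tiny framework-free sketch (all names here are hypothetical illustrations, not the real Lightning or minGPT API): the user writes only the per-step logic, while a reusable trainer owns the loop, logging, and the rest of the boilerplate.

```python
# Sketch of the "hooks, not loops" idea behind Lightning-style training.
# All names are hypothetical illustrations, not the real PyTorch Lightning API.

class TinyModule:
    """User code: only the per-step logic, no loop boilerplate."""

    def __init__(self, lr=0.1):
        self.w = 5.0      # a single scalar "parameter"
        self.lr = lr

    def training_step(self, batch):
        # Loss is (w - target)^2; return the loss and its gradient w.r.t. w.
        target = batch
        loss = (self.w - target) ** 2
        grad = 2 * (self.w - target)
        return loss, grad

    def optimizer_step(self, grad):
        self.w -= self.lr * grad


class TinyTrainer:
    """Reusable code: the loop, logging, checkpointing, etc. live here."""

    def __init__(self, max_epochs=50):
        self.max_epochs = max_epochs
        self.logged = []

    def fit(self, module, dataloader):
        for epoch in range(self.max_epochs):
            for batch in dataloader:
                loss, grad = module.training_step(batch)
                module.optimizer_step(grad)
                self.logged.append(loss)   # stand-in for TensorBoard logging


module = TinyModule()
trainer = TinyTrainer()
trainer.fit(module, dataloader=[2.0])
print(round(module.w, 3))  # → 2.0 (converges to the target)
```

The split is the whole point: `TinyTrainer` is generic and shared, so each new model only contributes a `training_step`, which is what keeps user code short.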

Video of the final result and demo:

https://youtu.be/2aJFRQ-v6K8

williamFalcon avatar Aug 19 '20 16:08 williamFalcon

https://github.com/williamFalcon/minGPT has no trainer.py or train.py

aletote avatar Aug 21 '20 13:08 aletote

yup, happy to add it... but with Lightning you don't need the train.py file, because you can copy-paste the code from the Jupyter notebook into a script and run it (see the attached video).

williamFalcon avatar Aug 21 '20 13:08 williamFalcon

By the way, I am still making my way through the docs/code, but I actually like what Lightning is trying to do, and I think I will try to incrementally restructure the code to meet its API. At that point it will be trivial to either use the included trainer object for full explicit flexibility, or just the Lightning trainer. (Basically I'm trying to make the use of Lightning optional and neatly "factored out", if possible.)
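One common way to keep a dependency like Lightning optional and "factored out" is to detect it at import time and fall back to the repo's own trainer. A sketch with hypothetical names (`MinimalTrainer` and `get_trainer` are illustrations, not actual minGPT code):

```python
# Sketch of keeping a framework dependency optional: detect it at runtime
# and fall back to the built-in trainer. Names like `MinimalTrainer` are
# hypothetical, not from the actual minGPT code.
import importlib.util


class MinimalTrainer:
    """Explicit, self-contained training loop shipped with the repo."""
    name = "minimal"


def get_trainer(prefer_lightning=True):
    """Return a Lightning Trainer if installed and wanted, else the fallback."""
    if prefer_lightning and importlib.util.find_spec("pytorch_lightning"):
        import pytorch_lightning as pl
        return pl.Trainer()
    return MinimalTrainer()


trainer = get_trainer(prefer_lightning=False)
print(trainer.name)  # → minimal
```

With this pattern, users without Lightning installed lose nothing, and users who have it opt in with a flag rather than a code change.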

karpathy avatar Aug 23 '20 18:08 karpathy

awesome! btw, the code internals are currently going through a refactor we started a few weeks ago in preparation for 1.0. Our goal is to get the internals to read as close as possible to a simple, transparent loop. I agree that the mental shift from "my own loop" to the Lightning Trainer's loop should be as close to a 1:1 mapping as possible for pedagogical purposes.
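To illustrate that 1:1 mapping, here is a hand-written toy loop annotated with the Lightning hook each line corresponds to (the hook names `training_step`, `configure_optimizers`, and `optimizer_step` are real Lightning methods; the scalar "model" and the loop itself are just a sketch):

```python
# A hand-written loop, each line tagged with the Lightning hook it maps to.
# The scalar "model" is a toy; in practice the gradient comes from autograd.

w = 5.0                                    # model parameters
lr = 0.1                                   # configure_optimizers
for epoch in range(50):                    # Trainer.fit's epoch loop
    for target in [2.0]:                   # Trainer.fit's batch loop
        loss = (w - target) ** 2           # training_step
        grad = 2 * (w - target)            # backward
        w -= lr * grad                     # optimizer_step
print(round(w, 3))  # → 2.0
```

Reading it this way, moving to Lightning is mostly a matter of cutting each line out of the loop and pasting it into the hook named in its comment.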

Btw, below is a research-first repo (mostly RL and contrastive learning right now) that illustrates a ton of different use cases and implementations, all of which are readable and standardized because of the Lightning structure.

https://github.com/PyTorchLightning/pytorch-lightning-bolts/blob/master/pl_bolts/models/self_supervised/simclr/simclr_module.py

williamFalcon avatar Aug 23 '20 19:08 williamFalcon

cool, looking forward to 1.0!

karpathy avatar Aug 23 '20 20:08 karpathy