gpt2-pytorch
Extremely simple and understandable GPT2 implementation with minor tweaks
Advantages
- You can even train the subword tokenizer, which is useful for non-English languages.
- Fast, optimized code; a single RTX 2080 Ti card is enough.
- Easy to understand, solid code
- Easy to extend for new experiments
Supported extra features
- LAMB optimizer
- Mixed-precision training; the important layers are kept in fp32.
- Sinusoidal (sin/cos) positional encoding
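
To illustrate the sin/cos positional encoding listed above, here is a minimal sketch. It assumes the standard fixed formulation from the original Transformer paper and is written in plain Python so it runs without dependencies. The function name `sinusoidal_positions` and its parameters are hypothetical, and the repository's own PyTorch code may differ.

```python
import math
from typing import List


def sinusoidal_positions(max_len: int, d_model: int) -> List[List[float]]:
    """Return a max_len x d_model table of fixed sin/cos position encodings.

    Hypothetical helper for illustration; not taken from the repository.
    """
    if d_model % 2 != 0:
        raise ValueError("d_model must be even")
    table = []
    for pos in range(max_len):
        row = [0.0] * d_model
        for i in range(0, d_model, 2):
            # Each (sin, cos) pair shares one frequency; frequencies decay
            # geometrically as the dimension index i grows.
            angle = pos / (10000 ** (i / d_model))
            row[i] = math.sin(angle)      # even dimensions use sine
            row[i + 1] = math.cos(angle)  # odd dimensions use cosine
        table.append(row)
    return table


pe = sinusoidal_positions(max_len=4, d_model=8)
print(len(pe), len(pe[0]))
```

Because the table is fixed rather than learned, it adds no trainable parameters, and each row would typically be added to the token embedding at the same position.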