nanoGPT
Implement RoPE positional encodings
This PR implements RoPE (rotary positional encodings) and adjusts the training script for the Shakespeare dataset.
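For readers unfamiliar with RoPE, here is a minimal sketch of the idea in plain Python. It is not the code from this PR: the function names `rope_rotate` and `dot`, the interleaved pairing of dimensions, and the `base=10000.0` default are assumptions chosen for illustration. Each consecutive pair of dimensions is rotated by an angle proportional to the token position, so the query-key dot product ends up depending only on the relative offset between positions.

```python
import math


def rope_rotate(x, pos, base=10000.0):
    """Rotate each consecutive pair (x[2i], x[2i+1]) by the angle pos * base**(-2i/d).

    Illustrative only: assumes an even-length vector and interleaved pairing.
    """
    d = len(x)
    assert d % 2 == 0, "RoPE needs an even number of dimensions"
    out = [0.0] * d
    for i in range(d // 2):
        theta = pos * base ** (-2.0 * i / d)  # per-pair rotation angle
        c, s = math.cos(theta), math.sin(theta)
        a, b = x[2 * i], x[2 * i + 1]
        out[2 * i] = a * c - b * s
        out[2 * i + 1] = a * s + b * c
    return out


def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


# Hypothetical query and key vectors for a single attention head.
q = [0.5, -1.0, 2.0, 0.25]
k = [1.5, 0.3, -0.7, 1.0]

# Both pairs of positions have the same offset (3), so the scores should match.
score_near = dot(rope_rotate(q, 5), rope_rotate(k, 2))
score_far = dot(rope_rotate(q, 13), rope_rotate(k, 10))
```

In a model, the same rotation would be applied to the query and key tensors inside attention before the scores are computed, typically in place of a learned position embedding table.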
Nice work. Curious what loss you got with this?