nanoGPT
Just a question
If I understand correctly, training runs for at most 600,000 iterations with a batch size of 12, which comes to roughly 7.2M training examples fed to the transformer. That is far smaller than the 9B tokens in the training set. Surely I am missing something? Thanks
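
To make the arithmetic in the question concrete, here is a rough sketch that counts sequences and tokens separately. Only the 600,000 iterations, the batch size of 12, and the 9B-token dataset size come from the post. The `block_size` of 1024 tokens per example is an assumption, and `grad_accum_steps` and `num_gpus` are hypothetical knobs left at 1, so the actual training setup may differ.

```python
def training_volume(max_iters, batch_size, block_size,
                    grad_accum_steps=1, num_gpus=1):
    """Return (sequences, tokens) processed over a full training run.

    Each training example is a sequence of `block_size` tokens, so the
    token count is the sequence count multiplied by `block_size`.
    """
    sequences = max_iters * batch_size * grad_accum_steps * num_gpus
    tokens = sequences * block_size
    return sequences, tokens


# Figures from the question: 600,000 iterations, batch size 12.
# block_size=1024 is an ASSUMPTION (tokens per example), not stated in the post.
sequences, tokens = training_volume(max_iters=600_000, batch_size=12,
                                    block_size=1024)

dataset_tokens = 9_000_000_000  # "9B tokens" as quoted in the question

print(f"sequences seen: {sequences:,}")
print(f"tokens seen:    {tokens:,}")
print(f"fraction of dataset: {tokens / dataset_tokens:.2f}")
```

Under these assumptions, the 7.2M figure counts sequences rather than tokens, so it is not directly comparable to the 9B-token dataset size. Any gradient accumulation or multi-GPU training would multiply the total further.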