
Just a question

Open jpbruneton opened this issue 3 years ago • 2 comments

If I understand correctly, you run at most 600,000 iterations with batches of 12, which is roughly 7M training examples fed to the transformer, far smaller than the 9B tokens of the training set. I must be missing something? Thanks

jpbruneton avatar Jan 19 '23 11:01 jpbruneton

Each batch has 12 * 1024 tokens, because 1024 is the block size. All of those tokens get trained on in parallel.

karpathy avatar Jan 19 '23 16:01 karpathy
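For reference, the arithmetic behind the answer can be sketched as follows. This is a quick back-of-the-envelope check using only the numbers quoted in this thread (600,000 iterations, batch size 12, block size 1024), not code taken from the nanoGPT source:

```python
# Numbers quoted in this thread (assumed, not read from nanoGPT's config files).
iterations = 600_000
batch_size = 12
block_size = 1024  # tokens per sequence; every position is a training target

# Tokens consumed per optimization step: each sequence in the batch
# contributes block_size tokens, all trained on in parallel.
tokens_per_iteration = batch_size * block_size

# Total tokens seen over the full run.
total_tokens = iterations * tokens_per_iteration

print(f"{tokens_per_iteration:,} tokens per iteration")
print(f"{total_tokens:,} tokens total")  # ~7.4B, comparable to the ~9B-token dataset
```

So counting tokens rather than sequences, the run covers roughly 7.4B of the ~9B training tokens, i.e. on the order of one epoch.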

Good, thank you

layerkugou avatar Jan 25 '23 09:01 layerkugou