
why batch size = 480 instead of 512 as in the GPT-2 paper?

Open shehper opened this issue 9 months ago • 0 comments

Hi! The effective batch size of nanoGPT is batch_size × gradient_accumulation_steps = 12 × 40 = 480. The batch size mentioned in the GPT-2 paper is 512. May I ask why nanoGPT was trained with a slightly smaller batch size?

Reference: Page 4 of Language Models are Unsupervised Multitask Learners
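For context, a minimal sketch of how the effective batch size arises from the values in nanoGPT's GPT-2 training config (the variable names follow train.py; the exact split of the accumulation factor into 5 × 8 is an assumption about the multi-GPU setup):

```python
# Sketch of nanoGPT's effective batch size calculation (not the actual training script).
batch_size = 12                      # micro-batch size per forward/backward pass
gradient_accumulation_steps = 5 * 8  # = 40; gradients are accumulated over this many
                                     # micro-batches before each optimizer step
effective_batch_size = batch_size * gradient_accumulation_steps
print(effective_batch_size)  # 480 sequences per optimizer step, vs. 512 in the GPT-2 paper
```

Gradient accumulation lets a large effective batch fit on limited GPU memory: each optimizer update sees 480 sequences even though only 12 are processed at a time.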

shehper avatar Sep 28 '23 02:09 shehper