nanoGPT
why batch size = 480 instead of 512 as in the GPT-2 paper?
Hi! The batch size of nanoGPT is `batch_size * gradient_accumulation_steps = 12 * 40 = 480`. The batch size mentioned in the GPT-2 paper is 512. May I ask why nanoGPT was trained with a slightly smaller batch size?
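For concreteness, here is a minimal sketch of where the 480 comes from, assuming the values in nanoGPT's `config/train_gpt2.py` (this is illustrative arithmetic, not a verbatim copy of that file):

```python
# Effective batch size arithmetic, assuming nanoGPT's GPT-2 training config.
batch_size = 12                      # micro-batch: sequences per forward/backward pass
gradient_accumulation_steps = 5 * 8  # 5 micro-steps per GPU * 8 GPUs

# Sequences contributing to one optimizer step.
effective_batch_size = batch_size * gradient_accumulation_steps
print(effective_batch_size)  # 480

# Tokens per optimizer step (block_size = 1024 is the GPT-2 context length).
block_size = 1024
print(effective_batch_size * block_size)  # 491520
```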
Reference: page 4 of "Language Models are Unsupervised Multitask Learners" (the GPT-2 paper)