nanoGPT

Can my RTX 3060 train the model? I don't have a better graphics card

Open super117893344p opened this issue 1 year ago • 2 comments

super117893344p avatar Feb 18 '24 13:02 super117893344p

It can train a model, probably something like 10-15M parameters, without running out of memory (OOM)

VatsaDev avatar Feb 18 '24 23:02 VatsaDev

I've trained GPT-2 from scratch on a single RTX 4070 16GB card with no issues. It takes a few days, and you have to tune the training configuration to keep it small. Around 9GB VRAM with BatchSize=4 and GradAcc=4

model    params    train loss    val loss
gpt2     124M      3.27          3.27

bigsnarfdude avatar Mar 06 '24 14:03 bigsnarfdude
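
For anyone sizing a run on a similar card, here is a rough back-of-the-envelope sketch of the numbers quoted above. It is only an estimate under stated assumptions: the helper names are hypothetical (not part of nanoGPT), the context length is assumed to be 1024 tokens (the thread does not say), the layers are assumed to use bias terms with tied input/output embeddings, and weights, gradients and both AdamW moments are assumed to be stored in fp32. Activation memory is not modeled, and that is what batch size and context length mostly affect.

```python
def gpt2_param_count(n_layer, n_embd, vocab_size=50257, block_size=1024):
    """Count parameters of a GPT-2 style decoder (tied embeddings, biases on)."""
    # Token embedding table plus learned position embeddings.
    embeddings = vocab_size * n_embd + block_size * n_embd
    # Attention: fused QKV projection, then output projection (weights + biases).
    attn = (n_embd * 3 * n_embd + 3 * n_embd) + (n_embd * n_embd + n_embd)
    # MLP: expand to 4x width, then project back (weights + biases).
    mlp = (n_embd * 4 * n_embd + 4 * n_embd) + (4 * n_embd * n_embd + n_embd)
    # Two LayerNorms per block, each with a scale and a shift vector.
    layer_norms = 2 * (2 * n_embd)
    final_ln = 2 * n_embd
    return embeddings + n_layer * (attn + mlp + layer_norms) + final_ln


def tokens_per_step(batch_size, grad_accum, block_size=1024):
    """Tokens consumed per optimizer step with gradient accumulation."""
    return batch_size * grad_accum * block_size


def training_state_bytes(n_params, bytes_per_value=4):
    """Weights + gradients + two AdamW moment buffers (assumes fp32 for all)."""
    return 4 * n_params * bytes_per_value


if __name__ == "__main__":
    n_params = gpt2_param_count(n_layer=12, n_embd=768)  # GPT-2 small shape
    print(f"parameters:        {n_params / 1e6:.2f}M")
    print(f"tokens per step:   {tokens_per_step(4, 4)}")
    print(f"state (no activations): {training_state_bytes(n_params) / 2**30:.2f} GiB")
```

Under these assumptions the weights, gradients and optimizer state account for roughly 2 GiB, so most of the reported ~9GB would be activations and framework overhead. Lowering the batch size (and raising gradient accumulation to compensate) shrinks the activation part without changing the tokens seen per optimizer step.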