nanoGPT
Can my RTX 3060 train this model? I don't have a good graphics card.
It can train a model, probably something in the 10-15M parameter range, without running out of memory.
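For a sense of what a model that size looks like in nanoGPT, here is a rough config sketch along the lines of the repo's `config/train_shakespeare_char.py`, which defines a roughly 10M parameter model. The values are approximate and the VRAM fit on an RTX 3060 is an assumption, not a measured result:

```python
# Sketch of a nanoGPT-style config for a ~10M parameter model.
# Keys match the plain-Python config files read by nanoGPT's configurator;
# dimensions follow the repo's shakespeare_char example (~10.6M params).
out_dir = 'out-small'
eval_interval = 250
eval_iters = 200

dataset = 'shakespeare_char'
gradient_accumulation_steps = 1
batch_size = 64
block_size = 256      # context length

# model dimensions -> roughly 10M parameters
n_layer = 6
n_head = 6
n_embd = 384
dropout = 0.2

learning_rate = 1e-3
max_iters = 5000
lr_decay_iters = 5000
min_lr = 1e-4
warmup_iters = 100
```

You would pass a file like this to training with `python train.py config/your_config.py`; shrinking `batch_size` or `block_size` further is the usual lever if it still doesn't fit.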
I've trained GPT-2 from scratch on a single RTX 4070 16GB card with no issues. It takes a few days, and you have to tune the training configuration to keep the footprint small. It used around 9GB of VRAM with batch_size=4 and gradient_accumulation_steps=4.
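As a sketch, the overrides above could live in a small config file on top of the repo's `config/train_gpt2.py` defaults. The key names are existing nanoGPT config variables; the specific values and the ~9GB VRAM figure are just what was reported above, not a guarantee:

```python
# Sketch: GPT-2 124M training config scaled down for a single consumer GPU,
# using the batch settings reported in this thread (batch 4, grad accum 4).
dataset = 'openwebtext'

# GPT-2 124M dimensions
n_layer = 12
n_head = 12
n_embd = 768
block_size = 1024

# smaller per-step memory footprint; effective batch = 4 * 4 sequences
batch_size = 4
gradient_accumulation_steps = 4
```

Equivalently, nanoGPT's configurator lets you override these from the command line, e.g. `python train.py config/train_gpt2.py --batch_size=4 --gradient_accumulation_steps=4`.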
| model | params | train loss | val loss |
| ----- | ------ | ---------- | -------- |
| gpt2  | 124M   | 3.27       | 3.27     |