
Out of memory with RTX3090

Open PyxAI opened this issue 2 years ago • 4 comments

Hi, I'm trying to train gpt2-xl but keep getting OOM, even with batch size set to 1, gradient_accumulation at 8/16/512, contiguous_gradients false, and allgather_bucket_size / reduce_bucket_size at 2e2. I can see in nvidia-smi that I'm only reaching about half the memory capacity, around 12 GB.

My system: RTX 3090 with 24 GB VRAM, 80 GB RAM, a 5600X CPU (if that matters), running WSL2 on Windows 10.

Thanks.
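For anyone else landing here: finetune-gpt2xl trains through DeepSpeed, and OOM at gpt2-xl scale is usually addressed by enabling ZeRO with optimizer offload to CPU rather than by shrinking the bucket sizes (values as small as 2e2 mostly just slow down communication). A minimal sketch of the relevant part of a `ds_config.json` — the field names are standard DeepSpeed ZeRO options, but the specific values here are illustrative guesses for a 24 GB card, not settings verified against this issue:

```json
{
  "zero_optimization": {
    "stage": 2,
    "offload_optimizer": { "device": "cpu", "pin_memory": true },
    "contiguous_gradients": true,
    "overlap_comm": true,
    "allgather_bucket_size": 2e8,
    "reduce_bucket_size": 2e8
  },
  "fp16": { "enabled": true },
  "train_micro_batch_size_per_gpu": 1,
  "gradient_accumulation_steps": 8
}
```

With optimizer state offloaded, the Adam states live in host RAM (which also explains why 80 GB of system memory matters here), while the GPU holds the fp16 weights, gradients, and activations.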

PyxAI avatar Jan 12 '22 22:01 PyxAI

So working with WSL is just a no-go. I installed Ubuntu as a dual boot and now the problem has disappeared.
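One possible explanation, for anyone who still wants to try under WSL2: by default WSL2 caps the Linux VM at a fraction of the host's RAM, which can starve DeepSpeed's CPU offload even though nvidia-smi shows GPU headroom. The cap can be raised via a `.wslconfig` file in the Windows user profile — a sketch with illustrative values for an 80 GB machine, not verified to fix this particular issue:

```ini
; %UserProfile%\.wslconfig  (run `wsl --shutdown` afterwards so it takes effect)
[wsl2]
memory=64GB   ; raise the RAM ceiling available to the WSL2 VM
swap=32GB     ; extra swap gives ZeRO CPU offload room to spill
```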

PyxAI avatar Jan 14 '22 17:01 PyxAI

Dual boot only, huh... that sucks. I was really hoping I could use this on Win10 or WSL(2).

BrandonKoerner avatar May 10 '22 00:05 BrandonKoerner

I was, however, able to run the model under WSL2 on Windows 11. I didn't check training; it's worth a shot @ReewassSquared

PyxAI avatar May 17 '22 17:05 PyxAI

Hi @PyxAI. Which Ubuntu version did you run this code on?

uahmad235 avatar Sep 08 '22 09:09 uahmad235