
Out of memory (OOM) ResourceExhaustedError for 1558m model

Open Franceshe opened this issue 4 years ago • 3 comments

Background:

Using the gpt-2-simple Colab notebook, https://colab.research.google.com/drive/1VLG8e7YSEwypxU-noRNhsv5dW4NfTGce, I am trying to download the 1558M version of the GPT-2 model and generate text with it. After running the following cell:

gpt2.generate_to_file(sess, destination_path=gen_file_txt, model_name=model_name, prefix="Are we live in the simulation", length=200, temperature=0.9, nsamples=100, batch_size=20)

it fails with the following error:

ResourceExhaustedError: 2 root error(s) found. (0) Resource exhausted: OOM when allocating tensor with shape[20,48,2,25,201,64] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc

My suspicion:

I suspect this is an out-of-memory (OOM) error. I have already asked Colab to give me 25 GB of RAM, following the suggestion in https://towardsdatascience.com/upgrade-your-memory-on-google-colab-for-free-1b8b18e8791d. Has anyone run into this error, or has anyone managed to run the 1558M GPT-2 model on Colab? If so, what was your setup? The smaller model (774M) works fine; I just wonder whether running the 1558M model on Colab is feasible at all given the RAM constraint. If it is, which hyperparameters should I tune, maybe batch_size? Thanks!
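For concreteness, here is a sketch of the lower-memory call I have in mind: the same arguments as above, but with batch_size=1 so that only one sample's activations sit on the GPU at a time (the download/load boilerplate and the output path are just examples):

```python
import gpt_2_simple as gpt2

model_name = "1558M"
gen_file_txt = "gpt2_gentext.txt"  # example output path

# Download the checkpoint (skipped if already present) and load it.
gpt2.download_gpt2(model_name=model_name)
sess = gpt2.start_tf_sess()
gpt2.load_gpt2(sess, model_name=model_name)

# Same generation settings as above, but batch_size=1 so only one
# sample is generated at a time; nsamples must be divisible by
# batch_size, so the 100 samples are simply produced sequentially.
gpt2.generate_to_file(sess,
                      destination_path=gen_file_txt,
                      model_name=model_name,
                      prefix="Are we live in the simulation",
                      length=200,
                      temperature=0.9,
                      nsamples=100,
                      batch_size=1)
```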

Franceshe avatar Mar 15 '20 15:03 Franceshe

It's not RAM, it's GPU memory.

The 1558M (1.5B) model will only run on a P100 or better GPU with 16 GB of VRAM.

designgrande avatar Mar 16 '20 06:03 designgrande

> It's not RAM, it's GPU memory.
>
> The 1558M (1.5B) model will only run on a P100 or better GPU with 16 GB of VRAM.

Oh, I see. I was using Colab's T4 with 16 GB of memory or a K80 with 12 GB.
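For anyone comparing setups: you can check which GPU (and how much VRAM) Colab assigned you with a plain nvidia-smi cell; nothing here is specific to gpt-2-simple:

```python
# Run in a Colab cell to see the assigned GPU model and its memory
# (e.g. K80 ~12 GB, T4 ~16 GB, P100 ~16 GB).
!nvidia-smi
```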

Franceshe avatar Mar 16 '20 10:03 Franceshe

> It's not RAM, it's GPU memory.
>
> The 1558M (1.5B) model will only run on a P100 or better GPU with 16 GB of VRAM.

FYI: running on a Colab Pro P100, I was unable to train the 1.5B model (the 774M works OK).
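For reference, the kind of 774M fine-tuning call that fits on the P100 for me looks roughly like this (the dataset path and step count are placeholders; batch_size stays at 1 to keep per-step activation memory low):

```python
import gpt_2_simple as gpt2

file_name = "my_dataset.txt"  # placeholder dataset path
model_name = "774M"

gpt2.download_gpt2(model_name=model_name)

sess = gpt2.start_tf_sess()
gpt2.finetune(sess,
              dataset=file_name,
              model_name=model_name,
              steps=1000,       # placeholder step count
              batch_size=1,     # keep per-step activation memory low
              sample_every=200,
              save_every=500)
```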

timohear avatar Apr 19 '20 16:04 timohear