Out of Memory Error

Open slavik0329 opened this issue 5 years ago • 6 comments

I get the errors below when running gpt-2. The model does run in the end and seems to work, but is there any way to fix this?

Thanks!

2019-11-19 17:13:26.908876: E tensorflow/stream_executor/cuda/cuda_driver.cc:893] failed to alloc 4294967296 bytes on host: CUDA_ERROR_OUT_OF_MEMORY: out of memory

2019-11-19 17:13:26.913409: W .\tensorflow/core/common_runtime/gpu/gpu_host_allocator.h:44] could not allocate pinned host memory of size: 4294967296

2019-11-19 17:13:26.917233: E tensorflow/stream_executor/cuda/cuda_driver.cc:893] failed to alloc 3865470464 bytes on host: CUDA_ERROR_OUT_OF_MEMORY: out of memory

2019-11-19 17:13:26.921787: W .\tensorflow/core/common_runtime/gpu/gpu_host_allocator.h:44] could not allocate pinned host memory of size: 3865470464
2019-11-19 17:13:26.925608: E tensorflow/stream_executor/cuda/cuda_driver.cc:893] failed to alloc 3478923264 bytes on host: CUDA_ERROR_OUT_OF_MEMORY: out of memory

2019-11-19 17:13:26.930056: W .\tensorflow/core/common_runtime/gpu/gpu_host_allocator.h:44] could not allocate pinned host memory of size: 3478923264

slavik0329 avatar Nov 19 '19 22:11 slavik0329

You need more GPU memory.

Lower your batch size or, if it's already 1, you'll have to use CPU / switch to a smaller model.
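Another thing worth trying before giving up on the GPU: by default, TF 1.x (which this repo targets) pre-allocates large memory pools up front, which is what produces the failed-alloc messages above. You can ask it to grow allocations on demand instead. This is a generic TF 1.x sketch, not code from this repo, and it won't help if the model genuinely exceeds your VRAM:

```python
import tensorflow as tf

# TF 1.x: allocate GPU memory incrementally instead of grabbing large
# pools at startup. May reduce or eliminate the up-front allocation
# failures; it cannot conjure memory the card doesn't have.
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
sess = tf.Session(config=config)
```

Any script in the repo that creates its own `tf.Session` would need this config passed in at that point.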

LoganDark avatar Nov 20 '19 16:11 LoganDark

I'm using an RTX 2080 Ti. Does that make sense for this card?


slavik0329 avatar Nov 20 '19 16:11 slavik0329

In my opinion you should really be using a workstation card rather than a gaming card.

Personally I've always had to use CPU since I haven't been graced with the presence of the Nvidia Gods and my laptop has an AMD GPU. Dedicated, but it's AMD so TF doesn't like it.

CPU training works just fine (I get about one iteration every 30 seconds), it's just slower and tends to bring down the rest of your system too. :/

LoganDark avatar Nov 20 '19 16:11 LoganDark

I know this is old, but I want to point out that you can run GPT-2 inference on an RTX 2080 Ti in Linux... but not Windows. The reason is that the WDDM 2 driver on Windows reserves a portion of the VRAM for display, *even if you have no monitors hooked up to the card*. This is out of your control, and you can't switch to the compute-only (TCC) drivers on gaming cards. On the 2080 Ti, the reservation ends up being ~1 GB of VRAM. On Linux, a much smaller amount is reserved, depending on what you actually have running. The largest GPT-2 model barely fits in an 11 GB card, so that 1 GB of reserved VRAM on Windows puts it over the edge.
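A quick back-of-envelope check illustrates why it's so tight. The figures below are rough assumptions (fp32 weights only; activations and CUDA workspace come on top), not measurements:

```python
# Rough VRAM estimate for GPT-2 1558M in fp32.
params = 1_558_000_000          # parameter count of the largest model
weights_gib = params * 4 / 2**30  # 4 bytes per fp32 parameter
card_gib = 11.0                 # e.g. an RTX 2080 Ti
windows_reserved_gib = 1.0      # approximate WDDM reservation described above

print(f"weights alone: {weights_gib:.1f} GiB")  # ~5.8 GiB
print(f"headroom on Windows: {card_gib - windows_reserved_gib - weights_gib:.1f} GiB")
```

With activations, the TF runtime, and per-layer buffers on top of the ~5.8 GiB of weights, losing an extra gigabyte to the display driver can be exactly the margin that tips it into OOM.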

Teravus avatar Jul 29 '20 03:07 Teravus

Is it possible to somehow increase GPU memory? There's plenty of free RAM! I'm trying to train the 1558M model on an RTX 2080 Ti, but I keep getting out-of-memory errors.

sowich avatar Dec 10 '20 07:12 sowich

@sowich GPU memory is built into your GPU and can't be upgraded. If you need more, your only options are to purchase a GPU with more memory, or to purchase a second GPU and split the work across both (note that SLI doesn't pool VRAM for compute workloads; each card still needs to fit its share). Your RAM is used by your CPU. If you run on your CPU instead of your GPU, you'll likely see a large spike in RAM usage.
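If you do fall back to the CPU, the usual trick is to hide the GPU from TensorFlow entirely via an environment variable. This is a generic CUDA mechanism, not something specific to this repo, and it must happen before TensorFlow is imported:

```python
import os

# Hide all CUDA devices from any CUDA-aware library loaded afterwards.
# Setting this after TensorFlow has already been imported has no effect.
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"
```

Equivalently, set it on the command line (`CUDA_VISIBLE_DEVICES=-1 python train.py ...`) so no code changes are needed.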

Wyldhunt avatar Dec 21 '20 11:12 Wyldhunt