Getting CUDA OutOfMemoryError

Open groundswel opened this issue 1 year ago • 3 comments

I get an error when trying to use the model on an ml.g4dn.4xlarge instance.

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 592.00 MiB (GPU 0; 14.62 GiB total capacity; 14.33 GiB already allocated; 175.94 MiB free; 14.33 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

I am using the script in the README QuickStart section.

groundswel avatar Apr 20 '23 23:04 groundswel

What GPU do you have?

srilokhkaruturi avatar Apr 20 '23 23:04 srilokhkaruturi

You need at least 27GB of GPU memory for the 7B parameter model (around 10GB for the 3B one)

dariocazzani avatar Apr 20 '23 23:04 dariocazzani
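For anyone wondering where figures like 27GB come from, here is a rough back-of-the-envelope sketch. It assumes nominal parameter counts (7e9 and 3e9) and counts the weights only; activations, the generation cache, and CUDA overhead come on top, so treat the results as lower bounds rather than exact requirements. The function name and the precision labels are made up for illustration.

```python
# Rough estimate of GPU memory needed just to hold model weights.
# Assumption: nominal parameter counts; real checkpoints may differ somewhat.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_memory_gib(num_params: float, precision: str) -> float:
    """Return the size of the weights in GiB for a given storage precision."""
    return num_params * BYTES_PER_PARAM[precision] / 2**30


if __name__ == "__main__":
    for label, num_params in (("7B", 7e9), ("3B", 3e9)):
        for precision in BYTES_PER_PARAM:
            gib = weight_memory_gib(num_params, precision)
            print(f"{label} @ {precision}: {gib:.1f} GiB (weights only)")
```

At full 32-bit precision the 7B weights alone come to roughly 26 GiB, which does not fit in the 14.62 GiB the error message reports for this GPU. Halving the precision halves the footprint.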

See #17

enricoros avatar Apr 21 '23 02:04 enricoros

You can load the model in 16-bit or 8-bit. If you know how to work with Python, it shouldn't be too hard. If not, there are projects like https://github.com/oobabooga/text-generation-webui that can handle this for you.

The official notebook also has a load_in_8bit checkbox.

mcmonkey4eva avatar Apr 24 '23 17:04 mcmonkey4eva
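A minimal sketch of the reduced-precision loading suggested above, using Hugging Face `transformers`. Assumptions: the model ID `stabilityai/stablelm-tuned-alpha-7b` is the checkpoint you want, `torch` and `transformers` are installed, and for 8-bit you also have `accelerate` and `bitsandbytes`. The helper name `load_stablelm` is hypothetical, not something from the repo. This is not necessarily how the official notebook does it, and even in 16-bit the 7B model may be a tight fit on a 16GB card.

```python
def load_stablelm(model_name: str = "stabilityai/stablelm-tuned-alpha-7b",
                  precision: str = "fp16"):
    """Load tokenizer and model in reduced precision (hypothetical helper)."""
    if precision not in ("fp16", "int8"):
        raise ValueError(f"unsupported precision: {precision!r}")

    # Heavy imports are deferred so the check above runs without them.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)

    if precision == "fp16":
        # 16-bit weights: about half the memory of the default 32-bit load.
        model = AutoModelForCausalLM.from_pretrained(
            model_name, torch_dtype=torch.float16
        )
        model = model.to("cuda")
    else:
        # 8-bit weights via bitsandbytes; device_map="auto" needs accelerate.
        from transformers import BitsAndBytesConfig

        model = AutoModelForCausalLM.from_pretrained(
            model_name,
            quantization_config=BitsAndBytesConfig(load_in_8bit=True),
            device_map="auto",
        )
    return tokenizer, model


if __name__ == "__main__":
    tokenizer, model = load_stablelm(precision="int8")
    inputs = tokenizer("Hello, StableLM!", return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=32)
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

Older `transformers` releases also accept `load_in_8bit=True` directly in `from_pretrained`, which is what the notebook checkbox name refers to. The `BitsAndBytesConfig` form is used here because it is the one newer releases favor.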