text-generation-webui
Has anyone run it in Colab with LLaMA?
Try this: https://colab.research.google.com/drive/1UJnzwI6uCScDFjmHNGQ8LZcmRT70mgwU?usp=sharing
It's awesome that even the 13B model can be run in Colab. However, the context window is pretty limited: I get an OutOfMemoryError at 314 words.
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 14.75 GiB total capacity; 13.50 GiB already allocated; 18.81 MiB free; 13.71 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
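The traceback's own suggestion is worth trying: set max_split_size_mb via PYTORCH_CUDA_ALLOC_CONF before anything touches CUDA. A minimal sketch; the 128 MiB value is an assumption, not a tuned recommendation:

```python
import os

# Must be set before the first CUDA allocation (i.e. before the model is loaded).
# 128 MiB is illustrative; smaller splits reduce fragmentation at some speed cost.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

import torch  # imported afterwards so the allocator is guaranteed to see it
```

This only mitigates fragmentation; it won't help if the weights plus KV cache genuinely exceed the ~15 GiB on the Colab GPU.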
I'm not able to run this. It runs out of memory and dies while loading the model. The Colab instance I'm using has 12.7 GB of RAM.
Is there a way to shrink RAM usage with a 4-bit version? Or do I just use Colab Pro or spin up an instance on GCP?
Update: Tried Colab Pro with 25.5 GB of RAM and ran into the same issue on startup.
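A startup crash like this is often system RAM, not GPU memory: by default transformers materializes the checkpoint in fp32 before moving it, which for 13B is far more than 25.5 GB. A hedged sketch of fp16 low-memory loading (the checkpoint name is illustrative, and low_cpu_mem_usage may require accelerate to be installed):

```python
import torch
from transformers import AutoModelForCausalLM

# Load weights shard-by-shard in fp16 instead of building a full fp32 copy
# in system RAM first, which is a plausible cause of the startup OOM here.
model = AutoModelForCausalLM.from_pretrained(
    "huggyllama/llama-13b",      # illustrative checkpoint, not the notebook's
    torch_dtype=torch.float16,   # roughly halves load-time memory vs default fp32
    low_cpu_mem_usage=True,      # stream shards rather than materializing everything
)
```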
Is this a quantized 4-bit version or the original?
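For anyone hitting the same wall: loading in 4-bit via bitsandbytes is one way to shrink memory use. A minimal sketch, assuming a recent transformers with bitsandbytes and accelerate installed; the checkpoint name is illustrative:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_name = "huggyllama/llama-13b"  # illustrative checkpoint

# Quantize the weights to 4-bit at load time, cutting GPU memory roughly 4x vs fp16.
bnb_config = BitsAndBytesConfig(load_in_4bit=True)

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",  # let accelerate place layers across the available GPU/CPU
)
```

device_map="auto" also enables the same low-memory loading path as low_cpu_mem_usage, so it may help with the startup crash above as well; I haven't verified this in Colab.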
This issue has been closed due to inactivity for 30 days. If you believe it is still relevant, please leave a comment below.