
Has anyone run it in Colab with LLaMA?

Open mkygogo opened this issue 1 year ago • 4 comments

mkygogo avatar Mar 06 '23 02:03 mkygogo

https://colab.research.google.com/drive/1UJnzwI6uCScDFjmHNGQ8LZcmRT70mgwU?usp=sharing try this

legekka avatar Mar 06 '23 05:03 legekka

It's awesome that even the 13B model can be run in Colab. However, the context window is pretty limited: I get an OutOfMemoryError at around 314 words.

```
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 14.75 GiB total capacity; 13.50 GiB already allocated; 18.81 MiB free; 13.71 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
```
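Since reserved memory (13.71 GiB) is well above the 20 MiB allocation that failed, the fragmentation workaround the error message suggests may be worth trying. A minimal sketch (the `128` value is an arbitrary starting point, not a recommendation from this repo):

```python
import os

# PYTORCH_CUDA_ALLOC_CONF must be set before PyTorch initializes CUDA,
# so run this at the top of the Colab notebook, before `import torch`.
# max_split_size_mb caps how large a cached block the allocator will split,
# which can reduce fragmentation when reserved memory far exceeds allocated.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

print(os.environ["PYTORCH_CUDA_ALLOC_CONF"])
```

After setting this, import torch and load the model as before. It won't create memory that isn't there, but it can let more of the already-reserved pool be reused.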

ArakiSatoshi avatar Mar 07 '23 00:03 ArakiSatoshi

I'm not able to run this. It dies with an out-of-memory error while loading the model. The Colab instance I'm using has 12.7 GB of RAM.

Is there a way to shrink RAM usage with a 4-bit version? Or do I just use Colab Pro or import an instance from GCP?

Update: tried the Pro version of Colab with 25.5 GB of RAM and ran into the same issue on startup.
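A rough back-of-the-envelope calculation (my own sketch; the helper name and the 13B parameter count are illustrative, and it ignores activations, the KV cache, and quantization overhead such as scales) shows why a 4-bit version would help:

```python
def model_weight_gib(n_params: float, bits_per_weight: float) -> float:
    # Memory for the weights alone: parameters * bits, converted to GiB.
    return n_params * bits_per_weight / 8 / 1024**3

n = 13e9  # approximate LLaMA-13B parameter count
print(f"fp16:  {model_weight_gib(n, 16):.1f} GiB")  # ~24.2 GiB
print(f"8-bit: {model_weight_gib(n, 8):.1f} GiB")   # ~12.1 GiB
print(f"4-bit: {model_weight_gib(n, 4):.1f} GiB")   # ~6.1 GiB
```

So fp16 weights alone exceed both the 12.7 GB free instance and, with loading overhead, can strain the 25.5 GB Pro instance, while a 4-bit checkpoint would fit comfortably on a standard Colab GPU.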

Eyon42 avatar Mar 11 '23 23:03 Eyon42

Is this a quantized 4-bit version or the original?

samrahimi avatar Mar 14 '23 05:03 samrahimi

This issue has been closed due to inactivity for 30 days. If you believe it is still relevant, please leave a comment below.

github-actions[bot] avatar Apr 13 '23 16:04 github-actions[bot]