lollms-webui
How do I set a VRAM split with AutoGPTQ for multiple GPUs
I'm running a pair of RTX 3060 cards in one machine, and an RTX 3060 plus an RTX 3080 in my desktop. I need to set a split like 6,12, but the current max_gpu_mem_GB setting only takes a single value for one card.
On my desktop, when trying to load a 13B model, I see the following error because it ignores the other GPU, which has 12 GB free.
Couldn't build model: [CUDA out of memory. Tried to allocate 314.00 MiB (GPU 0; 9.77 GiB total capacity; 6.84 GiB already allocated; 61.44 MiB free; 7.05 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF]
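For reference, this is roughly what I'd want the loader to do under the hood: a rough sketch, assuming AutoGPTQ's from_quantized can take an accelerate-style max_memory dict (the parsing of a "6,12" setting and the model path are just illustrative, not the webui's actual config names):

```python
from auto_gptq import AutoGPTQForCausalLM

# Hypothetical: a "6,12" value from the settings page parsed into a
# per-device memory cap instead of a single max_gpu_mem_GB number.
vram_split_gb = [6, 12]
max_memory = {i: f"{gb}GiB" for i, gb in enumerate(vram_split_gb)}
max_memory["cpu"] = "16GiB"  # optional headroom for CPU offload

model = AutoGPTQForCausalLM.from_quantized(
    "path/to/13b-gptq-model",   # illustrative path
    device_map="auto",          # let accelerate spread layers across both cards
    max_memory=max_memory,      # e.g. {0: "6GiB", 1: "12GiB", "cpu": "16GiB"}
    use_safetensors=True,
)
```

With something like that, the 13B model could spill onto the second card instead of OOMing on GPU 0.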
Oh, and not related to my question: when the Twitter icon was updated, the GitHub and YouTube icons were both left with the same description, "Visit repository page".