
How do I set a VRAM split with AutoGPTQ for multiple GPUs

Open · MsJamie opened this issue 1 year ago · 0 comments

I'm running a pair of RTX 3060 cards on one machine, and an RTX 3060 plus an RTX 3080 in my desktop. On the desktop I need to set a split like 6,12, but the current max_gpu_mem_GB option only accepts a value for a single card.

On my desktop, when trying to load a 13B model, I see the following error because the loader ignores the second GPU, which has 12 GB free.

Couldn't build model: [CUDA out of memory. Tried to allocate 314.00 MiB (GPU 0; 9.77 GiB total capacity; 6.84 GiB already allocated; 61.44 MiB free; 7.05 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF]
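For reference, here is a minimal sketch of what I'd like to be able to express, assuming the backend would pass a per-GPU max_memory mapping through to AutoGPTQ/accelerate (the model repo name below is just an example):

```python
# Sketch only: assumes max_gpu_mem_GB could be turned into an
# accelerate-style max_memory dict (GPU index -> VRAM budget).
from auto_gptq import AutoGPTQForCausalLM

model = AutoGPTQForCausalLM.from_quantized(
    "TheBloke/Llama-2-13B-GPTQ",        # illustrative model path
    device_map="auto",                   # let accelerate spread layers across GPUs
    max_memory={0: "6GiB", 1: "12GiB"},  # the "6,12" split I need on my desktop
    use_safetensors=True,
)
```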

Oh, and not related to my question:

When the Twitter icon was updated, the GitHub and YouTube icons were both left with the same description, "Visit repository page".

MsJamie · Aug 02 '23 06:08