
Specify which GPU to use

bradstcroix opened this issue 1 year ago · 3 comments

Hey Ooba,

I have a system with 2 GPUs. Even when not using the --auto-devices flag, the model still gets split 50/50 between the two GPUs, causing very slow performance.

Is there a way to specify which GPU to use for loading the model?

Bonus question: is there a way to specify which GPU to use for the LLM while keeping the second GPU available for extensions (in my case, BarkTTS)?

Thanks!

bradstcroix avatar May 10 '23 07:05 bradstcroix
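A common process-level way to do this (not specific to text-generation-webui) is to restrict which devices CUDA can see before the process starts, e.g. `CUDA_VISIBLE_DEVICES=0 python server.py` from a shell. A minimal launch-wrapper sketch of the same idea, assuming the model should load on GPU 0:

```python
# Launch wrapper sketch: restrict the server process to a single GPU.
# Equivalent to running `CUDA_VISIBLE_DEVICES=0 python server.py`.
import os
import subprocess

# The device index "0" is illustrative; use the index of the GPU
# you want the model on (as reported by nvidia-smi).
env = dict(os.environ, CUDA_VISIBLE_DEVICES="0")
subprocess.run(["python", "server.py"], env=env, check=True)
```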

Thanks, yeah I gave CUDA_VISIBLE_DEVICES a go by setting it early in server.py, and it works as far as limiting everything to one GPU. But I'm guessing it must then limit the whole session to that GPU, including any extensions. I wasn't able to get the Bark extension to use the other GPU by setting CUDA_VISIBLE_DEVICES in the extension's script.py.
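For reference, a minimal sketch of that approach (the placement is the important part; the device index is illustrative):

```python
# Very top of server.py, before torch or anything else that can
# initialize CUDA is imported: CUDA_VISIBLE_DEVICES is only read
# when the CUDA context is first created, so it must be set first.
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

import torch  # safe to import now; it will only see GPU 0

assert torch.cuda.device_count() == 1
```

Setting the same variable later, in an extension's script.py, runs after torch has already created its CUDA context, so it has no effect; that would explain why the Bark extension couldn't be pushed to the other GPU this way.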

bradstcroix avatar May 11 '23 23:05 bradstcroix

This issue has been closed after 30 days of inactivity. If you believe it is still relevant, please leave a comment below.

github-actions[bot] avatar Jun 11 '23 23:06 github-actions[bot]