bark
Multi-GPU Support?
Dear Developer,
Thank you for your fascinating contribution. This tool is indeed amazing. It works fine on my machine.
However, I have two RTX 4090s and I am also running LLaMA 2 70B (4-bit quantized), so there isn't much memory left on a single GPU.
My current workaround is to run with coarse_use_gpu only, because the preload parameter only accepts True/False rather than a GPU ID.
If there's a way to load different parts in different GPUs, it would solve this problem perfectly. Do you know if there's a way to do it, or will you update this feature in the future?
Thanks.
There is a way to do it, but it would require some light code changes to the model handling.
However, if you are running with coarse_use_gpu only and the other models on CPU just to save VRAM, then you actually just need to turn on CPU offloading. As long as the text model fits in VRAM, Bark will load and unload the three models as needed, so it only takes up as much space as the text model (which happens to be the biggest).
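For reference, here is a minimal sketch of turning on CPU offloading, assuming the `SUNO_OFFLOAD_CPU` environment variable described in the Bark README. Note that Bark reads these variables at import time, so they must be set before `bark` is imported:

```python
import os

# Enable CPU offloading: models are swapped between CPU and GPU as needed,
# so VRAM only needs to hold one model at a time (roughly the text model's size).
# These variable names follow the Bark README; set them BEFORE importing bark.
os.environ["SUNO_OFFLOAD_CPU"] = "True"
os.environ["SUNO_USE_SMALL_MODELS"] = "False"  # optional: keep full-size models

# Normal usage then follows, e.g.:
# from bark import preload_models, generate_audio
# preload_models()
# audio_array = generate_audio("Hello, world!")
```

This avoids any code changes to the model handling; only the environment configuration changes.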
Thanks for your reply. I didn't understand the part about the CPU offloading.