LocalAI
gpu + transformers-musicgen
It's been reported on Discord that transformers-musicgen needs some changes to fully support GPU:
- https://github.com/mudler/LocalAI/blob/master/backend/python/transformers/backend.py
- It's already implemented in transformers
- https://github.com/huggingface/transformers/issues/2704
- this may be a useful reference if anything needs tweaking
Hi @dave-gray101,
sadly, I don't have the time to work on it right now, but I can leave some info here for someone with time and goodwill 😄
The changes needed are:

In `LoadModel`, `SoundGeneration`, and `TTS`, add:

```python
self.CUDA = torch.cuda.is_available()
if self.CUDA:
    # Honor the GPU chosen in the config file, if any.
    if request.MainGPU:
        device_map = request.MainGPU
    else:
        device_map = "cuda:0"
```

This defaults to `cuda:0` if the user does not specify a GPU in the config file.
I don't know why the model is loaded on every inference request; that is probably a spot we can optimize.
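For reference, a minimal sketch of one way to avoid the reload (the cache shape and the `get_model` helper are hypothetical, not how the backend is structured today):

```python
from transformers import MusicgenForConditionalGeneration

# Hypothetical cache keyed by model name and target device, so a
# repeated inference request reuses the already-loaded weights.
_model_cache = {}

def get_model(model_name, device_map):
    key = (model_name, device_map)
    if key not in _model_cache:
        _model_cache[key] = MusicgenForConditionalGeneration.from_pretrained(
            model_name, device_map=device_map
        )
    return _model_cache[key]
```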
Modify the calls to `from_pretrained`, adding the `device_map`:

```python
self.processor = AutoProcessor.from_pretrained(model_name, device_map=device_map)
self.model = MusicgenForConditionalGeneration.from_pretrained(model_name, device_map=device_map)
```
The inputs must be on the same device used for inference:

```python
if self.CUDA:
    inputs = inputs.to("cuda")
```

This should be added before the `generate` call in `SoundGeneration` and `TTS`.
This should be enough.
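Putting the pieces together, a minimal sketch of the whole flow (the request field names `Model` and `text` are assumptions; `MainGPU` comes from the snippet above; the gRPC plumbing and response handling are omitted). Note that passing `device_map` to `from_pretrained` requires the `accelerate` package:

```python
import torch
from transformers import AutoProcessor, MusicgenForConditionalGeneration

class MusicgenBackend:
    def LoadModel(self, request):
        # Pick the device: use request.MainGPU when set, otherwise
        # default to cuda:0 (falling back to CPU without CUDA).
        self.CUDA = torch.cuda.is_available()
        device_map = "cpu"
        if self.CUDA:
            device_map = request.MainGPU if request.MainGPU else "cuda:0"

        # Load processor and model once, placing the weights on the
        # chosen device via device_map (the processor itself has no
        # weights; the kwarg mirrors the snippet above).
        model_name = request.Model  # assumed field name for the model id
        self.processor = AutoProcessor.from_pretrained(model_name, device_map=device_map)
        self.model = MusicgenForConditionalGeneration.from_pretrained(
            model_name, device_map=device_map
        )

    def SoundGeneration(self, request):
        inputs = self.processor(
            text=[request.text],  # assumed field name for the prompt
            padding=True,
            return_tensors="pt",
        )
        # Inputs must sit on the same device as the model weights.
        if self.CUDA:
            inputs = inputs.to("cuda")
        return self.model.generate(**inputs, max_new_tokens=256)
```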
Thanks a ton!
Once I finish up a handful of infra fixes, I'll take a stab at this.
Now that transformers already supports different ways to LoadModel anyway... it may be worth reconsidering merging transformers-musicgen into transformers proper.
Sure! We can implement other types: musicgen, diffusers, and so on.
I think this will also help reduce build time and the final Docker image size.