xtts-webui
DeepSpeed isn't optional
Even though DeepSpeed has an argument, you cannot run the program without it being installed. Even after removing the DeepSpeed imports it still wants to use DeepSpeed stuff, so you can never make it to the UI.
This means ROCm users are left out. I think DeepSpeed support may have been added in ROCm 6.0, but it's definitely not in ROCm 5.7 or below. The rest of the program likely works fine, as xtts-api and other projects work fine. It's just DeepSpeed which is preventing it from working.
If you could please make it so that DeepSpeed only gets imported when it's needed, and only if --deepspeed is used, I would be very grateful.
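A minimal sketch of what that request could look like, assuming the flag is parsed with argparse. The helper names load_deepspeed and parse_args are hypothetical and not taken from the xtts-webui code:

```python
import argparse
import importlib
import importlib.util


def load_deepspeed(enabled: bool):
    """Return the deepspeed module only when requested, otherwise None."""
    if not enabled:
        # Flag not set: never touch the package, so it need not be installed.
        return None
    if importlib.util.find_spec("deepspeed") is None:
        raise RuntimeError(
            "--deepspeed was passed but the deepspeed package is not installed"
        )
    # Import happens here, at the point of use, not at module import time.
    return importlib.import_module("deepspeed")


def parse_args(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument("--deepspeed", action="store_true", help="enable DeepSpeed")
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_args()
    deepspeed = load_deepspeed(args.deepspeed)  # None unless --deepspeed was given
```

With this shape, running without --deepspeed never imports the package, so a ROCm setup without DeepSpeed would still reach the UI.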
Hey mate!
I've created a Docker image to run this UI (and others) easily with an AMD GPU under Linux. https://github.com/M4TH1EU/ai-suite-rocm/tree/main
I haven't been able to make DeepSpeed work though; I get some errors during compilation. But personally, it works wonders without it.
++
I found a workaround to launch xtts-webui without deepspeed. First I uninstall deepspeed.
pip uninstall deepspeed
Then I launch it with this command.
python app.py
I also want to share a tip for any AMD user trying to fine-tune a model. xtts-webui uses faster-whisper, which only works on Nvidia graphics cards, so you have to prepare the dataset by hand.

Create a folder in the finetuned_models directory; mine was called Hiccup. Inside the Hiccup folder, create a dataset folder. Inside the dataset folder, create a wavs folder containing your wav files, plus three files: lang.txt, metadata_eval.csv and metadata_train.csv. Here are the files showing how they should be formatted: lang.txt, metadata_eval.csv, metadata_train.csv.

Launch xtts-webui and create a fine-tune model with the same name as the folder in finetuned_models (Hiccup in my case). I placed just one wav file in the "drop file here" box, clicked the "load params from output folder" button, and then clicked train.
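The folder layout described above could be created with a short script like the one below. This is only a sketch: the function name make_dataset_skeleton is made up for illustration, and it leaves lang.txt and the two CSV files empty, since their contents have to follow the attached example files.

```python
from pathlib import Path


def make_dataset_skeleton(models_dir, model_name):
    """Create <models_dir>/<model_name>/dataset with a wavs folder and empty metadata files."""
    dataset = Path(models_dir) / model_name / "dataset"
    (dataset / "wavs").mkdir(parents=True, exist_ok=True)
    for filename in ("lang.txt", "metadata_eval.csv", "metadata_train.csv"):
        # Files are created empty; fill them in following the example files.
        (dataset / filename).touch(exist_ok=True)
    return dataset


if __name__ == "__main__":
    # "Hiccup" is the model name used in this post; replace it with your own.
    print(make_dataset_skeleton("finetuned_models", "Hiccup"))
```

After running it, copy your wav files into the wavs folder and fill in the three metadata files before launching xtts-webui.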
Edit: If you get this error message:
FileNotFoundError: [Errno 2] No such file or directory: 'finetuned_models/Hiccup/ready/reference.wav'
Then you need to copy a wav file from the wavs folder into the ready folder and rename the copy to reference.wav. Click on load params from output and then click on train.
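The same fix can be scripted. A rough sketch, assuming the wavs folder sits at dataset/wavs inside the model folder as described earlier; the function name ensure_reference_wav is hypothetical, and taking the first wav alphabetically is an arbitrary choice:

```python
import shutil
from pathlib import Path


def ensure_reference_wav(model_dir):
    """Copy one wav from dataset/wavs to ready/reference.wav if it is missing."""
    model_dir = Path(model_dir)
    reference = model_dir / "ready" / "reference.wav"
    if reference.exists():
        return reference
    wavs = sorted((model_dir / "dataset" / "wavs").glob("*.wav"))
    if not wavs:
        raise FileNotFoundError(f"no wav files found in {model_dir / 'dataset' / 'wavs'}")
    reference.parent.mkdir(parents=True, exist_ok=True)
    # Arbitrary choice: use the first wav in alphabetical order as the reference.
    shutil.copyfile(wavs[0], reference)
    return reference


if __name__ == "__main__":
    print(ensure_reference_wav("finetuned_models/Hiccup"))
```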
I went to the funcs.py file in
xtts-webui/scripts/funcs.py
On line 4 I commented out the following:
"from scripts.resemble_enhance.enhancer.inference import denoise, enhance"
Then, just in case something in the file was still looking for it, I added this before the first function:
enhance=None
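A slightly more defensive variant of the same workaround, as a sketch only: instead of commenting the import out, wrap it so the names fall back to None when the import fails. Setting denoise to None as well and the helper enhancer_available are additions for illustration, not something already in funcs.py.

```python
try:
    from scripts.resemble_enhance.enhancer.inference import denoise, enhance
except ImportError:
    # The enhancer (or one of its dependencies) is not importable:
    # keep the names defined so later references do not raise NameError.
    denoise = None
    enhance = None


def enhancer_available():
    """True only when both enhancer functions were imported successfully."""
    return denoise is not None and enhance is not None
```

Code that calls enhance or denoise would still need to check enhancer_available() first, since calling None raises a TypeError.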
System Specs
- System76 Pop_OS Linux firmware Laptop
- Intel CPU
- Nvidia 4090 16 GB