
does this work with amd

Open s-b-repo opened this issue 1 year ago • 13 comments

python tortoise/do_tts.py --text "hi" --voice lolitest --preset fast
Traceback (most recent call last):
  File "/usr/lib/python3.10/site-packages/torch-1.13.1-py3.10-linux-x86_64.egg/torch/__init__.py", line 172, in _load_global_deps
    ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
  File "/usr/lib/python3.10/ctypes/__init__.py", line 374, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libcublas.so.11: cannot open shared object file: No such file or directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/run/media/cunny/437cec73-3450-44be-9f57-95eb54003e1e/tortoise-tts-main/tortoise-tts-main/tortoise/do_tts.py", line 4, in <module>
    import torch
  File "/usr/lib/python3.10/site-packages/torch-1.13.1-py3.10-linux-x86_64.egg/torch/__init__.py", line 217, in <module>
    _load_global_deps()
  File "/usr/lib/python3.10/site-packages/torch-1.13.1-py3.10-linux-x86_64.egg/torch/__init__.py", line 178, in _load_global_deps
    _preload_cuda_deps()
  File "/usr/lib/python3.10/site-packages/torch-1.13.1-py3.10-linux-x86_64.egg/torch/__init__.py", line 158, in _preload_cuda_deps
    ctypes.CDLL(cublas_path)
  File "/usr/lib/python3.10/ctypes/__init__.py", line 374, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: /usr/lib/python3.10/site-packages/nvidia_cuda_runtime_cu11-11.7.99-py3.10-linux-x86_64.egg/nvidia/cublas/lib/libcublas.so.11: cannot open shared object file: No such file or directory

s-b-repo avatar Mar 19 '23 02:03 s-b-repo

Been running fine for me with rocm. Try this:

pip uninstall torch torchaudio -y
pip install torch torchaudio --index-url https://download.pytorch.org/whl/rocm5.4.2

michaelnew avatar Mar 31 '23 19:03 michaelnew

@michaelnew It's great to hear it works with ROCm!

Do you mind sharing a bit more about how you installed it? Which AMD GPU exactly do you have?

I ask because when I try to run it, it does something with my GPU (I see SCLK going up in rocm-smi), but after a while I get a segmentation fault:

python tortoise/do_tts.py --text "hi" --voice random --preset fast
Fatal Python error: Segmentation fault

Current thread 0x00007fdd9dc2a300 (most recent call first):
  File "/home/piotr/repos/tortoise-tts/tortoise/api.py", line 390 in tts
  File "/home/piotr/repos/tortoise-tts/tortoise/api.py", line 331 in tts_with_preset
  File "tortoise/do_tts.py", line 37 in <module>
[1]    43982 segmentation fault (core dumped)  python tortoise/do_tts.py --text "hi" --voice random --preset fast

It would be super nice to figure out what exactly needs to be done to run it on AMD GPUs, and then we can update the README if it works well :)

pciazynski avatar Apr 14 '23 15:04 pciazynski

Oh, I've found the solution: https://github.com/RadeonOpenCompute/ROCm/issues/1698

It works for me on my AMD Radeon RX 6700 XT with the following env variable HSA_OVERRIDE_GFX_VERSION=10.3.0, for example:

HSA_OVERRIDE_GFX_VERSION=10.3.0 python tortoise/do_tts.py --text "hi" --voice random --preset fast
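
If you would rather not type the variable every time, the same thing can be done from a small Python launcher. This is only a sketch: `rocm_env` and `run_tts` are hypothetical helper names, the value 10.3.0 is the one that worked for the RX 6700 XT above and may be wrong for other cards, and it assumes you run it from the repository root. Starting `do_tts.py` as a child process means the variable is already in the environment before torch is loaded.

```python
import os
import subprocess
import sys


def rocm_env(gfx_version="10.3.0", base=None):
    """Copy an environment and add HSA_OVERRIDE_GFX_VERSION if it is not set."""
    env = dict(os.environ if base is None else base)
    # An override the user has already exported wins over the default.
    env.setdefault("HSA_OVERRIDE_GFX_VERSION", gfx_version)
    return env


def run_tts(text, voice="random", preset="fast"):
    """Hypothetical wrapper: run do_tts.py (relative to the repo root)."""
    cmd = [sys.executable, "tortoise/do_tts.py",
           "--text", text, "--voice", voice, "--preset", preset]
    return subprocess.run(cmd, env=rocm_env(), check=True)
```

Calling `run_tts("hi")` should then be equivalent to the command line above.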

pciazynski avatar Apr 14 '23 16:04 pciazynski

@pciazynski nice, glad you got it working. I'm using a Radeon VII and I didn't run into that issue. What I posted above was all I had to do, at least from what I recall.

As a side note though, I did have to fiddle around with requirements.txt quite a bit to get it to install cleanly into a new virtual environment, so I'll share that here if anyone needs it:

tqdm
rotary_embedding_torch
transformers==4.19
tokenizers
inflect
progressbar
einops==0.4.1
unidecode
scipy==1.10.1
librosa==0.9.1
ffmpeg
threadpoolctl
appdirs
--extra-index-url https://download.pytorch.org/whl/rocm5.4.2
torchaudio
--extra-index-url https://download.pytorch.org/whl/rocm5.4.2
torch

michaelnew avatar Apr 14 '23 20:04 michaelnew

broke my linux install

s-b-repo avatar Apr 17 '23 21:04 s-b-repo

Thanks a lot for this thread. I'm trying to run this on an AMD Radeon Pro 580X 8 GB (Mac Pro). I used @michaelnew's requirements file from above and it seems to be working, but I get this warning:

torch/amp/autocast_mode.py:204: UserWarning: User provided device_type of 'cuda', but CUDA is not available. Disabling
  warnings.warn('User provided device_type of \'cuda\', but CUDA is not available. Disabling')

and I see that the GPU is not actively working. Do you get the same warning?

karabaralex avatar Aug 03 '23 01:08 karabaralex

@karabaralex no, that warning almost certainly means you aren't running on the GPU. For me it's about a 50x speedup on the GPU vs CPU, so it should be pretty obvious too when you run inference.

You can check for cuda in the python interpreter with this:

import torch
torch.cuda.is_available() # should return True

Just make sure you're using python from your virtual environment (if you're using one) rather than system python.
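
For a bit more detail than True/False, something like the following may help. It is a rough sketch, and `describe_backend` is a made-up helper name. It relies on `torch.version.hip` being set on ROCm builds and `torch.version.cuda` being set on CUDA builds, so it can tell you whether the torch in the current interpreter is a ROCm wheel at all. It also catches the `OSError` from the first post, which is what a CUDA build raises when it cannot find its libraries.

```python
import importlib


def describe_backend(torch_module=None):
    """Summarise which GPU backend the given torch module was built for."""
    if torch_module is None:
        try:
            torch_module = importlib.import_module("torch")
        except (ImportError, OSError) as exc:
            # OSError covers missing shared libraries such as libcublas.so.11.
            return f"torch could not be imported: {exc}"
    hip = getattr(torch_module.version, "hip", None)
    cuda = getattr(torch_module.version, "cuda", None)
    if hip:
        build = f"ROCm/HIP build ({hip})"
    elif cuda:
        build = f"CUDA build ({cuda})"
    else:
        build = "CPU-only build"
    # On ROCm builds the AMD GPU is still reported through torch.cuda.
    available = torch_module.cuda.is_available()
    return f"{build}, GPU available: {available}"


print(describe_backend())
```

If this reports a CUDA or CPU-only build on an AMD machine, the ROCm wheels were probably not the ones that got installed into that environment.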

michaelnew avatar Aug 03 '23 18:08 michaelnew

For AMD you have to look at ROCm. As far as I know, CUDA only works for NVIDIA.

manmay-nakhashi avatar Aug 04 '23 03:08 manmay-nakhashi

It is rocm. PyTorch just refers to it as cuda.

michaelnew avatar Aug 04 '23 05:08 michaelnew

I cannot get this to work for the life of me. I have a 6700XT like @pciazynski, and using HSA_OVERRIDE_GFX_VERSION=10.3.0 got me through my segmentation fault bug, but I have run into a new problem. Upon running my command I am presented with:

(tortoise) [ddadude@Spire tortoise-tts]$ HSA_OVERRIDE_GFX_VERSION=10.3.0 python tortoise/do_tts.py --text "hi" --voice random --preset fast
Generating autoregressive samples..
100%|████████████████████| 12/12 [00:07<00:00, 1.69it/s]
Computing best candidates using CLVP
100%|████████████████████| 12/12 [00:02<00:00, 5.93it/s]
Transforming autoregressive outputs into audio..
MIOpen(HIP): Error [Compile] 'hiprtcCompileProgram(prog.get(), c_options.size(), c_options.data())' naive_conv.cpp: HIPRTC_ERROR_COMPILATION (6)
MIOpen(HIP): Error [BuildHip] HIPRTC status = HIPRTC_ERROR_COMPILATION (6), source file: naive_conv.cpp
MIOpen(HIP): Warning [BuildHip] /tmp/comgr-7e62c6/input/naive_conv.cpp:39:10: fatal error: 'limits' file not found
#include <limits> // std::numeric_limits
         ^~~~~~~~
1 error generated when compiling for gfx1030.
terminate called after throwing an instance of 'miopen::Exception'
  what(): /long_pathname_so_that_rpms_can_package_the_debug_info/data/driver/MLOpen/src/hipoc/hipoc_program.cpp:304: Code object build failed. Source: naive_conv.cpp
Aborted (core dumped)

Specs:
CPU: Ryzen 7 3800X
RAM: 16GB
GPU: AMD Radeon RX 6700XT
OS: Linux Mint 21.1

I know very little about ROCm, HIP, and AMDGPU in general, and so I fear I am missing something very obvious here, especially because reading the errors I think I am just missing a library, but I have no idea where to get it or how to install it, and AMD docs have led me in circles. Any help is greatly appreciated, thank you all.

thatguy4194 avatar Oct 31 '23 01:10 thatguy4194

An update: I was using the wrong version of ROCm. I fixed this by switching to the ROCm 6.1 builds of pytorch and torchaudio.

thatguy4194 avatar Nov 01 '23 00:11 thatguy4194

This is working for me but I'm only getting 6.1it/s (half the speed of a macbook laptop).

claydegruchy avatar Nov 03 '23 11:11 claydegruchy

@claydegruchy please check to make sure it's actually running on the gpu. I ran into this issue and realized I was using a version of torch that didn't support ROCm. Check gpu usage using the radeontop tool
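
Besides watching radeontop, a quick smoke test from Python can confirm that a tensor op really runs on the card. This is just a sketch with a made-up helper name (`gpu_smoke_test`). It uses the "cuda" device name because that is what ROCm builds of PyTorch expose, and it returns None when torch does not see a usable GPU.

```python
def gpu_smoke_test(torch_module):
    """Run one small matmul on the GPU; return (device name, checksum) or None."""
    if not torch_module.cuda.is_available():
        return None  # no usable GPU, so inference would fall back to the CPU
    device = torch_module.device("cuda")  # ROCm builds also use the "cuda" name
    x = torch_module.ones((256, 256), device=device)
    # Every entry of x @ x is 256, so the sum should be 256 * 256 * 256.
    checksum = (x @ x).sum().item()
    return torch_module.cuda.get_device_name(0), checksum


try:
    import torch
except (ImportError, OSError):
    print("torch could not be imported in this interpreter")
else:
    print(gpu_smoke_test(torch))
```

If this prints None, you are on the CPU no matter what the it/s numbers look like.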

fakerybakery avatar Nov 08 '23 00:11 fakerybakery