
[RuntimeError]: CUDA error: operation not supported CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect. For debugging consider passing CUDA_LAUNCH_BLOCKING=1. Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

Open abujr101 opened this issue 1 year ago • 7 comments

Project Version

3.2.6

Platform and OS Version

windows 11

Affected Devices

Amd RX 580

Existing Issues

No response

What happened?

I am on an AMD GPU; its shader ISA is GFX803. I followed all the instructions for AMD GPUs, but I am getting this error when inferencing:

To create a public link, set share=True in launch().
Traceback (most recent call last):
  File "D:\Applio-3.2.6\env\lib\site-packages\gradio\queueing.py", line 536, in process_events
    response = await route_utils.call_process_api(
  File "D:\Applio-3.2.6\env\lib\site-packages\gradio\route_utils.py", line 321, in call_process_api
    output = await app.get_blocks().process_api(
  File "D:\Applio-3.2.6\env\lib\site-packages\gradio\blocks.py", line 1935, in process_api
    result = await self.call_function(
  File "D:\Applio-3.2.6\env\lib\site-packages\gradio\blocks.py", line 1520, in call_function
    prediction = await anyio.to_thread.run_sync(  # type: ignore
  File "D:\Applio-3.2.6\env\lib\site-packages\anyio\to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "D:\Applio-3.2.6\env\lib\site-packages\anyio\_backends\_asyncio.py", line 2405, in run_sync_in_worker_thread
    return await future
  File "D:\Applio-3.2.6\env\lib\site-packages\anyio\_backends\_asyncio.py", line 914, in run
    result = context.run(func, *args)
  File "D:\Applio-3.2.6\env\lib\site-packages\gradio\utils.py", line 826, in wrapper
    response = f(*args, **kwargs)
  File "D:\Applio-3.2.6\core.py", line 192, in run_infer_script
    infer_pipeline.convert_audio(
  File "D:\Applio-3.2.6\rvc\infer\infer.py", line 254, in convert_audio
    self.get_vc(model_path, sid)
  File "D:\Applio-3.2.6\rvc\infer\infer.py", line 435, in get_vc
    self.setup_network()
  File "D:\Applio-3.2.6\rvc\infer\infer.py", line 487, in setup_network
    self.net_g.half() if self.config.is_half else self.net_g.float()
  File "D:\Applio-3.2.6\env\lib\site-packages\torch\nn\modules\module.py", line 1011, in half
    return self._apply(lambda t: t.half() if t.is_floating_point() else t)
  File "D:\Applio-3.2.6\env\lib\site-packages\torch\nn\modules\module.py", line 779, in _apply
    module._apply(fn)
  File "D:\Applio-3.2.6\env\lib\site-packages\torch\nn\modules\module.py", line 779, in _apply
    module._apply(fn)
  File "D:\Applio-3.2.6\env\lib\site-packages\torch\nn\modules\module.py", line 804, in _apply
    param_applied = fn(param)
  File "D:\Applio-3.2.6\env\lib\site-packages\torch\nn\modules\module.py", line 1011, in half
    return self._apply(lambda t: t.half() if t.is_floating_point() else t)
RuntimeError: CUDA error: operation not supported
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

abujr101 avatar Sep 30 '24 07:09 abujr101

Try downgrading to torch 2.2.1. The RX 580 is quite old and does not support FP16.

env\python -m pip install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 --index-url https://download.pytorch.org/whl/cu118

Patch ZLUDA after that.
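For context, the crash comes from the unconditional `self.net_g.half()` call in `rvc/infer/infer.py`. A minimal sketch (not Applio's actual code; the `apply_precision` helper is hypothetical) of how such a call could fall back to FP32 on GPUs without FP16 support:

```python
import torch
import torch.nn as nn

def apply_precision(model: nn.Module, is_half: bool) -> nn.Module:
    """Request FP16, but fall back to FP32 if the device rejects it."""
    if is_half:
        try:
            return model.half()
        except RuntimeError:
            # e.g. GFX803 under ZLUDA: FP16 unsupported, keep FP32
            return model.float()
    return model.float()

net = nn.Linear(4, 4)
net = apply_precision(net, is_half=False)
print(net.weight.dtype)  # torch.float32
```

In Applio's config the same effect is achieved by forcing `is_half` off for hardware that cannot run half precision.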

AznamirWoW avatar Sep 30 '24 08:09 AznamirWoW

same issue at rtx 4070ti super

Kirasabi avatar Oct 02 '24 17:10 Kirasabi

I have the same problem with the RX 6600 but it's on Windows 10

Jessegamer8k avatar Oct 04 '24 02:10 Jessegamer8k

yeah rx 7800 too should buy nvidia ;/

fridolie avatar Oct 07 '24 17:10 fridolie

> yeah rx 7800 too should buy nvidia ;/

you probably did not follow the readme

AznamirWoW avatar Oct 07 '24 17:10 AznamirWoW

I'm a beginner with this, so sorry if I don't understand some of this.

Just follow the readme

AznamirWoW avatar Oct 08 '24 14:10 AznamirWoW

One possibility is that the pip install did not replace the cu121 build of torch. You should have seen that while trying to install it.

In that case, run env\python -m pip uninstall torch torchvision torchaudio prior to installing the cu118 torch.
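Putting the two commands together, the full sequence would look roughly like this (assuming the default Applio layout with the bundled interpreter at env\python, run from the Applio root):

```shell
:: remove any cu121 build first, then install the cu118 build
env\python -m pip uninstall -y torch torchvision torchaudio
env\python -m pip install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 --index-url https://download.pytorch.org/whl/cu118
```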

AznamirWoW avatar Oct 11 '24 21:10 AznamirWoW

I was able to reproduce a similar error when inferring a long audio clip.

blaisewf avatar Oct 12 '24 18:10 blaisewf

The fix for "RuntimeError: CUDA error: operation not supported" is uninstalling the cu121 build of torch and reinstalling cu118.

ZLUDA does not work with cu121. I've updated the installation page to include an uninstall script.

AznamirWoW avatar Oct 12 '24 18:10 AznamirWoW

I was getting this error too. What fixed it for me was adding the ROCm bin directory to Path under system environment variables. Even though ROCm was already there as HIP, it needed to be added to Path for it to work.

veryscaryone avatar Oct 24 '24 19:10 veryscaryone