
Severe performance degradation in the newest version for GPU ONNX inference with RVC models

Open SirBitesalot opened this issue 1 year ago • 6 comments

The dropdown to select the OnnxExecutionProvider is gone, and ONNX inference defaults to the CPU (RVC model). I can no longer use this, as CPU inference is too slow.

SirBitesalot avatar May 22 '23 15:05 SirBitesalot

I could make the dropdown appear again by modifying RVC.json in dist/assets. I added { "name": "framework", "options": {"showFramework": true} }, but now it only shows an error when changing the provider:

[Voice Changer] update configuration: onnxExecutionProvider CUDAExecutionProvider
onnxExecutionProvider is not mutable variable or unknown variable!

So I assume this feature got completely removed on purpose? Or is there some binding/packaging error that caused this?

SirBitesalot avatar May 22 '23 17:05 SirBitesalot

Yes, on purpose. If you select GPU >= 0, the GPU is used; if GPU < 0, the CPU is used. Which value is your GPU?
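
For reference, a minimal sketch of how a GPU-index rule like this is commonly mapped onto ONNX Runtime execution providers; this is illustrative only, not the actual code in this repository, and the model path is a placeholder:

```python
import onnxruntime as ort

def providers_for(gpu: int):
    # Non-negative index -> try the CUDA provider on that device, with CPU as fallback.
    # Negative index -> CPU only.
    if gpu >= 0:
        return [("CUDAExecutionProvider", {"device_id": gpu}), "CPUExecutionProvider"]
    return ["CPUExecutionProvider"]

# "model.onnx" is a placeholder for the exported RVC model.
session = ort.InferenceSession("model.onnx", providers=providers_for(0))
```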

w-okada avatar May 22 '23 21:05 w-okada

Ah ok. I think there is another issue then: my GPU is set to 0, but it only uses the CPU when I use an ONNX model. It seems to work fine when using a non-ONNX model. I will see if I can gather more information.

SirBitesalot avatar May 23 '23 04:05 SirBitesalot

After some additional testing, I can say that there seems to be an issue with GPU and ONNX in version MMVCServerSIO_win_onnxgpu-cuda_v.1.5.3.2. Either it is not running on the GPU at all, even when choosing GPU 0, or there is some massive CPU overhead. I ran the tests multiple times and switched between GPU -1 and 0 a few times, but the results stay the same. I don't know whether it has something to do with my machine or whether it is always the case.

Here are some results. I played an audio file in VLC and routed it via a virtual audio cable to the Voice Conversion App using Client Device, so the same audio is used for all tests. These are my settings and results using the only old version I still have (MMVCServerSIO_win_onnxgpu-cuda_v.1.5.2.6a) and the newest one (MMVCServerSIO_win_onnxgpu-cuda_v.1.5.3.2). I am running on an RTX 4090 and a Ryzen 5800X3D.

Settings for Test with MMVCServerSIO_win_onnxgpu-cuda_v.1.5.2.6a:

| Setting | Value | Notes |
|---|---|---|
| buf | 1365 ms | High value, to use the same for CPU and GPU tests |
| Input Chunk Num | 512 | - |
| Extra Data Length | 65536 | - |
| Cross Fade Overlap Size | 4096 | - |
| RVC Quality | High | - |

Results for pytorch on GPU:

| Metric | Value | Notes |
|---|---|---|
| res | ~200 ms | As expected, low latency on GPU |
| CPU usage | ~30% | - |
| GPU (CUDA) usage | ~7% | - |
| GPU (3D) usage | ~3% | - |
| GPU (VRAM) usage (total system) | 3.1 / 24 GB | - |

Results for ONNX with CPUExecutionProvider:

| Metric | Value | Notes |
|---|---|---|
| res | ~1060 ms | As expected, higher latency on CPU |
| CPU usage | ~30% | - |
| GPU (CUDA) usage | ~7% | - |
| GPU (3D) usage | ~3% | - |
| GPU (VRAM) usage (total system) | 3.1 / 24 GB | - |

Results for ONNX with GPUExecutionProvider:

| Metric | Value | Notes |
|---|---|---|
| res | ~172 ms | Even lower than pytorch! |
| CPU usage | ~30% | - |
| GPU (CUDA) usage | ~2% | Lower usage than pytorch! |
| GPU (3D) usage | ~2% | - |
| GPU (VRAM) usage (total system) | 3.1 / 24 GB | - |

Settings for MMVCServerSIO_win_onnxgpu-cuda_v.1.5.3.2:

| Setting | Value | Notes |
|---|---|---|
| buf | 2560 ms | Needed to be increased (see ONNX GPU results) |
| Input Chunk Num | 960 | - |
| Extra Data Length | 65536 | - |
| Cross Fade Overlap Size | 4096 | - |
| RVC Quality | High | - |

Results for pytorch on GPU:

| Metric | Value | Notes |
|---|---|---|
| res | ~270 ms | Higher, but could be caused by the bigger buf |
| CPU usage | ~30% | - |
| GPU (CUDA) usage | ~2% | - |
| GPU (3D) usage | ~2% | - |
| GPU (VRAM) usage (total system) | 3.0 / 24 GB | - |

Results for ONNX on CPU (setting GPU to -1):

| Metric | Value | Notes |
|---|---|---|
| res | ~1820 ms | Much higher, but could still be caused by the bigger buf |
| CPU usage | ~55% | Almost double the CPU usage! |
| GPU (CUDA) usage | ~0% | - |
| GPU (3D) usage | ~2% | - |
| GPU (VRAM) usage (total system) | 3.0 / 24 GB | - |

Results for ONNX on GPU (setting GPU to 0):

| Metric | Value | Notes |
|---|---|---|
| res | ~2060 ms | Slower than CPU and >10x slower than the old version |
| CPU usage | ~60% | Seems to still run on the CPU |
| GPU (CUDA) usage | ~0% | No CUDA usage |
| GPU (3D) usage | ~2% | - |
| GPU (VRAM) usage (total system) | 3.0 / 24 GB | - |
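
As a sanity check independent of the GUI, the exported model could also be timed directly with ONNX Runtime. A rough sketch, using a placeholder model path and dummy inputs (a real RVC model has specific input names, shapes, and dtypes, so the feed would need adapting):

```python
import time
import numpy as np
import onnxruntime as ort

def mean_latency(model_path: str, providers, n_runs: int = 20) -> float:
    sess = ort.InferenceSession(model_path, providers=providers)
    # Build a dummy feed for every model input, substituting 1 for dynamic dimensions.
    feed = {}
    for inp in sess.get_inputs():
        shape = [d if isinstance(d, int) else 1 for d in inp.shape]
        dtype = np.float32 if "float" in inp.type else np.int64
        feed[inp.name] = np.zeros(shape, dtype=dtype)
    sess.run(None, feed)  # warm-up run (CUDA allocations happen here)
    start = time.perf_counter()
    for _ in range(n_runs):
        sess.run(None, feed)
    return (time.perf_counter() - start) / n_runs

print("CPU :", mean_latency("model.onnx", ["CPUExecutionProvider"]))
print("CUDA:", mean_latency("model.onnx", ["CUDAExecutionProvider", "CPUExecutionProvider"]))
```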

SirBitesalot avatar May 24 '23 12:05 SirBitesalot

That's certainly strange. First of all, is your GPU detected by the software?

VoiceChanger Initialized (GPU_NUM:1, mps_enabled:False)

And I guess your answer is yes, because the behavior changed when you set the GPU value to 0 vs. -1.

But I think the behavior changing means the configuration is changed too. In other words, the executor is changed but slow. Strange...
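
A quick way to confirm that the bundled ONNX Runtime build can see CUDA at all would be something like this (assuming a Python console in the same environment as the server):

```python
import onnxruntime as ort

print(ort.__version__)
print(ort.get_device())               # "GPU" only for the onnxruntime-gpu build
print(ort.get_available_providers())  # should include "CUDAExecutionProvider"
```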

w-okada avatar May 24 '23 21:05 w-okada

@w-okada yes, the GPU is detected. I will try to set up the dev environment in the coming days and see if that gives more clues. It probably has something to do with my setup/environment, as other people have reported that ONNX+GPU still works.

I will report back if I find anything useful.
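
One thing worth checking while debugging: ONNX Runtime can silently fall back to the CPU provider when the CUDA/cuDNN libraries fail to load, even if CUDAExecutionProvider was requested. A small sketch with a placeholder model path:

```python
import onnxruntime as ort

sess = ort.InferenceSession(
    "model.onnx",  # placeholder for the exported RVC model
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)
# If CUDA failed to initialize, the list below will only contain "CPUExecutionProvider".
print(sess.get_providers())
```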

SirBitesalot avatar May 24 '23 22:05 SirBitesalot

close.

w-okada avatar Jun 04 '23 10:06 w-okada