voice-changer
Severe performance degradation in newest version for GPU ONNX inference with RVC models
The dropdown to select the OnnxExecutionProvider is gone, so ONNX inference defaults to the CPU (RVC model). I can no longer use this, as CPU inference is too slow.
I could make the dropdown appear again by modifying RVC.json in dist/assets.
I added:
{ "name": "framework", "options": {"showFramework": true} },
But now it only shows an error when changing the provider:
[Voice Changer] update configuration: onnxExecutionProvider CUDAExecutionProvider
onnxExecutionProvider is not mutable variable or unknown variable!
So I assume this feature was removed on purpose?
Or is there some binding/packaging error that caused this?
Yes, on purpose. If you select GPU >= 0, the GPU is used; if GPU < 0, the CPU is used. Which value is your GPU?
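For context, onnxruntime tries the entries of the provider list passed to `InferenceSession` in order and falls back to later ones, so a GPU index can map to providers roughly like this. This is a minimal sketch of the described behavior; `providers_for` is a hypothetical helper name, not the app's actual code:

```python
# Hypothetical sketch: map the app's GPU setting to an ONNX Runtime provider list.
# onnxruntime tries providers in order and falls back to later entries.
def providers_for(gpu: int):
    if gpu >= 0:
        # Request CUDA on the given device, with CPU as the fallback.
        return [("CUDAExecutionProvider", {"device_id": gpu}), "CPUExecutionProvider"]
    # gpu < 0 means CPU-only inference.
    return ["CPUExecutionProvider"]

# The result would then be passed as:
#   ort.InferenceSession(model_path, providers=providers_for(gpu))
```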
Ah, OK. I think there is another issue then: my GPU is set to 0, but it is only using the CPU when I use an ONNX model. It seems to work fine when using a non-ONNX model. I will see if I can gather more information.
So after some additional testing, I can say that there seems to be an issue with GPU and ONNX in version MMVCServerSIO_win_onnxgpu-cuda_v.1.5.3.2. Either it is not running on the GPU at all, even when choosing GPU 0, or there is some massive CPU overhead. I ran the tests multiple times and switched between GPU -1 and 0 a few times, but the results stay the same. I don't know whether it has something to do with my machine or whether it is always the case.

Here are some results. I played an audio file in VLC and routed it via a virtual audio cable to the voice changer app using Client Device, so the same audio is used for all tests. These are my settings and results using the only old version I still have (MMVCServerSIO_win_onnxgpu-cuda_v.1.5.2.6a) and the newest one (MMVCServerSIO_win_onnxgpu-cuda_v.1.5.3.2). I am running it on an RTX 4090 and a Ryzen 5800X3D.
Settings for test with MMVCServerSIO_win_onnxgpu-cuda_v.1.5.2.6a:
Setting | Value | Notes |
---|---|---|
buf | 1365 ms | High value, to use the same for CPU and GPU tests |
Input Chunk Num | 512 | - |
Extra Data Length | 65536 | - |
Cross Fade Overlap Size | 4096 | - |
RVC Quality | High | - |
Results for PyTorch on GPU:
Metric | Value | Notes |
---|---|---|
res | ~200 ms | As expected, low latency on GPU |
CPU Usage | ~30% | - |
GPU (CUDA) Usage | ~7% | - |
GPU (3D) Usage | ~3% | - |
GPU (VRAM) Usage (Total System) | 3.1/24 GB | - |
Results for ONNX with CPUExecutionProvider:
Metric | Value | Notes |
---|---|---|
res | ~1060 ms | As expected, higher latency on CPU |
CPU Usage | ~30% | - |
GPU (CUDA) Usage | ~7% | - |
GPU (3D) Usage | ~3% | - |
GPU (VRAM) Usage (Total System) | 3.1/24 GB | - |
Results for ONNX with CUDAExecutionProvider:
Metric | Value | Notes |
---|---|---|
res | ~172 ms | Even lower than PyTorch! |
CPU Usage | ~30% | - |
GPU (CUDA) Usage | ~2% | Lower usage than PyTorch! |
GPU (3D) Usage | ~2% | - |
GPU (VRAM) Usage (Total System) | 3.1/24 GB | - |
Settings for MMVCServerSIO_win_onnxgpu-cuda_v.1.5.3.2:
Setting | Value | Notes |
---|---|---|
buf | 2560 ms | Needed to be increased (see ONNX GPU) |
Input Chunk Num | 960 | - |
Extra Data Length | 65536 | - |
Cross Fade Overlap Size | 4096 | - |
RVC Quality | High | - |
Results for PyTorch on GPU:
Metric | Value | Notes |
---|---|---|
res | ~270 ms | Higher, but could be caused by the bigger buf |
CPU Usage | ~30% | - |
GPU (CUDA) Usage | ~2% | - |
GPU (3D) Usage | ~2% | - |
GPU (VRAM) Usage (Total System) | 3.0/24 GB | - |
Results for ONNX on CPU (setting GPU to -1):
Metric | Value | Notes |
---|---|---|
res | ~1820 ms | Much higher, but could still be caused by the bigger buf |
CPU Usage | ~55% | Almost double the CPU usage! |
GPU (CUDA) Usage | ~0% | - |
GPU (3D) Usage | ~2% | - |
GPU (VRAM) Usage (Total System) | 3.0/24 GB | - |
Results for ONNX on GPU (setting GPU to 0):
Metric | Value | Notes |
---|---|---|
res | ~2060 ms | Slower than CPU, and >10x slower than the old version |
CPU Usage | ~60% | Seems to still run on the CPU |
GPU (CUDA) Usage | ~0% | No CUDA usage |
GPU (3D) Usage | ~2% | - |
GPU (VRAM) Usage (Total System) | 3.0/24 GB | - |
That's certainly strange. First of all, is your GPU detected by the software?
VoiceChanger Initialized (GPU_NUM:1, mps_enabled:False)
And I guess your answer is yes, because the behavior changed when you set the GPU value between 0 and -1.
But I think the behavior changing means the configuration changed too. In other words, the Executor is changed, but it is slow. Strange...
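One way to tell whether the executor really switched is to compare the requested providers against what the session reports as active (`session.get_providers()` in onnxruntime): if CUDAExecutionProvider is missing from the active list, onnxruntime silently fell back to CPU. A minimal sketch of that comparison, where `detect_fallback` is a hypothetical helper name:

```python
# Hypothetical check: which requested providers did not end up active?
# With onnxruntime you would pass session.get_providers() as `active`.
def detect_fallback(requested, active):
    """Return requested providers that are not active (i.e. fell back)."""
    return [p for p in requested if p not in active]

# Example: CUDA was requested but only the CPU provider is active.
missing = detect_fallback(
    ["CUDAExecutionProvider", "CPUExecutionProvider"],
    ["CPUExecutionProvider"],
)
# missing == ["CUDAExecutionProvider"] -> the session fell back to CPU.
```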
@w-okada yes, the GPU is detected. I will try to set up the dev environment in the coming days and see if that gives more clues. It probably has something to do with my setup/environment, as other people have reported that ONNX + GPU is still working.
I will report back if I find anything useful.
close.