CosyVoice
The inference speed is too slow
Describe the bug
The inference speed is too slow.
I tried both an RTX 4080 and an RTX 5090 on Linux, and the RTF was nearly the same, around 0.7 on both. GPU utilization stays below 50%, and on the 5090 the power draw is only 100+ W, which is far below its limit.
I enabled both TensorRT (`trt`) and JIT (`jit`). I am using PyTorch 2.7 with CUDA 12.8/12.9 installed.
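For reference, here is a minimal sketch of how the RTF above was computed, to rule out measurement error. The `synthesize` callable is a hypothetical stand-in for the actual TTS call, not the CosyVoice API; it just needs to return the duration of the generated audio in seconds.

```python
import time

def real_time_factor(wall_seconds: float, audio_seconds: float) -> float:
    """RTF = processing time / audio duration; values below 1.0 are faster than real time."""
    return wall_seconds / audio_seconds

def measure_rtf(synthesize, text: str) -> float:
    """Time one synthesis call and compute its RTF.

    `synthesize` is a hypothetical stand-in: any callable that takes text and
    returns the duration (in seconds) of the audio it generated.
    """
    start = time.perf_counter()
    audio_seconds = synthesize(text)
    elapsed = time.perf_counter() - start
    return real_time_factor(elapsed, audio_seconds)

if __name__ == "__main__":
    # Stand-in synthesizer: takes ~0.14 s of wall time to "generate" 0.2 s of
    # audio, i.e. an RTF of roughly 0.7 like the one reported above.
    def fake_synthesize(text: str) -> float:
        time.sleep(0.14)
        return 0.2

    print(f"RTF: {measure_rtf(fake_synthesize, 'hello'):.2f}")
```

With an actual CosyVoice call substituted for `fake_synthesize`, averaging over several utterances (and skipping the first warm-up call) gives a more stable RTF number.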