pytorch-StarGAN-VC

About real-time inferencing

Open tranctan opened this issue 4 years ago • 0 comments

Hi, nice work here. I just have an additional question. The authors of the paper indicated that inference can run in real time.

But as far as I can tell, for an arbitrary input audio from a source speaker, we first need to decompose it into F0, aperiodicity (ap), and spectral envelope (sp). This step is fairly slow, and its cost grows with the length of the audio. In my case, a 14.6 s input takes nearly 5 s to decompose, which becomes a bottleneck for the whole inference phase.
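For reference, this is roughly the decomposition step I am measuring, assuming WORLD analysis is done with pyworld as in typical StarGAN-VC preprocessing (the file name and sampling rate below are just placeholders for my setup):

```python
# Minimal sketch of the WORLD decomposition step being timed.
# Assumes pyworld is used for feature extraction; "source.wav" and
# fs=16000 are hypothetical values standing in for my actual input.
import time

import numpy as np
import librosa
import pyworld

wav_path = "source.wav"   # placeholder path to the source speaker's audio
fs = 16000                # assumed sampling rate

# pyworld expects float64 audio
x, _ = librosa.load(wav_path, sr=fs, dtype=np.float64)

start = time.time()
f0, t = pyworld.harvest(x, fs)           # F0 contour (harvest is accurate but slow)
sp = pyworld.cheaptrick(x, f0, t, fs)    # spectral envelope
ap = pyworld.d4c(x, f0, t, fs)           # aperiodicity
elapsed = time.time() - start

print(f"WORLD decomposition took {elapsed:.2f} s for {len(x) / fs:.1f} s of audio")
```

One thing I have considered is swapping `pyworld.harvest` for `pyworld.dio` followed by `pyworld.stonemask`, which is usually faster at some cost in F0 accuracy, but I am not sure whether that alone is enough to reach real time.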

Is there anything I can do to achieve real-time inference? Please let me know; thanks in advance, and I appreciate any help :D

tranctan, Apr 06 '20 09:04