Add xfolding to tacotron2 infer pipeline

Open mthrok opened this issue 4 years ago • 2 comments

When vocoding a single example, folding the input into a batch of chunks lets inference run faster.

https://github.com/pytorch/audio/blob/31dbb7540c78fe5d176948764cf9a20f55ac80dc/examples/pipeline_wavernn/wavernn_inference_wrapper.py#L167-L177
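For readers who do not want to follow the link, here is a minimal plain-Python sketch of the folding idea, not the torchaudio implementation. The names `fold_with_overlap` and `xfade_and_unfold` and the parameters `target` and `overlap` are hypothetical, chosen for illustration. It uses a linear crossfade for simplicity, which may differ from the fade used in the linked code, and it assumes `target >= 1`.

```python
import math


def fold_with_overlap(samples, target, overlap):
    """Split a 1-D sequence into equal-length chunks that share `overlap` samples.

    Hypothetical helper for illustration only. Each chunk has length
    target + 2 * overlap, and consecutive chunks start target + overlap apart.
    The input is zero-padded at the end so the last chunk is full.
    """
    stride = target + overlap
    chunk_len = target + 2 * overlap
    num_chunks = max(1, math.ceil((len(samples) - overlap) / stride))
    padded_len = num_chunks * stride + overlap
    padded = list(samples) + [0.0] * (padded_len - len(samples))
    return [padded[i * stride : i * stride + chunk_len] for i in range(num_chunks)]


def xfade_and_unfold(chunks, target, overlap, length):
    """Stitch chunks back into one sequence, crossfading the shared regions.

    Assumption: a linear fade whose fade-in and fade-out weights sum to 1
    at every overlapping position.
    """
    stride = target + overlap
    chunk_len = target + 2 * overlap
    out = [0.0] * (len(chunks) * stride + overlap)
    last = len(chunks) - 1
    for i, chunk in enumerate(chunks):
        for j, value in enumerate(chunk):
            weight = 1.0
            if i > 0 and j < overlap:
                # Fade in over the region shared with the previous chunk.
                weight = (j + 1) / (overlap + 1)
            elif i < last and j >= chunk_len - overlap:
                # Fade out over the region shared with the next chunk.
                weight = (chunk_len - j) / (overlap + 1)
            out[i * stride + j] += weight * value
    # Drop the zero padding added by fold_with_overlap.
    return out[:length]


signal = [float(n) for n in range(23)]
chunks = fold_with_overlap(signal, target=5, overlap=2)
# A vocoder would process all chunks as one batch here; this sketch leaves them unchanged.
restored = xfade_and_unfold(chunks, target=5, overlap=2, length=len(signal))
```

The speedup comes from treating the chunks as a batch, so an autoregressive vocoder generates them in parallel instead of walking the whole sequence step by step. A real vocoder also produces more output samples than input frames, so the sizes used when unfolding would need to be scaled accordingly.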

I excluded it from the initial tacotron2 pipeline because of https://github.com/pytorch/audio/issues/1742. We can re-implement it while working out why #1742 happens.

https://github.com/pytorch/audio/blob/31dbb7540c78fe5d176948764cf9a20f55ac80dc/examples/pipeline_wavernn/wavernn_inference_wrapper.py#L32-L129

mthrok, Oct 21 '21 22:10

Is this tacotron2 related? Or is the method only for WaveRNN?

nateanl, Oct 26 '21 11:10

It's for WaveRNN, but it's implemented in the TTS pipeline, in this class:

https://github.com/pytorch/audio/blob/56f3b92746022cad8bd20f23b7a92023fb5560cc/torchaudio/pipelines/_tts/impl.py#L71-L96

mthrok, Oct 27 '21 01:10