voicefixer_main
questions for vocoder
Hi, @haoheliu. Thank you for your awesome work.
- After reading the vocoder part of the code, I found that there is only a pre-trained model and no training steps. Why is there no implementation of this part? Under what circumstances was the pre-trained model obtained, and how does it perform?
- The vocoder in the original TFGAN paper does not include a subband discriminator (there is also no implementation of this part here). Since I did not find any explanation of it in the paper, what benefit or impact does the subband discriminator have on the model?
An answer would help me a lot. Thank you.
Hi @LqNoob, I'm not sure if you still need the answer or not. Many apologies for the late reply. These are good questions.
- The implementation of TFGAN is part of ByteDance's confidential codebase, so I cannot open-source it. If you are interested, you can refer to this repo, which has an implementation similar to ours. To achieve speaker independence, you need at least 1000+ speakers in the training dataset.
- We use a subband discriminator to enhance the discriminative power of the GAN. We believe this helps TFGAN achieve a better vocoding result.
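To make the idea concrete, here is a minimal sketch of the subband concept: the waveform's spectrum is partitioned into frequency bands, and each band is scored separately so the discriminator can penalize artifacts that are localized in frequency. This is NOT the TFGAN implementation (which is closed-source); the band split below is a naive DFT-bin partition rather than a PQMF filterbank, and `discriminate` is a toy mean-magnitude score standing in for a learned per-band discriminator network.

```python
# Hedged sketch of the subband idea, assuming a naive DFT-bin band split
# and a toy per-band score in place of a learned discriminator.
import cmath
import math


def dft(x):
    """Naive O(N^2) discrete Fourier transform (for illustration only)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]


def split_subbands(x, num_bands=4):
    """Partition the positive-frequency spectrum into equal-width bands."""
    spec = dft(x)
    half = len(spec) // 2          # keep positive frequencies only
    width = half // num_bands
    return [spec[b * width:(b + 1) * width] for b in range(num_bands)]


def discriminate(band):
    """Toy per-band score: mean magnitude (a real one would be a CNN)."""
    return sum(abs(c) for c in band) / len(band)


def subband_scores(x, num_bands=4):
    """Score each frequency band independently, as a subband critic would."""
    return [discriminate(band) for band in split_subbands(x, num_bands)]
```

For example, a 64-sample sine at DFT bin 20 produces the highest score in band 2 (bins 16-23 of the 32 positive bins), showing how each band reacts only to energy in its own frequency range; a real subband discriminator exploits this locality to give the generator finer-grained feedback than a single full-band critic.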
Thanks