ParallelWaveGAN
Generated Samples Have an Audible Pop at End
I'm curious if anyone else has experienced this problem.
When training MB-MelGAN, I set batch_max_steps: 8192 as the length of the data. After training, the audio quality is quite good, but there is an audible "pop" at the end of each 8192-sample audio clip. Plots are below:
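In case it helps anyone check their own outputs, here is a rough sketch of how the pop could be measured numerically rather than by ear. It assumes the generated audio is available as a 1-D NumPy array. The function name tail_peak_ratio, the 256-sample tail window, and the 24 kHz sampling rate are my own choices for illustration, not anything from the repo, and the signal here is synthetic rather than real model output.

```python
import numpy as np


def tail_peak_ratio(audio, tail_len=256, eps=1e-8):
    """Ratio of the peak amplitude in the last `tail_len` samples to the peak in the rest."""
    audio = np.asarray(audio, dtype=np.float64)
    if audio.ndim != 1 or audio.size <= tail_len:
        raise ValueError("expected a 1-D signal longer than tail_len")
    body_peak = np.max(np.abs(audio[:-tail_len]))
    tail_peak = np.max(np.abs(audio[-tail_len:]))
    # eps avoids division by zero when the body is pure silence.
    return tail_peak / (body_peak + eps)


# Synthetic stand-in for a generated clip: 8192 samples of a quiet 440 Hz tone.
sr = 24000  # assumed sampling rate, for illustration only
t = np.arange(8192) / sr
clean = 0.1 * np.sin(2 * np.pi * 440.0 * t)

# Same clip with an artificial spike on the final sample to mimic the pop.
popped = clean.copy()
popped[-1] = 0.9

print("clean ratio :", tail_peak_ratio(clean))
print("popped ratio:", tail_peak_ratio(popped))
```

A ratio close to 1 means the tail is no louder than the rest of the clip, while a ratio well above 1 suggests a spike near the end. This only flags the artifact; it says nothing about what causes it.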
Does anyone have ideas on how to stabilize or avoid this problem? Thanks!
@kan-bayashi Sorry to bother you, but would you have any thoughts on this?