
High frequency in the Gaussian IAF?

Open azraelkuan opened this issue 6 years ago • 6 comments

There is a lot of noise in the high frequencies. Do you have any solution?

azraelkuan avatar Nov 07 '18 14:11 azraelkuan

In my ClariNet repository, I found that using only generated means reduces noise. https://github.com/dhgrs/chainer-ClariNet/commit/576060561ba8b5a7b5e03d5c01aed4213cfbb6df

This means predicting values instead of probability distributions.
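As a toy illustration of the idea (not the code in the linked commit), the sketch below assumes that "using only generated means" amounts to emitting `mu` instead of `mu + sigma * z`. The arrays `mu` and `log_sigma`, and the helpers `emit` and `high_band_energy`, are hypothetical stand-ins for what a Gaussian IAF student would predict.

```python
import numpy as np

rng = np.random.default_rng(0)

sample_rate = 16000
n = sample_rate  # one second of audio
t = np.arange(n) / sample_rate

# Hypothetical network outputs: a clean 220 Hz tone as the predicted means
# and a constant predicted log standard deviation.
mu = 0.5 * np.sin(2 * np.pi * 220.0 * t)
log_sigma = np.full(n, -4.0)


def emit(mu, log_sigma, use_mean_only, rng):
    """Return either the means alone or a sample from N(mu, sigma^2)."""
    if use_mean_only:
        return mu.copy()
    z = rng.standard_normal(mu.shape)
    return mu + np.exp(log_sigma) * z


def high_band_energy(x, cutoff_bin):
    """Sum of squared FFT magnitudes at or above cutoff_bin."""
    power = np.abs(np.fft.rfft(x)) ** 2
    return power[cutoff_bin:].sum()


sampled = emit(mu, log_sigma, use_mean_only=False, rng=rng)
mean_only = emit(mu, log_sigma, use_mean_only=True, rng=rng)

cutoff_bin = n // 4  # 4 kHz, since each bin is 1 Hz wide here
print("high-band energy, sampled:  ", high_band_energy(sampled, cutoff_bin))
print("high-band energy, mean only:", high_band_energy(mean_only, cutoff_bin))
```

In this toy setup the sampled signal carries the white noise term `sigma * z` across the whole spectrum, while the mean-only signal does not. Whether the same holds for a trained model depends on how well the predicted means match the target waveform.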

dhgrs avatar Nov 08 '18 09:11 dhgrs

@dhgrs Although it sounds unbelievable, I will try it.

azraelkuan avatar Nov 10 '18 16:11 azraelkuan

I think one possible reason is that there's no windowing process in STFT. https://github.com/ksw0306/ClariNet/blob/b03f99a64087e6eaf7682536b04379f1fe71b38a/modules.py#L117

Without windowing, we will see unexpected high-frequency values in the spectral domain due to the discontinuity at the frame edges in the time domain.
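The effect can be sketched with NumPy alone. The example below is an illustration under assumed values (a 1024-sample frame, a 22050 Hz sample rate, a 440.3 Hz tone), not the code from the linked `modules.py`. It compares how much spectral energy lands in the upper part of the spectrum for a rectangular (no) window versus a Hann window; `high_band_fraction` is a hypothetical helper.

```python
import numpy as np


def high_band_fraction(frame, window, cutoff_bin):
    """Fraction of the frame's spectral energy at or above cutoff_bin."""
    power = np.abs(np.fft.rfft(frame * window)) ** 2
    return power[cutoff_bin:].sum() / power.sum()


n_fft = 1024
sample_rate = 22050
t = np.arange(n_fft) / sample_rate

# A tone that does not complete a whole number of cycles in the frame,
# so the frame edges are discontinuous when the frame is treated as periodic.
frame = np.sin(2 * np.pi * 440.3 * t)

rect_window = np.ones(n_fft)      # equivalent to no windowing
hann_window = np.hanning(n_fft)   # tapers the frame edges to zero

cutoff_bin = n_fft // 4  # roughly 5.5 kHz at this sample rate
leak_rect = high_band_fraction(frame, rect_window, cutoff_bin)
leak_hann = high_band_fraction(frame, hann_window, cutoff_bin)

print("high-band energy fraction, no window:  ", leak_rect)
print("high-band energy fraction, Hann window:", leak_hann)
```

In PyTorch, `torch.stft` accepts a `window` argument, and `torch.hann_window(n_fft)` is one common choice to pass there.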

r9y9 avatar Nov 14 '18 06:11 r9y9

I also tested the stft function in PyTorch, and a good spectrogram loss is very important. This is the predicted wav: (image). If we listen carefully, there is some noise in the background.

In WaveGlow and FloWaveNet, I also found some noise like https://github.com/ksw0306/FloWaveNet/issues/1#issue-378235981 but much smaller than in this picture.

azraelkuan avatar Nov 14 '18 06:11 azraelkuan

There are 3 high-frequency noises in my synthesis with the teacher model. Has anybody run into this kind of issue? The wave files are attached. Thank you. generate_428934_0.wav is the synthesized wav, and generate_428934_0_truth.wav is the recorded wav used in training. wav.zip

hxs7709 avatar Nov 29 '18 01:11 hxs7709

Hi, has anyone found a solution for the high-frequency noise?

anupam456 avatar Mar 18 '19 07:03 anupam456