iSTFTNet-pytorch

iSTFTNet: Fast and Lightweight Mel-spectrogram Vocoder Incorporating Inverse Short-time Fourier Transform

Results 13 iSTFTNet-pytorch issues

Thanks very much! ![iSTFT](https://user-images.githubusercontent.com/16432329/177454762-7f7af6e1-2c5c-4e16-ab8e-4d82806b52e8.png) https://user-images.githubusercontent.com/16432329/177454772-43428023-5ed3-4c9e-b8af-fb68f1a11ff3.mp4

Thanks for the implementation of ISTFT. It has better inference speed than HiFi-GAN v1. However, I found a single frequency line that causes a little noise. I use 16 kHz...

Hi, thanks for the implementation; the inference speed is impressive. How is the audio quality? And have you tried the v2 config? Thanks in advance.

Following issue https://github.com/rishikksh20/iSTFTNet-pytorch/issues/1, line 164 in `stft.py` was changed to https://github.com/rishikksh20/iSTFTNet-pytorch/blob/e928a6b604033a3857757562af36241f9225adfc/stft.py#L164 But `inverse_transform.device()` raises the exception mentioned in the title, so it can be changed to `inverse_transform.device`...
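The failure described here is the general Python one of calling an attribute value that is not callable. In PyTorch, `device` on a tensor is an attribute rather than a method, so adding parentheses tries to call the returned device object. The sketch below reproduces that pattern without PyTorch; `FakeTensor` and `FakeDevice` are hypothetical stand-ins written for this illustration, not classes from the repository or from PyTorch.

```python
class FakeDevice:
    """Hypothetical stand-in for a device descriptor; it defines no __call__."""

    def __init__(self, name: str) -> None:
        self.name = name


class FakeTensor:
    """Hypothetical stand-in for a tensor that exposes `device` as a property."""

    def __init__(self, device_name: str) -> None:
        self._device = FakeDevice(device_name)

    @property
    def device(self) -> FakeDevice:
        # Attribute access already returns the device object.
        return self._device


inverse_transform = FakeTensor("cpu")

try:
    # Wrong: the parentheses try to call the returned device object.
    inverse_transform.device()
    error_name = None
except TypeError as exc:
    error_name = type(exc).__name__

# Right: plain attribute access, no call.
device_name = inverse_transform.device.name
```

Under these assumptions, the call form fails with a `TypeError`, while plain attribute access works.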

![image](https://github.com/rishikksh20/iSTFTNet-pytorch/assets/28752526/ffd4be33-ae52-4d21-83f9-6de2a68365f3) https://github.com/rishikksh20/iSTFTNet-pytorch/blob/ecbf0f635b36432bd3e432790326591bc86cadbc/models.py#L97 https://github.com/rishikksh20/iSTFTNet-pytorch/blob/ecbf0f635b36432bd3e432790326591bc86cadbc/config_v1.json#L16 Why is fs 16?

The phase output of the generator currently can only range from -1 to 1, which is not enough as full phase in radians is expected later in `stft.inverse()` (either 0..2*pi...
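To make the range problem concrete, here is a minimal sketch using only the standard library. It assumes a raw phase output confined to [-1, 1] and shows one possible remedy, multiplying by pi, which is an assumption of this example and not necessarily the fix adopted in the repository. The helpers `scale_phase` and `to_complex` are hypothetical names.

```python
import cmath
import math


def scale_phase(raw: float) -> float:
    """Map a raw output in [-1, 1] to radians in [-pi, pi] (assumed remedy)."""
    return math.pi * raw


def to_complex(magnitude: float, phase: float) -> complex:
    """Combine a magnitude and a phase in radians into one complex STFT bin."""
    return cmath.rect(magnitude, phase)


raw_min, raw_max = -1.0, 1.0

# Fraction of the full circle reachable when raw values are used as radians.
coverage_unscaled = (raw_max - raw_min) / (2 * math.pi)

# Span in radians after scaling: the full 2*pi.
scaled_span = scale_phase(raw_max) - scale_phase(raw_min)

# With unscaled phase the real part is at least cos(1) > 0, so a bin can
# never point along the negative real axis. After scaling, it can.
unscaled_extreme = to_complex(1.0, raw_max)
scaled_extreme = to_complex(1.0, scale_phase(raw_max))
```

Treating the raw value directly as radians covers only 1/pi (roughly a third) of the circle, which is why the unscaled output cannot represent arbitrary phase.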

Has anyone tried to directly model the complex numbers instead of the phase and magnitude? What would be the problem if we model the real and imaginary parts directly?
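For reference, the two parameterisations describe the same complex value, so converting between them is lossless up to phase wrapping. The sketch below shows the round trip for a single bin. The function names are hypothetical, and it says nothing about which form is easier for a network to learn.

```python
import math


def polar_to_rect(magnitude: float, phase: float) -> tuple[float, float]:
    """Convert (magnitude, phase in radians) to (real, imaginary)."""
    return magnitude * math.cos(phase), magnitude * math.sin(phase)


def rect_to_polar(real: float, imag: float) -> tuple[float, float]:
    """Convert (real, imaginary) to (magnitude, phase in (-pi, pi])."""
    return math.hypot(real, imag), math.atan2(imag, real)


# Round trip for one bin with magnitude 2 and phase pi/3.
real, imag = polar_to_rect(2.0, math.pi / 3)
magnitude, phase = rect_to_polar(real, imag)
```

Note that `atan2` returns a wrapped phase, so a phase outside (-pi, pi] would come back as its wrapped equivalent.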