IMS-Toucan
Where can I intercept the mel spectrogram to save it as .npy?
Hi,
Out of curiosity, I want to test BigVGAN. Their page says that it accepts .npy files
as input. I browsed the code but could not find where the mel spectrogram is generated.
Could you please point me to the line where it is produced, so that I can save it and run BigVGAN on it manually?
Thanks in advance for your help
The next release will include BigVGAN; it's already in one of the experimental branches. It works extremely well, especially when paired with the discriminators that Avocodo adds. Unfortunately, it is also very slow.
Here are the spectrograms: https://github.com/DigitalPhonetics/IMS-Toucan/blob/e41e266ccacf282a9854d562f9e3d604f1cf245b/InferenceInterfaces/PortaSpeechInterface.py#L185
I'm not sure their spectrogram settings are the same as ours, though, so their model may not work out of the box with outputs from this TTS.
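For saving the spectrogram once you have it at the linked line, something like the following might work. It is only a sketch: `save_mel_as_npy` is a hypothetical helper, not part of IMS-Toucan, and it assumes the spectrogram is a 2-D PyTorch tensor or NumPy array. Which axis order the other vocoder expects (frames first or mel bins first) is not something this thread establishes, so the `transpose` flag is there to flip it if needed.

```python
import numpy as np


def save_mel_as_npy(mel, path, transpose=False):
    """Write a 2-D mel spectrogram to `path` in NumPy's .npy format.

    Hypothetical helper for illustration; returns the shape that was saved.
    """
    # PyTorch tensors expose detach()/cpu()/numpy(); anything else is
    # handed straight to NumPy.
    if hasattr(mel, "detach"):
        mel = mel.detach().cpu().numpy()
    mel = np.asarray(mel, dtype=np.float32)
    if mel.ndim != 2:
        raise ValueError(f"expected a 2-D spectrogram, got shape {mel.shape}")
    # Assumption: the consumer may want the two axes swapped.
    if transpose:
        mel = mel.T
    np.save(path, mel)
    return mel.shape
```

Calling `save_mel_as_npy(mel, "utterance.npy")` right after the spectrogram is computed would write the file; `np.load("utterance.npy").shape` is a quick way to check the layout before passing it on. Even with a matching layout, differences in sample rate, number of mel bins, hop size, or normalization could still make the result sound wrong.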
Thank you, will give this a try!