There is a problem with fine-tuning HiFiGAN

Open guo453585719 opened this issue 2 years ago • 6 comments

Hi, thank you for your contribution. I have a problem fine-tuning HiFiGAN from the pretrained model you provided: hifigan_train_loop.py fails on line 71 because the key "generator_optimizer" is not found in check_dict. I checked your v2 release, and the HiFiGAN checkpoint there only provides check_dict["generator"]. Entries such as "generator_optimizer" and "discriminator_optimizer" are not available. How can I solve this problem?

Here is the code in hifigan_train_loop.py that raises the error:

    if path_to_checkpoint is not None:
        check_dict = torch.load(path_to_checkpoint, map_location=device)
        optimizer_g.load_state_dict(check_dict["generator_optimizer"])
        optimizer_d.load_state_dict(check_dict["discriminator_optimizer"])
        scheduler_g.load_state_dict(check_dict["generator_scheduler"])
        scheduler_d.load_state_dict(check_dict["discriminator_scheduler"])
        g.load_state_dict(check_dict["generator"])
        d.load_state_dict(check_dict["discriminator"])
        step_counter = check_dict["step_counter"]
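One possible workaround, as a minimal sketch and not the fix that was later made in the repository: restore only the entries the checkpoint actually contains and report the rest. The helper name `restore_available` is hypothetical. It assumes only that each target object exposes `load_state_dict()`, as the PyTorch modules, optimizers and schedulers in the snippet above do.

```python
def restore_available(check_dict, targets):
    """Load every state dict that is present in check_dict.

    check_dict: dict as returned by torch.load(path_to_checkpoint, map_location=device)
    targets:    maps checkpoint keys to objects exposing load_state_dict()
    Returns the list of keys that were absent from the checkpoint.
    """
    missing = []
    for key, target in targets.items():
        if key in check_dict:
            target.load_state_dict(check_dict[key])
        else:
            # Leave this object at its fresh initialisation instead of raising KeyError.
            missing.append(key)
    return missing


# Hypothetical use inside the training loop (variable names from the snippet above):
# check_dict = torch.load(path_to_checkpoint, map_location=device)
# missing = restore_available(check_dict, {
#     "generator": g,
#     "discriminator": d,
#     "generator_optimizer": optimizer_g,
#     "discriminator_optimizer": optimizer_d,
#     "generator_scheduler": scheduler_g,
#     "discriminator_scheduler": scheduler_d,
# })
# step_counter = check_dict.get("step_counter", 0)
```

With a checkpoint that contains only "generator", this approach would leave the discriminator, both optimizers and both schedulers at their initial state, so training would pair a pretrained generator with an untrained discriminator.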

guo453585719 avatar Oct 13 '22 03:10 guo453585719

HiFiGAN model for the v2.2 release: https://github.com/DigitalPhonetics/IMS-Toucan/releases/expanded_assets/v2.2

guo453585719 avatar Oct 13 '22 03:10 guo453585719

Hi! I just had a look and yes, it seems that a mistake involving an accidental merge that was later reverted broke finetuning for the vocoder. I will add the functionality back in.

Generally, finetuning the vocoder is not recommended, because it is a generative adversarial network, and GANs tend to collapse when not trained from scratch. HiFiGAN works about as well for unseen speakers as for seen speakers, so I don't think it is necessary either. But yes, if you want to try, you should have the option. Next week there will be a new release of the toolkit that comes with a (hopefully) improved vocoder, so it may be best for you to wait until that is done.

Flux9665 avatar Oct 18 '22 23:10 Flux9665

HiFiGAN finetuning should now work again

Flux9665 avatar Oct 18 '22 23:10 Flux9665

Thank you very much for your advice

guo453585719 avatar Oct 19 '22 02:10 guo453585719