Serhiy Stetskovych
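```
import torch
from vocos import Vocos


def load_vocos(checkpoint_path, config_path, device):
    model = Vocos.from_hparams(config_path).to(device)
    raw_model = torch.load(checkpoint_path, map_location=device)
    # Training checkpoints nest the weights under 'state_dict'; plain ones do not.
    raw_model = raw_model if 'state_dict' not in raw_model else raw_model['state_dict']
    # strict=False skips keys that do not match the model built from the config.
    model.load_state_dict(raw_model, strict=False)
    model.eval()
    return model
```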
Do you have standard TensorBoard logs? It would be interesting to compare.
What is your validation loss on the last checkpoint? It is encoded in the checkpoint file name. I have been training at 44100 Hz for almost a week already and loss...
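For anyone who wants to compare runs by that number, here is a minimal sketch of reading the validation loss back out of the file names. It assumes the names contain a `val_loss=<number>` fragment (for example `..._val_loss=3.1234.ckpt`); that pattern and the helper names `val_loss_from_checkpoint` and `best_checkpoint` are illustrative assumptions, so adjust the regex to whatever your checkpoint callback actually writes.

```python
import re
from pathlib import Path
from typing import Iterable, Optional

# Assumption: the file name contains "val_loss=<number>", e.g. "..._val_loss=3.1234.ckpt".
_VAL_LOSS_RE = re.compile(r"val_loss=(\d+(?:\.\d+)?)")


def val_loss_from_checkpoint(path: str) -> Optional[float]:
    """Return the validation loss encoded in a checkpoint file name, or None."""
    match = _VAL_LOSS_RE.search(Path(path).name)
    return float(match.group(1)) if match else None


def best_checkpoint(paths: Iterable[str]) -> Optional[str]:
    """Return the path with the lowest encoded validation loss, or None."""
    scored = [(val_loss_from_checkpoint(p), p) for p in paths]
    scored = [(loss, p) for loss, p in scored if loss is not None]
    return min(scored)[1] if scored else None
```

Files without the fragment (such as a `last.ckpt`) are simply ignored when picking the best one.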
@LEECHOONGHO I have published my model here: https://huggingface.co/patriotyk/vocos-mel-hifigan-compat-44100khz It sounds great, and metrics are included. @Mahmoud-ghareeb My model has been trained on 800+ hours of audio. A vocoder doesn't require text transcripts...
This model generates audio from mel spectrograms. The functionality that you tried just generates a mel from audio and then audio back from that mel. But real TTS systems generate mels directly...
It is because the model has changed since it was trained. You need to run the pretrained model on an older commit. Not sure, but I think this one should work: b2f7d130470bce6a85ea1f4e2cb454cdc8ae9f55
No, there aren't any pretrained checkpoints with the new changes.
@christophschuhmann Could you help with GPU?
Hm, thank you, that is quite a lot. Do you plan to train SCNet on it and share the weights?