Soshyant
What config are you using, though? Please show me your mel preprocessing steps.
Call `torch.manual_seed(<some integer>)` right before running the inference function.
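A minimal sketch of that idea, assuming a stochastic inference call. `fake_inference` and `seeded_inference` are hypothetical stand-ins for illustration, not functions from any particular repo; only `torch.manual_seed`, `torch.randn`, and `torch.equal` are real PyTorch calls.

```python
import torch


def fake_inference():
    # Hypothetical stand-in for a model's inference function that samples noise.
    return torch.randn(4)


def seeded_inference(infer_fn, seed=1234):
    # Reseed PyTorch's global RNG immediately before the call, so the
    # random draws inside infer_fn start from the same state each time.
    torch.manual_seed(seed)
    return infer_fn()


first = seeded_inference(fake_inference, seed=1234)
second = seeded_inference(fake_inference, seed=1234)

# Same seed, same sequence of random draws, so the outputs match.
print(torch.equal(first, second))
```

The seed has to be set again before every run you want to reproduce, since each call to the inference function advances the RNG state. On GPU, some operations can still be nondeterministic even with a fixed seed.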
Any plans to release the training code?
yeah, a few individuals such as yours truly have done it.
Japanese, on a 21-hour single-speaker dataset.
Hey. Not that I know of. This model is made of many modules, so I doubt it'll be easy to do that.
It goes beyond these models; I believe it's an HF thing, not LitGPT.
Codecs and, to a great extent, vocoders are usually language-agnostic, so you should be fine either way. Alternatively, there's the NVIDIA audio codec, which was released a while ago, especially the...
> From the spectrogram on my end, extending the speech from 24kHz to 48kHz appears to be working fine.
>
> I’m not entirely sure if, by “the outputs are...
> @yxlu-0102 to extend bandwidth properly from 24kHz to 48kHz we have to run `python inference_16k.py` or `python inference_48k.py` ?

If the target sampling rate is 48 kHz, then inference_48 must...