Soshyant

Results 40 comments of Soshyant

What config are you using, though? Please show me your mel preprocessing steps.

Call torch.manual_seed(some integer) right before running the inference function.
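A minimal sketch of that idea, assuming PyTorch is installed. `run_inference` here is a hypothetical stand-in for whatever stochastic inference function the model exposes, and `SEED` is an arbitrary integer; neither comes from the project itself.

```python
import torch


def run_inference(n_samples: int = 4) -> torch.Tensor:
    # Hypothetical placeholder: a real model would sample noise internally
    # (e.g. for a diffusion or flow-based decoder). torch.randn stands in here.
    return torch.randn(n_samples)


SEED = 1234  # any fixed integer works

# Seed right before the call so the random draws start from a known state.
torch.manual_seed(SEED)
first = run_inference()

# Re-seeding with the same value reproduces the same draws.
torch.manual_seed(SEED)
second = run_inference()

print(torch.equal(first, second))
```

With the same seed set before each call, the two outputs should match on CPU. On GPU, some operations can still be nondeterministic, so seeding alone may not be enough there.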

Any plans for training code?

Yeah, a few individuals, yours truly included, have done it.

Japanese, on a 21-hour single-speaker dataset.

Hey. Not that I know of. This model is made of many modules, so I doubt it'll be easy to do that.

It goes beyond these models. I believe it's an HF thing and not a LitGPT one.

Codecs and, to a great extent, vocoders are usually language-agnostic, so you should be fine either way. Alternatively, the Nvidia audio codec which was released a while ago, especially the...

> From the spectrogram on my end, extending the speech from 24kHz to 48kHz appears to be working fine.
>
> I’m not entirely sure if, by “the outputs are...

> @yxlu-0102 to extend bandwidth properly from 24kHz to 48kHz we have to run `python inference_16k.py` or `python inference_48k.py` ?

If the target sampling rate is 48 kHz, then inference_48 must...