Rishikesh (ऋषिकेश) comments

Results: 162 comments of Rishikesh (ऋषिकेश)

Pretrained checkpoint

I used LJSpeech: I added noise to LJSpeech and then tested it on this repo, but I didn't get satisfactory results.
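For anyone trying to reproduce this kind of test, here is a minimal sketch of mixing white Gaussian noise into a clean signal at a chosen signal-to-noise ratio. The comment does not say which noise type or SNR was used, so the function name `add_noise_at_snr`, the 10 dB target, and the synthetic tone standing in for an LJSpeech utterance are all assumptions for illustration.

```python
import math
import random


def add_noise_at_snr(signal, snr_db, seed=0):
    """Return signal plus white Gaussian noise scaled to the requested SNR in dB."""
    rng = random.Random(seed)
    # Mean power of the clean signal.
    signal_power = sum(x * x for x in signal) / len(signal)
    # Target noise power from SNR_dB = 10 * log10(P_signal / P_noise).
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = [rng.gauss(0.0, 1.0) for _ in signal]
    # Rescale the noise so its measured power matches the target.
    measured_power = sum(n * n for n in noise) / len(noise)
    scale = math.sqrt(noise_power / measured_power)
    return [x + scale * n for x, n in zip(signal, noise)]


# Hypothetical stand-in for one utterance: 1 s of a 440 Hz tone at 22.05 kHz
# (the LJSpeech sample rate). Real use would load samples from a wav file.
sample_rate = 22050
clean = [0.5 * math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(sample_rate)]
noisy = add_noise_at_snr(clean, snr_db=10.0)
```

Lower `snr_db` values give a noisier test input, so sweeping it is one way to see where the output quality starts to degrade.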

postnet parameters

@capavrulus I am strictly following the paper's details here, although you can change the dropout here, as it's not mentioned explicitly in the paper.

[in progress] fp16 memory optimizations

@nwatx Hi, does fp16 give good output and a speedup compared to normal precision?

VoiceCraft with Parler-TTS's 10K hours speech data

@jasonppy Have you tried using Vocos for the decoding task rather than the Encodec decoder? First of all, it upsamples the samples to 24 kHz and leads to clear, crisp and...

VoiceCraft with Parler-TTS's 10K hours speech data

Another suggestion that might improve audio quality is to replace Encodec fully with DAC, similar to Parler-TTS (https://github.com/huggingface/parler-tts/blob/main/parler_tts/dac_wrapper/configuration_dac.py). It yields 44.1 kHz audio at 8 kbps bandwidth.
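To put the 44.1 kHz / 8 kbps figure in perspective, the short calculation below compares it with uncompressed PCM. The 16-bit mono reference format is an assumption chosen for illustration, and `compression_ratio` is a hypothetical helper, not part of DAC or Parler-TTS.

```python
def compression_ratio(sample_rate_hz, bits_per_sample, channels, codec_kbps):
    """Ratio of the uncompressed PCM bitrate to the codec bitrate."""
    pcm_kbps = sample_rate_hz * bits_per_sample * channels / 1000
    return pcm_kbps / codec_kbps


# Assumed reference: 16-bit mono PCM at 44.1 kHz is 705.6 kbps.
ratio = compression_ratio(44100, 16, 1, 8.0)
print(round(ratio, 1))  # 88.2
```

Under that assumption, the codec tokens carry roughly 1/88 of the raw bitrate.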

Use this model for Voice conversion

I checked the v4 branch and it looks good to me. Have you trained the model? If yes, how is the quality?

Use this model for Voice conversion

I also checked your v3 branch, and the samples sound good. Have you trained that model on any English dataset?

Use this model for Voice conversion

I would like to train v3 and v4 on a large English dataset. Could you guide me a little? Can HuBERT or XLS-R be used to extract the semantic vector, or contentvec is...

Use this model for Voice conversion

OK, then I will try to train v4 only, but is that repo completely implemented, or does something remain? If it's complete, have you run any kind of training on it...

Use this model for Voice conversion

What was the dataset size, and on how many GPUs did you train the model? Actually, I am planning to train this model on the Multilingual LibriSpeech dataset, which has 50k hours of data. But...