Rishikesh (ऋषिकेश) comments

Results: 160 comments of Rishikesh (ऋषिकेश)


hop_length questions

@yoyololicon Do you have any voice samples generated from this repo?

Inference Speed

Thanks @cantabile-kwok, I am already following that repo. I will check end-to-end training.

Use vec2wav for Speech to Speech Voice conversion

@cantabile-kwok The SEF-VC architecture (https://arxiv.org/abs/2312.08676) is the same as CTX-vec2wav.

Recommended text or phoneme tokenizer to use

@francislata Have you used the Coqui TTS tokenizer for your Unicats? Is it working well?

branch in V4 version train it's working ?

@lpscr Are you able to get the model to converge?

branch in V4 version train it's working ?

@adelacvg I saw that you updated the model architecture on `v4`. Is the implementation complete, and does the new model converge faster? I have collected a lot of audio data and am now waiting for GPU availability...

branch in V4 version train it's working ?

Thanks :) Are you using HuBERT only for the `context vector`? My use case is a non-English language, so I was thinking of using Whisper layer 24 features rather than HuBERT.

branch in V4 version train it's working ?

Hi @adelacvg, is it possible to also transfer a bit of prosody and style with the NS2VC architecture, not just the voice? For simple voice conversion it works well, although the voice does not match exactly...

branch in V4 version train it's working ?

I just need to ask one more question: do semantic tokens like HuBERT, wav2vec, and ContentVec carry prosody information?

branch in V4 version train it's working ?

Yes, I have the same intuition because pronunciation is an integral part of linguistics.