Rishikesh (ऋषिकेश)

Results 160 comments of Rishikesh (ऋषिकेश)

@yoyololicon Do you have any voice samples generated from this repo?

Thanks @cantabile-kwok, I am already following that repo. I will check end-to-end training.

@cantabile-kwok https://arxiv.org/abs/2312.08676 The SEF-VC architecture is the same as CTX-vec2wav.

@francislata Have you used the Coqui TTS tokenizer for your Unicats? Is it working well?

@lpscr Are you able to get the model to converge?

@adelacvg I saw that you updated the model architecture on `v4`. Is the implementation complete, and does the new model converge faster? I have collected a lot of audio data and am now waiting for GPU availability...

Thanks :) Are you using HuBERT only for the `context vector`? My use case is a non-English language, so I was thinking of using Whisper layer 24 features rather than HuBERT.

Hi @adelacvg, is it possible to also transfer a bit of prosody and style with the NS2VC architecture, not just the voice? For simple voice conversion it works well, although the voice does not match exactly...

Just one more question: do semantic tokens from models like HuBERT, wav2vec, and ContentVec carry prosody information?

Yes, I have the same intuition because pronunciation is an integral part of linguistics.