UniCATS-CTX-vec2wav issues

Use vec2wav for Speech to Speech Voice conversion

2

Hi @cantabile-kwok, I am curious whether you have tried this model for the zero-shot voice conversion use case. The idea is very simple: ``` Source voice speech ->...

rishikksh20

Recommended text or phoneme tokenizer to use

3

Hi @cantabile-kwok, the paper does not recommend a text or phoneme tokenizer. Do you have a recommendation for which one to use? Thank you.

francislata

About normalization of the prompt mel-spectrogram

11

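Thanks for open-sourcing this! Could you share the cmvn.ark file? 🙏🏻🙏🏻🙏🏻 At the moment I use the un-normalized mel-spectrogram directly as the prompt. The pronunciation is very clear, but the timbre is not very similar, so I would like to see the result after normalization 🙏🏻🙏🏻🙏🏻 I would also like to confirm the mel-spectrogram parameters: ``` prompt_wav, sr = librosa.load(prompt_src_wav_file, sr=16000) prompt = logmelspectrogram( x=prompt_wav.T, fs=16000, n_mels=80, n_fft=1024, n_shift=160, win_length=465, window="hann", fmin=80, fmax=7600).squeeze()[None, :, :] prompt =...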

hopingZ
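The question above concerns normalizing the prompt mel-spectrogram before it is passed to the model. As a rough illustration only, and assuming the normalization meant here is per-bin mean-variance normalization (CMVN), the following sketch shows the operation in plain NumPy. It does not read `cmvn.ark`; the helper names `compute_cmvn_stats` and `apply_cmvn` are hypothetical, and the statistics are computed from random placeholder features rather than from the repository's training data.

```python
import numpy as np


def compute_cmvn_stats(features):
    """Compute per-bin mean and std over a list of (T, n_mels) arrays.

    Hypothetical helper: stands in for whatever statistics a real
    cmvn file would provide.
    """
    stacked = np.concatenate(features, axis=0)  # (sum of T, n_mels)
    mean = stacked.mean(axis=0)
    std = stacked.std(axis=0)
    # Guard against division by zero for constant bins.
    return mean, np.maximum(std, 1e-8)


def apply_cmvn(mel, mean, std):
    """Normalize one (T, n_mels) log-mel spectrogram bin by bin."""
    return (mel - mean) / std


# Placeholder "training" features: three utterances with 80 mel bins each.
rng = np.random.default_rng(0)
train_feats = [rng.normal(loc=-5.0, scale=2.0, size=(t, 80)) for t in (120, 200, 90)]

mean, std = compute_cmvn_stats(train_feats)
prompt = apply_cmvn(train_feats[0], mean, std)[None, :, :]  # add a batch axis
print(prompt.shape)
```

For the output to match what a trained model expects, the statistics would have to be the ones used during training, which is presumably why the original poster asks for the file itself.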

Inference Speed

3

Hi @cantabile-kwok, I have also implemented UniCATS's vec2wav, but that model is too slow, so I am curious about the inference speed of this model. Actually, I am...

rishikksh20

Possible collaboration on CTXtxt2vec

3

Hi @cantabile-kwok, I’ve been chipping away at the unofficial implementation of the UniCATS paper [here](https://github.com/francislata/unicats). Since the second part is out and it sounds like you’re working on the txt2vec...

francislata

About vq_codebook

2

Hello, does vq_codebook also come from vq-wav2vec-kmeans? I noticed that the code loads a codebook of shape (2, 320, 256), but the quantizer embedding in vq-wav2vec-kmeans has shape (320, 1, 256). Are the two codebooks the same?

hopingZ
