UniCATS-CTX-vec2wav

[AAAI 2024] Code for CTX-vec2wav in UniCATS

Results 6 UniCATS-CTX-vec2wav issues

Hi @cantabile-kwok, I am curious whether you have tried this model for the zero-shot voice conversion use case. The idea is very simple: ``` Source voice speech ->...

Hi @cantabile-kwok, the paper does not recommend a text or phoneme tokenizer. Do you have any recommendations on what to use? Thank you.

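Thank you for open-sourcing this! Could you share the cmvn.ark file? 🙏🏻🙏🏻🙏🏻 At the moment I am using un-normalized mel-spectrograms directly as the prompt. The pronunciation is very clear, but the timbre is not very similar, so I would like to see the results after normalization. 🙏🏻🙏🏻🙏🏻 I would also like to confirm the mel-spectrogram parameters: ``` prompt_wav, sr = librosa.load(prompt_src_wav_file, sr=16000) prompt = logmelspectrogram( x=prompt_wav.T, fs=16000, n_mels=80, n_fft=1024, n_shift=160, win_length=465, window="hann", fmin=80, fmax=7600).squeeze()[None, :, :] prompt =...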
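For readers following the normalization question above, here is a minimal sketch of cepstral mean and variance normalization (CMVN) applied to an 80-bin log-mel matrix. It is not the repository's code and it does not read cmvn.ark: the statistics are accumulated from a stand-in random feature matrix, laid out in the Kaldi convention (row 0 holds per-dimension sums with the frame count in the last column, row 1 holds per-dimension sums of squares). The function names `accumulate_cmvn_stats` and `apply_cmvn` are hypothetical helpers introduced only for this illustration.

```python
import numpy as np


def accumulate_cmvn_stats(feats):
    """Accumulate Kaldi-style CMVN stats for a (frames, dims) feature matrix.

    Hypothetical helper: returns a (2, dims + 1) array where row 0 holds
    per-dimension sums plus the frame count, and row 1 holds sums of squares.
    """
    feats = feats.astype(np.float64)
    num_frames, num_dims = feats.shape
    stats = np.zeros((2, num_dims + 1), dtype=np.float64)
    stats[0, :num_dims] = feats.sum(axis=0)
    stats[0, num_dims] = num_frames
    stats[1, :num_dims] = (feats ** 2).sum(axis=0)
    return stats


def apply_cmvn(feats, stats, eps=1e-20):
    """Normalize each feature dimension to zero mean and unit variance."""
    count = stats[0, -1]
    mean = stats[0, :-1] / count
    var = stats[1, :-1] / count - mean ** 2
    std = np.sqrt(np.maximum(var, eps))  # guard against zero variance
    return (feats - mean) / std


# Stand-in for a real log-mel prompt: 200 frames, 80 mel bins (assumed values).
rng = np.random.default_rng(0)
mel = rng.normal(loc=-4.0, scale=2.0, size=(200, 80))

stats = accumulate_cmvn_stats(mel)
normalized = apply_cmvn(mel, stats)
```

In practice the statistics would come from the training corpus rather than from the prompt itself, so per-utterance normalization like this is only a rough substitute and may not match what the model saw during training.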

Hi @cantabile-kwok, I have also implemented UniCATS's vec2wav, but that model is too slow, so I am curious about the inference speed of this model. Actually, I am...

Hi @cantabile-kwok, I’ve been chipping away at an unofficial implementation of the UniCATS paper [here](https://github.com/francislata/unicats). Since the second part is out and it sounds like you’re working on the txt2vec...

Hello, does vq_codebook also come from vq-wav2vec-kmeans? I noticed that the code loads a codebook of shape (2, 320, 256), but the quantizer embedding in vq-wav2vec-kmeans has shape (320, 1, 256). Are the two codebooks the same?
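To make the two shapes in the question above concrete, here is a small NumPy sketch of how each layout could be indexed. It uses random stand-in values and does not answer whether the repository's codebook matches the vq-wav2vec-kmeans one. One possible reading, stated here as an assumption, is that (2, 320, 256) stores a separate table of 320 entries per group, while (320, 1, 256) stores a single table with a group axis of size 1 that is shared by every group. The `lookup` function is a hypothetical helper.

```python
import numpy as np

rng = np.random.default_rng(0)
num_groups, num_entries, dim = 2, 320, 256

# Layout A (assumed reading): one table per group.
per_group = rng.normal(size=(num_groups, num_entries, dim))   # (2, 320, 256)

# Layout B (assumed reading): a single table with a size-1 group axis.
shared = rng.normal(size=(num_entries, 1, dim))               # (320, 1, 256)

# Bring layout B into layout A's axis order by repeating it for each group.
shared_as_per_group = np.broadcast_to(
    shared.transpose(1, 0, 2), (num_groups, num_entries, dim)
)


def lookup(codebook, indices):
    """Map (frames, groups) integer indices to (frames, groups * dim) vectors.

    Hypothetical helper: `codebook` has shape (groups, entries, dim).
    """
    groups = np.arange(codebook.shape[0])
    vectors = codebook[groups, indices]        # (frames, groups, dim)
    return vectors.reshape(indices.shape[0], -1)


# Five frames, each with one index per group.
indices = rng.integers(0, num_entries, size=(5, num_groups))
features_a = lookup(per_group, indices)
features_b = lookup(shared_as_per_group, indices)
```

Under these assumptions both layouts yield 512-dimensional frame vectors, but they hold the same values only if the per-group tables happen to be identical copies of the shared one.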