NS2VC

Unofficial implementation of NaturalSpeech2 for Voice Conversion and Text to Speech

Results 21 NS2VC issues

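I would like to ask you a few questions.
1. What is the difference between the diff-vits project and ns2 tts-v2? From a rough look now and from what I saw earlier, it seems the main model was changed to VITS while the NaturalSpeech architecture was kept?
2. I tested the tts-v2 model on a training dataset with 1,500+ speakers and 600+ hours, but for out-of-set test data most outputs still do not sound very similar to the target speaker. Is it really the case, as the paper's experiments suggest, that a much larger dataset is needed for out-of-set generalization? Roughly how many hours and how many speakers do you think are needed for good results?
3. What do you see as the difference between the ground-truth durations predicted by MFA and the durations predicted with MAS? You seem to prefer the MAS-based approach.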

How should the data be prepared? Does it need WAV files plus .lab files, or something else?

Hi, thanks for the amazing work. I put the .wav files in dataset and processed them with the command below: python preprocess.py. It creates a dataset_processed folder with contents. Then...

As the title says: the results shown in the demo are already quite good. How many speakers were in the corpus used to train it?

How do I use the command "accelerate config" to generate /home/duser/.cache/huggingface/accelerate/default_config.yaml? I have tried two configurations that could run accelerate launch train.py successfully, but it seems the model parallel training not...

dataset.py contains these checks:
assert abs(codes.size(-1) - sum(duration)) < 3, (codes.size(-1), sum(duration), filename)
assert abs(audio.shape[1]-lmin * self.hop_length) < 3 * self.hop_length
Why are the codes and durations checked?
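For readers following this question, here is a minimal sketch of what the two assertions appear to guard against: the number of codec frames, the sum of the per-phoneme durations, and the audio length in samples should all describe the same utterance length, within a small tolerance. This is an illustration only, not the repository's code. The function name check_alignment, the definition of lmin as the smaller of the two frame counts, and the hop length of 320 are assumptions made for the example.

```python
def check_alignment(num_code_frames, durations, num_audio_samples,
                    hop_length, tolerance=3):
    """Return True when codes, durations, and audio agree in length.

    Hypothetical helper: mirrors the two assertions quoted above using
    plain integers instead of tensors.
    """
    total_duration = sum(durations)

    # First assertion: codec frame count vs. summed phoneme durations.
    frames_match = abs(num_code_frames - total_duration) < tolerance

    # Assumption: lmin is the shorter of the two frame counts.
    lmin = min(num_code_frames, total_duration)

    # Second assertion: audio samples vs. frames converted to samples.
    audio_match = (abs(num_audio_samples - lmin * hop_length)
                   < tolerance * hop_length)

    return frames_match and audio_match


if __name__ == "__main__":
    hop = 320  # assumed hop length, for illustration only
    print(check_alignment(100, [30, 40, 29], 99 * hop, hop))
    print(check_alignment(100, [30, 40, 20], 90 * hop, hop))
```

If the durations drift from the codec frames by more than a couple of frames, a duration predictor trained on them would be supervised with targets that do not line up with the latent sequence, which is one plausible reason for rejecting such samples at load time.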

If yes, congratulations! Can you share your dataset size, GPU resources, and training time?

Hi, when running preprocess.py, I get this error: RuntimeError: cuDNN error: CUDNN_STATUS_INTERNAL_ERROR. Could you guide me on solving this? This is the stack trace: Loading hubert for content... load model(s)...