Amphion
Unable to run training script of Natural Speech 2
Hi,
I ran into multiple issues trying to run the training script:
In ns2_dataset.py:
- self.utt2phone[utt] = utt_info["phones"]: where does phones come from? I suspect we need to run the phonemizer first, but I don't see extract_phone=True in the config file.
- utt_info["num_frames"] is utt_info["Duration"], right?
In exp_config_base.json:
- use_code=true, use_pitch=true, use_phone: should extract_acoustic_token=true, extract_pitch=true, extract_phone=true be set as well?
- There seems to be some mismatch between tts/preprocessing.py and the config file. For example, should code_dir be acoustic_token_dir?
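A minimal sketch of the consistency check this question implies, assuming each use_* flag should have a matching extract_* flag. The pairing below is only the one guessed at in this issue, not something confirmed by the Amphion code, and find_flag_mismatches and example_cfg are hypothetical names for illustration:

```python
# Hypothetical pairing of use_* flags with the extract_* flags named in this issue.
# Whether Amphion actually requires these pairs is an assumption.
USE_TO_EXTRACT = {
    "use_code": "extract_acoustic_token",
    "use_pitch": "extract_pitch",
    "use_phone": "extract_phone",
}


def find_flag_mismatches(preprocess_cfg):
    """Return (use_flag, extract_flag) pairs where use_* is on but extract_* is not."""
    mismatches = []
    for use_flag, extract_flag in USE_TO_EXTRACT.items():
        use_on = preprocess_cfg.get(use_flag, False)
        extract_on = preprocess_cfg.get(extract_flag, False)
        if use_on and not extract_on:
            mismatches.append((use_flag, extract_flag))
    return mismatches


# Toy config fragment for illustration only, not the real exp_config_base.json.
example_cfg = {
    "use_code": True,
    "use_pitch": True,
    "use_phone": True,
    "extract_pitch": True,
}

mismatches = find_flag_mismatches(example_cfg)
for use_flag, extract_flag in mismatches:
    print(f"{use_flag} is enabled but {extract_flag} is not")
```

In a real run, the dictionary would come from the relevant section of the loaded JSON config; where those keys live in exp_config_base.json is not shown here and would need to be checked against the file.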
Data processing for NS2 differs somewhat from that of the other TTS models. We will update the data processing section as soon as possible.
Hi @HeCheng0625 ,
I hope this message finds you well. I understand that these things take time and effort, and I appreciate the work you're putting into it.
If possible, could you please provide an estimated timeline for when we might expect the update?
Hi, we will release a new checkpoint and a data processing pipeline for a large dataset (> 10,000 hours) in about two weeks. For now, the model is trained only on LibriTTS. In the meantime, you can use our model pretrained on LibriTTS: https://huggingface.co/amphion/naturalspeech2_libritts Or try the toy demo: https://huggingface.co/spaces/amphion/NaturalSpeech2
Thanks @HeCheng0625.
Hi @HeCheng0625, I just wanted to check whether there have been any updates on the data processing pipeline.
Any updates on the data preprocessing pipeline?
Hello @dongngm, I encountered the same issue and have the same confusion. Have you found a solution to this problem? Any advice would be appreciated!
@HeCheng0625 @RMSnow Do you have any updates on the preprocessing pipeline for NaturalSpeech2?