Unable to run training script of Natural Speech 2

Open dongngm opened this issue 1 year ago • 8 comments

Hi,

I ran into multiple issues trying to run the training script.

In ns2_dataset.py:

  • self.utt2phone[utt] = utt_info["phones"]: where does phones come from? I suspect we need to run the phonemizer first, but I don't see extract_phone=True in the config file.
  • utt_info["num_frames"] is the same as utt_info["Duration"], right?
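To see which utterances are affected before training starts, a minimal sketch like the one below may help. It assumes the metadata is a list of per-utterance dicts; the key names "phones" and "num_frames" are the ones ns2_dataset.py reads, while the helper name and the sample entries are hypothetical.

```python
# Hypothetical helper: "phones" and "num_frames" are the keys read in
# ns2_dataset.py as quoted above; the function and sample data are made up.
REQUIRED_KEYS = ("phones", "num_frames")


def find_missing_keys(metadata, required=REQUIRED_KEYS):
    """Return {index: [missing keys]} for entries lacking a required field."""
    missing = {}
    for i, utt_info in enumerate(metadata):
        absent = [key for key in required if key not in utt_info]
        if absent:
            missing[i] = absent
    return missing


if __name__ == "__main__":
    # Made-up entries for illustration only.
    sample = [
        {"Duration": 3.2},
        {"Duration": 1.5, "phones": "HH AH0 L OW1", "num_frames": 120},
    ]
    print(find_missing_keys(sample))
```

An empty result would mean every entry already carries both fields; anything else points to a preprocessing step that has not been run.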

In exp_config_base.json:

  • use_code=true and use_pitch=true are set, along with use_phone; should extract_acoustic_token=true, extract_pitch=true, and extract_phone=true be set as well?
  • There seems to be some mismatch between tts/preprocessing.py and the config file. For example, should code_dir be acoustic_token_dir?
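The flag question can be checked mechanically. The sketch below assumes the config is loaded into a flat dict and that each use_* flag pairs with the extract_* flag named next to it; that pairing is only a guess from the flag names and is not confirmed by the maintainers.

```python
# Assumed pairing of use_* and extract_* flags. The flag names come from
# exp_config_base.json as quoted above; the mapping itself is a guess.
ASSUMED_PAIRS = {
    "use_code": "extract_acoustic_token",
    "use_pitch": "extract_pitch",
    "use_phone": "extract_phone",
}


def find_flag_mismatches(cfg, pairs=ASSUMED_PAIRS):
    """Return (use_flag, extract_flag) pairs where use_* is on but extract_* is not."""
    return [
        (use_flag, extract_flag)
        for use_flag, extract_flag in pairs.items()
        if cfg.get(use_flag) and not cfg.get(extract_flag)
    ]


if __name__ == "__main__":
    # Made-up flat config for illustration; the real file may nest these keys.
    cfg = {"use_code": True, "use_pitch": True, "extract_pitch": True}
    print(find_flag_mismatches(cfg))
```

If the real config nests these keys under a section, the dict passed in would need to be that section.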

dongngm avatar Dec 19 '23 10:12 dongngm

Data processing for NS2 differs somewhat from that of the other TTS models. We will update the data processing section as soon as possible.

HeCheng0625 avatar Dec 19 '23 11:12 HeCheng0625

Hi @HeCheng0625,

I hope this message finds you well. I understand that these things take time and effort, and I appreciate the work you're putting into it.

If possible, could you please provide an estimated timeline for when we might expect the update?

vn09 avatar Dec 22 '23 07:12 vn09

Hi, we will release a new checkpoint and data processing pipeline for a large dataset (> 10,000 hours) in about two weeks. For now, we only use LibriTTS to train the model. In the meantime, you can use our model pretrained on LibriTTS: https://huggingface.co/amphion/naturalspeech2_libritts or try the toy demo: https://huggingface.co/spaces/amphion/NaturalSpeech2

HeCheng0625 avatar Dec 22 '23 08:12 HeCheng0625

Thanks @HeCheng0625.

vn09 avatar Dec 22 '23 10:12 vn09

Hi @HeCheng0625, I just wanted to ask whether there have been any updates on the data processing pipeline.

vn09 avatar Jan 06 '24 05:01 vn09

Any updates on the data preprocessing pipeline?

shreeshailgan avatar Apr 01 '24 10:04 shreeshailgan

Hello @dongngm, I encountered the same issue and have the same confusion. Have you found a solution to this problem? Any advice would be appreciated!

CreepJoye avatar May 20 '24 08:05 CreepJoye

@HeCheng0625 @RMSnow Do you have any updates on the preprocessing pipeline for NaturalSpeech2?

chazo1994 avatar Jun 11 '24 16:06 chazo1994