yiwei0730
You need to download the dataset from the internet yourself, then point `data_dir` to your own data directory.
Yes, the dataset is LJSpeech-1.1. Just run `nvidia_preprocessing` with `data_dir` set to your data directory and it will work.
The `self.utterance_encoder(ys.transpose(1, 2)).transpose(1, 2)` error is answered in another issue: in `fastspeech.py`, line 314 is missing the third parameter `ys`. There are still some other bugs...
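For readers unfamiliar with the `.transpose(1, 2)` pattern in that line: convolutional encoders expect channel-first input `(batch, channels, time)`, while mel targets like `ys` usually arrive as `(batch, time, channels)`. The sketch below illustrates the shape round-trip with a toy NumPy stand-in; the encoder function and all dimensions here are hypothetical, not the repository's actual module.

```python
import numpy as np

def conv1d_like_encoder(x, weight):
    """Toy stand-in for a Conv1d-based utterance encoder: expects
    channel-first input (batch, channels, time) and applies a 1x1
    'convolution', i.e. a per-timestep linear map over channels."""
    # weight: (out_channels, in_channels); x: (B, C, T) -> (B, O, T)
    return np.einsum('oc,bct->bot', weight, x)

batch, time, channels, out_channels = 2, 50, 80, 64
ys = np.random.randn(batch, time, channels)   # (B, T, C), as mel targets usually come
w = np.random.randn(out_channels, channels)

# The pattern from the comment: transpose to (B, C, T) for the encoder,
# then transpose back to (B, T, C) for the rest of the model.
out = conv1d_like_encoder(ys.transpose(0, 2, 1), w).transpose(0, 2, 1)
print(out.shape)  # (2, 50, 64)
```

If either transpose is dropped, the channel dimension no longer lines up with the encoder's weights, which is the kind of shape error the original comment describes.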
Yes, I saw that `max_seq_len = 1000` limits the length of the encoder input. I will try to increase the number, thanks for your reply.
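A brief note on why `max_seq_len` caps the encoder input: in FastSpeech-style models the sinusoidal positional table is typically precomputed up to `max_seq_len`, so longer utterances must be truncated (or the table enlarged). This is a minimal sketch of that mechanism, assuming a standard sinusoid table; it is not the repository's exact code.

```python
import numpy as np

def sinusoid_table(max_seq_len, d_model):
    """Precomputed sinusoidal positional encodings, shape (max_seq_len, d_model)."""
    pos = np.arange(max_seq_len)[:, None]
    i = np.arange(d_model)[None, :]
    angle = pos / np.power(10000.0, (2 * (i // 2)) / d_model)
    # Even channels use sin, odd channels use cos.
    return np.where(i % 2 == 0, np.sin(angle), np.cos(angle))

max_seq_len, d_model = 1000, 256
table = sinusoid_table(max_seq_len, d_model)

seq_len = 1200  # an utterance longer than max_seq_len
# Indexing past the table is where the length limit bites: the input has to be
# truncated here, or max_seq_len increased and the table rebuilt.
usable = min(seq_len, max_seq_len)
pos_enc = table[:usable]
print(pos_enc.shape)  # (1000, 256)
```

Increasing `max_seq_len` just makes this table longer, at a small memory cost, which is why bumping the number is usually a safe fix.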
@DSTTSD bro, this is a fake email, don't click the button...
Where are the m4t_prepare_dataset scripts on GitHub? I can't find them, can you tell me?
Thanks for the reply. I am also trying to decide between MAS and MFA: MFA's processing pipeline is complicated, while MAS may not converge easily. It seems lucidrains uses another approach, AlignerNet, from "One TTS Alignment To Rule Them All".
May I ask how to obtain the Genshin Impact data, or could you provide a download link?
I am training on a 250-speaker dataset with one GPU, batch size 32. The default step count is 1,000,000, but at 2.5 s/it the total time comes out to about 666 hours. Is that speed reasonable, or is something wrong? Roughly how many steps per day do you get when training on your two GPUs? My GPU memory usage is 38560 MiB on one GPU with batch size 32.
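As a sanity check on the arithmetic above: at exactly 2.5 s/it, 1,000,000 steps would take about 694 hours, so the quoted ~666 hours corresponds to a slightly faster pace of roughly 2.4 s/it. The numbers below are the ones from the comment, plugged into the obvious formula.

```python
total_steps = 1_000_000   # default training steps from the comment
sec_per_it = 2.5          # observed seconds per iteration

total_hours = total_steps * sec_per_it / 3600
total_days = total_hours / 24
print(f"{total_hours:.0f} hours ≈ {total_days:.1f} days")  # 694 hours ≈ 28.9 days
```

Either way, the conclusion is the same: on a single GPU at this speed, the default schedule takes roughly a month of wall-clock time.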
Thanks for your reply, but for dual-GPU use, `gpus` in `accelerate config` should then be set to two. 300000 / 1.3 / 86000 ≈ 2.6 days. After switching to batch size 16 it is still only 1.05 it/s, which is much slower; I don't know why. I also ran into a problem where, after training on two GPUs and stopping, the checkpoint fails to load again (in the 2-GPU case). -> I opened a separate bug issue for it. This is my setting: compute_environment: LOCAL_MACHINE debug: false distributed_type: MULTI_GPU downcast_bf16: 'no' gpu_ids: 0,2 machine_rank:...