yiwei0730
You need to download the dataset from the internet yourself, then point `data_dir` to your own data directory.
Yes, the dataset is LJSpeech-1.1. Just run `nvidia_preprocessing` with `data_dir` set to your data directory and it will work.
The `self.utterance_encoder(ys.transpose(1, 2)).transpose(1, 2)` error is answered in another issue: in `fastspeech.py`, line 314 is missing the third parameter `ys`. There are still some other bugs...
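For readers unfamiliar with the `.transpose(1, 2)` pattern in that line: convolutional encoders expect channel-first input `(batch, channels, time)`, while mel targets like `ys` usually arrive as `(batch, time, channels)`. The sketch below illustrates the shape round-trip with a toy NumPy stand-in; the encoder function and all dimensions here are hypothetical, not the repository's actual module.

```python
import numpy as np

def conv1d_like_encoder(x, weight):
    """Toy stand-in for a Conv1d-based utterance encoder: expects
    channel-first input (batch, channels, time) and applies a 1x1
    'convolution', i.e. a per-timestep linear map over channels."""
    # weight: (out_channels, in_channels); x: (B, C, T) -> (B, O, T)
    return np.einsum('oc,bct->bot', weight, x)

batch, time, channels, out_channels = 2, 50, 80, 64
ys = np.random.randn(batch, time, channels)   # (B, T, C), as mel targets usually come
w = np.random.randn(out_channels, channels)

# The pattern from the comment: transpose to (B, C, T) for the encoder,
# then transpose back to (B, T, C) for the rest of the model.
out = conv1d_like_encoder(ys.transpose(0, 2, 1), w).transpose(0, 2, 1)
print(out.shape)  # (2, 50, 64)
```

If either transpose is dropped, the channel dimension no longer lines up with the encoder's weights, which is the kind of shape error the original comment describes.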
Yes, I saw that `max_seq_len = 1000` limits the length of the encoder input. I will try to increase the number, thanks for your reply.
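A brief note on why `max_seq_len` caps the encoder input: in FastSpeech-style models the sinusoidal positional table is typically precomputed up to `max_seq_len`, so longer utterances must be truncated (or the table enlarged). This is a minimal sketch of that mechanism, assuming a standard sinusoid table; it is not the repository's exact code.

```python
import numpy as np

def sinusoid_table(max_seq_len, d_model):
    """Precomputed sinusoidal positional encodings, shape (max_seq_len, d_model)."""
    pos = np.arange(max_seq_len)[:, None]
    i = np.arange(d_model)[None, :]
    angle = pos / np.power(10000.0, (2 * (i // 2)) / d_model)
    # Even channels use sin, odd channels use cos.
    return np.where(i % 2 == 0, np.sin(angle), np.cos(angle))

max_seq_len, d_model = 1000, 256
table = sinusoid_table(max_seq_len, d_model)

seq_len = 1200  # an utterance longer than max_seq_len
# Indexing past the table is where the length limit bites: the input has to be
# truncated here, or max_seq_len increased and the table rebuilt.
usable = min(seq_len, max_seq_len)
pos_enc = table[:usable]
print(pos_enc.shape)  # (1000, 256)
```

Increasing `max_seq_len` just makes this table longer, at a small memory cost, which is why bumping the number is usually a safe fix.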
@DSTTSD bro, this is a fake email, don't click the button...
Where are the m4t_prepare_dataset scripts on GitHub? I can't find them, can you tell me?
Thanks for the reply. I am also trying to decide between MAS and MFA: MFA's processing pipeline is complicated, while MAS may not converge easily. It seems lucidrains uses another approach, AlignerNet, from "One TTS Alignment To Rule Them All".
May I ask how to obtain the Genshin Impact data, or could you provide a download link?
I am training on a 250-speaker dataset with one GPU, batch size 32. The default step count is 1,000,000, but at 2.5 s/it the total time comes out to about 666 hours. Is that speed reasonable, or is something wrong? Roughly how many steps per day do you get when training on your two GPUs? My GPU memory usage is 38560 MiB on one GPU with batch size 32.
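As a sanity check on the arithmetic above: at exactly 2.5 s/it, 1,000,000 steps would take about 694 hours, so the quoted ~666 hours corresponds to a slightly faster pace of roughly 2.4 s/it. The numbers below are the ones from the comment, plugged into the obvious formula.

```python
total_steps = 1_000_000   # default training steps from the comment
sec_per_it = 2.5          # observed seconds per iteration

total_hours = total_steps * sec_per_it / 3600
total_days = total_hours / 24
print(f"{total_hours:.0f} hours ≈ {total_days:.1f} days")  # 694 hours ≈ 28.9 days
```

Either way, the conclusion is the same: on a single GPU at this speed, the default schedule takes roughly a month of wall-clock time.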
Thanks for your reply, but for dual-GPU use, `gpus` in `accelerate config` should then be set to two. 300000 / 1.3 / 86000 ≈ 2.6 days. After switching to batch size 16 it is still only 1.05 it/s, which is much slower; I don't know why. I also ran into a problem where, after training on two GPUs and stopping, the checkpoint fails to load again (in the 2-GPU case). -> I opened a separate bug issue for it. This is my setting: compute_environment: LOCAL_MACHINE debug: false distributed_type: MULTI_GPU downcast_bf16: 'no' gpu_ids: 0,2 machine_rank:...