
Trainer stops working without error messages

Open rezareza007 opened this issue 2 years ago • 1 comment

I'm trying to train TTS on a custom dataset (LJSpeech-style layout) that uses a non-English alphabet. Here is the command I use:

python TTS\Lib\site-packages\TTS\bin\train_tts.py --config_path data/config.json

and here are the messages I get:

C:\Users\user\Desktop\patents\TTS\TTS\lib\site-packages\TTS\tts\models\tacotron2.py:272: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
  alignment_lengths = (
C:\Users\user\Desktop\patents\TTS\TTS\lib\site-packages\torch\functional.py:568: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at C:\actions-runner_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\native\TensorShape.cpp:2228.)
  return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]

--> STEP: 0/14 -- GLOBAL_STEP: 0
| > decoder_loss: 1.37356 (1.37356)
| > postnet_loss: 3.56385 (3.56385)
| > stopnet_loss: 1.32900 (1.32900)
| > decoder_coarse_loss: 1.37599 (1.37599)
| > decoder_ddc_loss: 0.00168 (0.00168)
| > ga_loss: 0.01330 (0.01330)
| > decoder_diff_spec_loss: 0.12558 (0.12558)
| > postnet_diff_spec_loss: 4.52326 (4.52326)
| > decoder_ssim_loss: 0.71161 (0.71161)
| > postnet_ssim_loss: 0.70824 (0.70824)
| > loss: 5.17926 (5.17926)
| > align_error: 0.99135 (0.99135)
| > grad_norm: 2.13356 (2.13356)
| > current_lr: 0.00000
| > step_time: 14.91710 (14.91709)
| > loader_time: 9.09470 (9.09466)

C:\Users\user\Desktop\patents\TTS\TTS\lib\site-packages\TTS\tts\models\tacotron2.py:276: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
  alignment_lengths = mel_lengths // self.decoder.r
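A side note on the floordiv warnings: the two rounding modes they mention only disagree when the quotient is negative, so for non-negative lengths such as mel_lengths divided by a positive reduction factor they should give the same result. Here is a minimal pure-Python sketch of the difference (no PyTorch required; the helper names `div_trunc` and `div_floor` and the sample values are made up for illustration and are not part of TTS or PyTorch):

```python
import math


def div_trunc(a: int, b: int) -> int:
    """Divide and round toward zero (the 'trunc' behavior named in the warning).

    Uses float division, which is fine for small illustrative values.
    """
    return math.trunc(a / b)


def div_floor(a: int, b: int) -> int:
    """Divide and round toward negative infinity (Python's // on ints)."""
    return a // b


# Hypothetical values: 7 frames with a reduction factor of 2.
print(div_trunc(7, 2), div_floor(7, 2))    # 3 3   (same for non-negative input)
print(div_trunc(-7, 2), div_floor(-7, 2))  # -3 -4 (differ only for negatives)
```

If that assumption about non-negative lengths holds, these warnings are probably noise rather than the reason the trainer goes quiet.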

After that last warning the trainer appears to hang: no new messages are printed, no checkpoints are saved, and CPU usage drops to near zero, but the process does not exit.

PS. Right after I posted this, the trainer printed new output:

BEST MODEL : ./2nd-run\videon-1-June-27-2022_06+24PM-0000000\best_model_14.pth

Number of output frames: 5

EPOCH: 1/1000 --> ./2nd-run\videon-1-June-27-2022_06+24PM-0000000

DataLoader initialization
| > Tokenizer:
    | > add_blank: False
    | > use_eos_bos: False
    | > use_phonemes: False
    | > 2 not found characters:
    | >

    | > ‌

| > Number of instances : 857
| > Preprocessing samples
| > Max text length: 1128
| > Min text length: 2
| > Avg text length: 88.03733955659277
|
| > Max audio length: 1810702.0
| > Min audio length: 22756.0
| > Avg audio length: 170498.6674445741
| > Num. instances discarded samples: 0
| > Batch group size: 256.

TRAINING (2022-06-28 09:50:49)

So maybe the issue is resolved, but training is very slow for such a small dataset (857 records).
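To put "very slow" in numbers, the figures already in the log allow a rough back-of-the-envelope estimate. The sketch below is only that: it assumes a 22050 Hz sample rate (the LJSpeech default; the actual value in config.json is not shown in this thread) and extrapolates from the single step_time measured at step 0, which may not be representative of later steps.

```python
# Figures copied from the trainer and DataLoader logs in this thread.
NUM_INSTANCES = 857
AVG_AUDIO_LEN = 170498.6674445741  # samples
MAX_AUDIO_LEN = 1810702.0          # samples
STEPS_PER_EPOCH = 14               # from "STEP: 0/14"
STEP_TIME_S = 14.91710             # single measurement at step 0
EPOCHS = 1000                      # from "EPOCH: 1/1000"

# ASSUMPTION: LJSpeech-style 22050 Hz audio; the real config is not shown.
SAMPLE_RATE = 22050


def samples_to_seconds(samples: float, sample_rate: int = SAMPLE_RATE) -> float:
    """Convert a length in audio samples to seconds."""
    return samples / sample_rate


# Total audio in the dataset, reconstructed from the average length.
total_hours = samples_to_seconds(AVG_AUDIO_LEN * NUM_INSTANCES) / 3600
# Duration of the longest clip.
longest_clip_s = samples_to_seconds(MAX_AUDIO_LEN)
# Compute time per epoch and for the full run (data loading time excluded).
epoch_s = STEPS_PER_EPOCH * STEP_TIME_S
full_run_hours = epoch_s * EPOCHS / 3600

print(f"dataset audio:  {total_hours:.2f} h")
print(f"longest clip:   {longest_clip_s:.1f} s")
print(f"one epoch:      {epoch_s:.0f} s")
print(f"all epochs:     {full_run_hours:.0f} h")
```

Under those assumptions the longest clip runs well over a minute, whereas LJSpeech clips are roughly 1 to 10 seconds long. Very long clips may be worth trimming or splitting, since they can inflate step time for a Tacotron2-style model.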

rezareza007 Jun 28 '22 05:06

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. You might also look at our Discourse page for further help: https://discourse.mozilla.org/c/tts

stale[bot] Sep 21 '22 01:09