vits

Segmentation fault after training for a few steps

Open zhufeijuanjuan opened this issue 1 year ago • 6 comments

A segmentation fault occurs after a few training steps when the batch size is greater than 16; everything works when the batch size is <= 16. The same issue occurs with DP training.

I used faulthandler to find the step where the core dump happens. The log shows the following:

    Current thread 0x00007fba40fe94c0 (most recent call first):
      File "xx/TTS/monotonic_align/init.py", line 40 in maximum_path
      File "xx/TTS//models/models.py", line 822 in forward
      File "xx/miniconda3/envs/pytorch2/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501 in _call_impl
      File "train_multilang_speaker_1gpu.py", line 175 in train_and_evaluate
      File "train_multilang_speaker_1gpu.py", line 142 in run
      File "train_multilang_speaker_1gpu.py", line 50 in main
      File "train_multilang_speaker_1gpu.py", line 407 in
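For anyone trying to reproduce a trace like this, here is a minimal sketch of enabling Python's standard `faulthandler` module at the start of a training script. The helper name `enable_crash_tracebacks` is hypothetical and is not part of the training script in this issue; only `faulthandler.enable` and `faulthandler.is_enabled` are standard-library calls.

```python
import faulthandler
import sys


def enable_crash_tracebacks(stream=None):
    """Hypothetical helper: dump Python tracebacks on fatal signals such as SIGSEGV.

    `stream` must be a real file object with a file descriptor and must stay
    open for as long as faulthandler is enabled. Defaults to sys.stderr.
    """
    if stream is None:
        stream = sys.stderr
    # all_threads=True prints the traceback of every thread, not only the crashing one.
    faulthandler.enable(file=stream, all_threads=True)
    return faulthandler.is_enabled()
```

Calling `enable_crash_tracebacks()` once before training starts should be enough. The same effect is available without code changes by running the interpreter with `python -X faulthandler` or by setting the `PYTHONFAULTHANDLER` environment variable.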

It seems to be related to monotonic_align. Can anyone help solve it? Thanks.

zhufeijuanjuan, Aug 28 '23 09:08