DiffSinger
RuntimeError: index 155 is out of bounds for dimension 1 with size 155
I am trying to run training on my own dataset. The validation data is processed correctly and this error does not occur at that stage, but as soon as the training data is used, a RuntimeError is always raised. I tried to analyze the tensors and look at their sizes, but I have no ideas, because they are identical to the validation ones. The only thing I noticed is that there are a lot of zero tensors at the end, but I am not sure whether that matters. The validation data was, of course, taken randomly. In fact, this part of the code works correctly for the validation data but fails for the training data:
torch.gather(F.pad(encoder_out, [0, 0, 1, 0]), 1, mel2ph)
Please help, I would be glad for any ideas on how to solve this problem!
https://github.com/MoonInTheRiver/DiffSinger/blob/5f2f6eb3c42635f9446363a302602a2ef1d41d70/modules/diffsinger_midi/fs2.py#L100
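For context, here is a minimal sketch of why this exact error appears (the tensor names and sizes below are invented for illustration; only the pad/gather pattern is from the line above). F.pad prepends one "padding phoneme" along dimension 1, so valid indices in mel2ph are 0..T_ph inclusive, and any alignment entry pointing past that range raises the out-of-bounds RuntimeError:

```python
# Illustrative only: shapes and names are made up, the pad/gather pattern is from fs2.py.
import torch
import torch.nn.functional as F

T_ph, T_mel, H = 154, 600, 256
encoder_out = torch.randn(1, T_ph, H)             # [B, T_ph, H] phoneme encodings
mel2ph = torch.randint(0, T_ph + 1, (1, T_mel))   # frame-to-phoneme alignment, 0 = padding frame

padded = F.pad(encoder_out, [0, 0, 1, 0])         # [B, T_ph + 1, H]; index 0 is the pad vector
index = mel2ph[..., None].repeat(1, 1, H)         # gather needs an index of the same rank
decoder_inp = torch.gather(padded, 1, index)      # OK: every index is within [0, T_ph]

# Reproducing the reported failure: one alignment entry points past the padded length.
mel2ph_bad = mel2ph.clone()
mel2ph_bad[0, 0] = T_ph + 1                       # 155, while dim 1 has size 155
try:
    torch.gather(padded, 1, mel2ph_bad[..., None].repeat(1, 1, H))
except RuntimeError as err:
    print(err)  # index 155 is out of bounds for dimension 1 with size 155
```

In other words, the error means some entry of mel2ph refers to a phoneme index that the (padded) encoder output does not contain.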
I also want to make sure: for any type of DiffSinger training from scratch, do I need to train the FFT-Singer model first? Is that so?
No, these two don't need FFT-Singer pre-training.
@MoonInTheRiver Thanks, but unfortunately that didn't solve my problem. When running the latest version of the training (with PNDM), this error occurs even on my validation data. This is very strange, since the data are the same and the tensor shapes also match. Do you have any ideas why this happens and how I can avoid it?
The error occurs at the same place in the code.
Unfortunately, I never found a solution. However, I noticed that if I filter the dataset down to samples of at most 8 seconds, the error disappears. I also have difficulty starting training on multiple GPUs; training on a single GPU runs successfully.
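For anyone who wants to reproduce that workaround, a minimal sketch of dropping long items before binarizing (the metadata layout and the 'wav_fn' key are assumptions, not the repo's actual preprocessing API):

```python
# Hypothetical pre-filter: keep only clips of at most 8 seconds before binarizing.
import soundfile as sf

MAX_SECONDS = 8.0

def filter_long_items(items):
    """Keep only metadata items whose wav file is at most MAX_SECONDS long."""
    kept = []
    for item in items:
        info = sf.info(item["wav_fn"])                    # 'wav_fn' key is an assumption
        if info.frames / info.samplerate <= MAX_SECONDS:  # duration in seconds
            kept.append(item)
    return kept
```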
Please check whether your Python packages are the same versions as our requirements.
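One quick way to verify that from the repository root (the requirements filename below is an assumption; adjust it to whichever requirements file the repo ships):

```python
import pkg_resources

# Adjust the filename to the requirements file you actually use.
with open("requirements_2080.txt") as f:
    requirements = [
        line.strip() for line in f
        if line.strip() and not line.startswith("#")
    ]

# Raises DistributionNotFound / VersionConflict for any package that does not match.
pkg_resources.require(requirements)
print("All listed packages match the pinned versions.")
```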
You can either:
- downgrade your torch version to < 1.11, or
- follow
https://github.com/NATSpeech/NATSpeech/blob/main/utils/commons/ddp_utils.py#L14
to modify this part of the code:
https://github.com/MoonInTheRiver/DiffSinger/blob/master/utils/pl_utils.py#L187 (a quick check of which option applies to your environment is sketched below).
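For reference, a small illustrative check of which of the two options applies (the real compatibility fix lives in the linked ddp_utils.py; this only inspects the installed torch version):

```python
import torch
from packaging import version

if version.parse(torch.__version__) >= version.parse("1.11"):
    print(f"torch {torch.__version__}: downgrade below 1.11, or patch "
          "utils/pl_utils.py following the linked NATSpeech ddp_utils.py.")
else:
    print(f"torch {torch.__version__}: the existing pl_utils.py should work as-is.")
```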