
RuntimeError: index 155 is out of bounds for dimension 1 with size 155

ElizavetaSedova opened this issue 2 years ago • 5 comments

I am trying to run training on my dataset. The validation data is processed correctly and this error does not occur at that stage, but when the training data is used, a RuntimeError always occurs. I tried to analyze the tensors and look at their sizes, but I have no ideas, because they are identical to the validation ones. The only thing I noticed is that I have a lot of zero tensors at the end, but I'm not sure this is relevant. The validation data was, of course, sampled randomly. In fact, this part of the code works correctly for validation data but fails for training data:

```python
torch.gather(F.pad(encoder_out, [0, 0, 1, 0]), 1, mel2ph)
```

Please help; I would be grateful for any ideas on how to solve this problem! [image]

https://github.com/MoonInTheRiver/DiffSinger/blob/5f2f6eb3c42635f9446363a302602a2ef1d41d70/modules/diffsinger_midi/fs2.py#L100
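
For anyone hitting the same thing, here is a minimal, self-contained sketch (all shapes are made up) of how this gather can go out of bounds: `F.pad` adds one leading position along dim 1, so every value in `mel2ph` must stay in `[0, T]`, where `T` is the phoneme-axis length of `encoder_out`.

```python
import torch
import torch.nn.functional as F

B, T, H = 2, 154, 8                          # batch, phonemes, hidden (hypothetical)
encoder_out = torch.randn(B, T, H)
padded = F.pad(encoder_out, [0, 0, 1, 0])    # -> (B, T+1, H); index 0 is the pad row

# fs2.py expands mel2ph to (B, T_mel, H) before gathering
mel2ph = torch.randint(0, T + 1, (B, 40))    # legal values: 0..T inclusive
idx = mel2ph[..., None].repeat(1, 1, H)
ok = torch.gather(padded, 1, idx)            # fine: all indices <= T

bad = idx.clone()
bad[0, 0, :] = T + 1                         # 155 where dim 1 has size 155
# torch.gather(padded, 1, bad)  # -> RuntimeError: index 155 is out of bounds
#                               #    for dimension 1 with size 155

# Cheap sanity check to place just before the failing line:
assert idx.max() < padded.size(1), "mel2ph points past the phoneme axis"
```

In the repo's fs2.py, index 0 of the padded tensor is the all-zero row that `mel2ph == 0` (unaligned/padding frames) maps to, which is why the pad is added before gathering.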

ElizavetaSedova avatar Nov 29 '22 18:11 ElizavetaSedova

I also want to make sure: for any type of DiffSinger training from scratch, do I need to train an FFT-Singer model first? Is that correct?

ElizavetaSedova avatar Nov 30 '22 11:11 ElizavetaSedova

> I also want to make sure: for any type of DiffSinger training from scratch, do I need to train an FFT-Singer model first? Is that correct?

[image] These two don't need it.

MoonInTheRiver avatar Dec 01 '22 12:12 MoonInTheRiver

@MoonInTheRiver Thanks, but unfortunately that didn't solve my problem. When running the latest version of the training (with PNDM), this error now occurs even on my validation data. This is very strange; the data are the same, and the tensor shapes also match. Do you have any ideas why this happens and how I can avoid it?

[image]

The error occurs at the same place in the code.
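
If it helps, here is a hedged debugging sketch for locating the offending items before the crash; the batch keys (`mel2ph`, `txt_tokens`) follow this repo's conventions but treat them as assumptions and adapt to your batch format.

```python
import torch
from torch.utils.data import DataLoader

def find_bad_alignments(dataloader: DataLoader) -> None:
    """Flag batches whose mel2ph indexes past the (padded) phoneme axis."""
    for i, batch in enumerate(dataloader):
        mel2ph = batch['mel2ph']                 # (B, T_mel); 0 = unaligned frame
        n_ph = batch['txt_tokens'].shape[1]      # phoneme-axis length
        if mel2ph.max().item() > n_ph:           # gather would go out of bounds here
            print(f"batch {i}: mel2ph.max()={mel2ph.max().item()} > n_ph={n_ph}")
```

Any flagged batch means the alignment (`mel2ph`) disagrees with the phoneme sequence produced by preprocessing, which matches the out-of-range index in the traceback.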

ElizavetaSedova avatar Dec 01 '22 13:12 ElizavetaSedova

Unfortunately, I never found a solution. However, I noticed that when I filter my dataset to samples no longer than 8 seconds, the error disappears. I also have difficulty starting training on multiple GPUs; training on a single GPU succeeds.
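
For reference, a minimal sketch of such a duration filter (the 8 s threshold comes from the comment above; the dataset path is hypothetical):

```python
import glob
import soundfile as sf  # pip install soundfile

MAX_DUR = 8.0  # seconds; the threshold that made the error disappear above

wav_paths = glob.glob('data/raw/**/*.wav', recursive=True)  # hypothetical layout
kept = [p for p in wav_paths
        if sf.info(p).frames / sf.info(p).samplerate <= MAX_DUR]
print(f"kept {len(kept)}/{len(wav_paths)} clips under {MAX_DUR}s")
```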

ElizavetaSedova avatar Dec 07 '22 10:12 ElizavetaSedova

> Unfortunately, I never found a solution. However, I noticed that when I filter my dataset to samples no longer than 8 seconds, the error disappears. I also have difficulty starting training on multiple GPUs; training on a single GPU succeeds.

Please check whether your Python packages match the versions in our requirements.
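
A quick way to compare installed versions against the repo's requirements.txt by eye (a hedged sketch; it assumes `==` pins, so adapt it for other specifiers):

```python
# Print installed versions of the pinned packages next to the pins.
# Requires Python >= 3.8 for importlib.metadata.
from importlib.metadata import version, PackageNotFoundError

with open('requirements.txt') as f:              # path inside the repo checkout
    for line in f:
        name = line.strip().split('==')[0]
        if not name or name.startswith('#'):
            continue
        try:
            print(f"{name}: installed {version(name)}  (pinned: {line.strip()})")
        except PackageNotFoundError:
            print(f"{name}: NOT INSTALLED  (pinned: {line.strip()})")
```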

MrZixi avatar Jan 19 '23 12:01 MrZixi

> Unfortunately, I never found a solution. However, I noticed that when I filter my dataset to samples no longer than 8 seconds, the error disappears. I also have difficulty starting training on multiple GPUs; training on a single GPU succeeds.

Either:

1. downgrade your torch version to < 1.11, or
2. follow

https://github.com/NATSpeech/NATSpeech/blob/main/utils/commons/ddp_utils.py#L14

to modify this part of the code:

https://github.com/MoonInTheRiver/DiffSinger/blob/master/utils/pl_utils.py#L187
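
For context, the linked NATSpeech helper replaces the old DDP wrapper copied into pl_utils.py. A heavily simplified sketch of the idea (the real code also handles gradient-sync bookkeeping whose internals changed in torch 1.11; treat this as an outline, not the actual fix, and see the link for the authoritative code):

```python
import torch
from torch.nn.parallel import DistributedDataParallel

class TrainerDDP(DistributedDataParallel):  # hypothetical name
    """Dispatch forward() to the wrapped module's training/validation step.

    pl_utils.py carries an old copy of DDP internals that torch >= 1.11
    broke; the linked ddp_utils.py re-implements this dispatch on top of
    the current DistributedDataParallel instead.
    """
    def forward(self, *inputs, **kwargs):
        if self.module.training:
            return self.module.training_step(*inputs, **kwargs)
        if getattr(self.module, 'testing', False):   # assumption: flag set by the trainer
            return self.module.test_step(*inputs, **kwargs)
        return self.module.validation_step(*inputs, **kwargs)
```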

MoonInTheRiver avatar Feb 04 '23 02:02 MoonInTheRiver