When training this model with our custom dataset, we constantly hit the error below:

IndexError: Dimension out of range (expected to be in range of [-1, 0], but got 1)

We converted our dataset to the LJSpeech format; the custom dataset is on Hugging Face as "procit008/small".
[rank0]:[W404 15:57:59.072467923 ProcessGroupNCCL.cpp:1496] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator())
Traceback (most recent call last):
  File "/home/procit/procit/vits/train.py", line 290, in <module>

-- Process 0 terminated with the following error:
Traceback (most recent call last):
  File "/home/procit/procit/vits/env/lib/python3.10/site-packages/torch/multiprocessing/spawn.py", line 90, in _wrap
    fn(i, *args)
  File "/home/procit/procit/vits/train.py", line 117, in run
    train_and_evaluate(rank, epoch, hps, [net_g, net_d], [optim_g, optim_d], [scheduler_g, scheduler_d], scaler, [train_loader, eval_loader], logger, [writer, writer_eval])
  File "/home/procit/procit/vits/train.py", line 137, in train_and_evaluate
    for batch_idx, (x, x_lengths, spec, spec_lengths, y, y_lengths) in enumerate(train_loader):
  File "/home/procit/procit/vits/env/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 708, in __next__
    data = self._next_data()
  File "/home/procit/procit/vits/env/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1480, in _next_data
    return self._process_data(data)
  File "/home/procit/procit/vits/env/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1505, in _process_data
    data.reraise()
  File "/home/procit/procit/vits/env/lib/python3.10/site-packages/torch/_utils.py", line 733, in reraise
    raise exception
IndexError: Caught IndexError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/procit/procit/vits/env/lib/python3.10/site-packages/torch/utils/data/_utils/worker.py", line 349, in _worker_loop
    data = fetcher.fetch(index)  # type: ignore[possibly-undefined]
  File "/home/procit/procit/vits/env/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 55, in fetch
    return self.collate_fn(data)
  File "/home/procit/procit/vits/data_utils.py", line 119, in __call__
    max_spec_len = max([x[1].size(1) for x in batch])
  File "/home/procit/procit/vits/data_utils.py", line 119, in <listcomp>
    max_spec_len = max([x[1].size(1) for x in batch])
IndexError: Dimension out of range (expected to be in range of [-1, 0], but got 1)
I think this error is due to your audio files: this model requires the audio to be mono, not stereo, which is why you get the dimension error.
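A quick check and downmix, as a sketch using torchaudio (the path is just an example):

```python
import torchaudio

# Sketch: inspect the channel count and downmix to mono if needed.
wav, sr = torchaudio.load("wavs/sample.wav")       # example path
print(wav.shape)                                   # [channels, samples]
if wav.size(0) > 1:
    mono = wav.mean(dim=0, keepdim=True)           # average the channels
    torchaudio.save("wavs/sample_mono.wav", mono, sr)
```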
I don't think that's the case. I'm also getting this same error using LJSpeech (which is mono) and the default recipe.
andre@1080:~/projects/vits$ soxi DUMMY1/LJ018-0137.wav

Input File     : 'DUMMY1/LJ018-0137.wav'
Channels       : 1
Sample Rate    : 22050
Precision      : 16-bit
Duration       : 00:00:06.97 = 153757 samples ~ 522.983 CDDA sectors
File Size      : 308k
Bit Rate       : 353k
Sample Encoding: 16-bit Signed Integer PCM
I also encountered this error while using LJSpeech. Do you know how to solve it?
I fixed it, but now I can't remember how :/
Is the original LJSpeech data itself OK? Is it necessary to print every sample to check whether the data has the expected dimensions, e.g. with something like the sketch below?
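```python
# Sketch: scan every sample and flag spectrograms that are not the
# expected 2-D [freq, frames]. `train_dataset` is an assumed
# TextAudioLoader instance from the VITS recipe, where each item
# is (text, spec, wav).
for i in range(len(train_dataset)):
    text, spec, wav = train_dataset[i]
    if spec.dim() != 2:
        print(f"sample {i}: unexpected spec shape {tuple(spec.shape)}")
```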
Hello, did you make a VITS model with your own voice?
Hello, may I ask whether you fixed it and can now train on the LJSpeech dataset? How did you fix it? Did you change the version of some installed packages, or modify some code? If you modified code but don't remember what, could I refer to your code?
Yes, I managed to train a model using my own datasets, but I can't remember what I needed to do to fix this. I'll look at it again in a couple of days and can maybe find it.
Thank you. If you find it, please let me know how you solved it. Thank you very much for your help.
I solved this problem. The `spectrogram_torch` function in the `mel_processing.py` file calls `torch.stft()`, which in newer PyTorch versions requires the additional parameter `return_complex` to be passed explicitly. Without it (the original code doesn't pass it), you get an error asking you to specify it; if you specify it as True, the error mentioned above occurs; if you specify it as False, it works normally. I modified the other places in this file that call `torch.stft()` accordingly.
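For reference, the changed call inside spectrogram_torch would look roughly like this (a sketch based on the original VITS code; argument names follow that code and should be verified against your copy):

```python
# mel_processing.py, inside spectrogram_torch (sketch, not a drop-in patch):
spec = torch.stft(
    y,
    n_fft,
    hop_length=hop_size,
    win_length=win_size,
    window=hann_window[wnsize_dtype_device],
    center=center,
    pad_mode="reflect",
    normalized=False,
    onesided=True,
    return_complex=False,  # added: newer PyTorch requires this explicitly
)
# With return_complex=False the output keeps the trailing real/imag axis,
# so the existing magnitude computation still produces the expected 2-D
# [freq, frames] spectrogram per sample:
spec = torch.sqrt(spec.pow(2).sum(-1) + 1e-6)
```

With return_complex=True the output has no trailing real/imag axis, so the sum over the last dimension collapses the frames axis instead and leaves a 1-D tensor, which matches the IndexError the collate function raises above.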
Correct. My changes are in this PR https://github.com/jaywalnut310/vits/pull/229