Hi, I got this issue:
```
Traceback (most recent call last):
  File "/home/neta_glazer_aiola_com/PycharmProjects/TTS/parler-tts/./training/run_parler_tts_training.py", line 1827, in <module>
    main()
  File "/home/neta_glazer_aiola_com/PycharmProjects/TTS/parler-tts/./training/run_parler_tts_training.py", line 1648, in main
    for batch in train_dataloader:
  File "/opt/conda/envs/parlenv/lib/python3.9/site-packages/accelerate/data_loader.py", line 454, in __iter__
    current_batch = next(dataloader_iter)
  File "/opt/conda/envs/parlenv/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 628, in __next__
    data = self._next_data()
  File "/opt/conda/envs/parlenv/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 1333, in _next_data
    return self._process_data(data)
  File "/opt/conda/envs/parlenv/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 1359, in _process_data
    data.reraise()
  File "/opt/conda/envs/parlenv/lib/python3.9/site-packages/torch/_utils.py", line 543, in reraise
    raise exception
ValueError: Caught ValueError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/opt/conda/envs/parlenv/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 759, in convert_to_tensors
    tensor = as_tensor(value)
  File "/opt/conda/envs/parlenv/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 721, in as_tensor
    return torch.tensor(value)
TypeError: not a sequence

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/conda/envs/parlenv/lib/python3.9/site-packages/torch/utils/data/_utils/worker.py", line 302, in _worker_loop
    data = fetcher.fetch(index)
  File "/opt/conda/envs/parlenv/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 61, in fetch
    return self.collate_fn(data)
  File "/home/neta_glazer_aiola_com/PycharmProjects/TTS/parler-tts/./training/run_parler_tts_training.py", line 559, in __call__
    input_ids = self.description_tokenizer.pad(
  File "/opt/conda/envs/parlenv/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 3355, in pad
    return BatchEncoding(batch_outputs, tensor_type=return_tensors)
  File "/opt/conda/envs/parlenv/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 224, in __init__
    self.convert_to_tensors(tensor_type=tensor_type, prepend_batch_axis=prepend_batch_axis)
  File "/opt/conda/envs/parlenv/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 775, in convert_to_tensors
    raise ValueError(
ValueError: Unable to create tensor, you should probably activate truncation and/or padding with 'padding=True' 'truncation=True' to have batched tensors with the same length. Perhaps your features (`input_ids` in this case) have excessive nesting (inputs type `list` where type `int` is expected).
```
I saw that

```python
input_ids = [{"input_ids": feature["input_ids"]} for feature in features]
```

was a list of lists, so I changed it to:

```python
input_ids = [{"input_ids": feature["input_ids"][0]} for feature in features]
```

and it worked.
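To illustrate the shape problem (with made-up token ids, not my actual data): each feature's `input_ids` carried an extra outer list, so the batch the collator handed to `pad()` had one level of nesting too many to be turned into a tensor.

```python
# Made-up features mimicking the shapes I was seeing (not the real dataset).
features = [
    {"input_ids": [[101, 2009, 102]]},        # note the extra outer list
    {"input_ids": [[101, 2154, 2003, 102]]},
]

# Original collator line: each entry is a list of lists, so pad() can't tensorize it.
nested = [{"input_ids": f["input_ids"]} for f in features]

# With [0], each entry is a flat list of ints, which pad() can handle.
flat = [{"input_ids": f["input_ids"][0]} for f in features]

print(nested[0]["input_ids"])  # [[101, 2009, 102]]
print(flat[0]["input_ids"])    # [101, 2009, 102]
```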
But then I got this:
```
Traceback (most recent call last):
  File "/home/neta_glazer_aiola_com/PycharmProjects/TTS/parler-tts/./training/run_parler_tts_training.py", line 1832, in <module>
    main()
  File "/home/neta_glazer_aiola_com/PycharmProjects/TTS/parler-tts/./training/run_parler_tts_training.py", line 1655, in main
    loss, train_metric = train_step(batch, accelerator, autocast_kwargs)
  File "/home/neta_glazer_aiola_com/PycharmProjects/TTS/parler-tts/./training/run_parler_tts_training.py", line 1584, in train_step
    outputs = model(**batch)
  File "/opt/conda/envs/parlenv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/conda/envs/parlenv/lib/python3.9/site-packages/accelerate/utils/operations.py", line 822, in forward
    return model_forward(*args, **kwargs)
  File "/opt/conda/envs/parlenv/lib/python3.9/site-packages/accelerate/utils/operations.py", line 810, in __call__
    return convert_to_fp32(self.model_forward(*args, **kwargs))
  File "/opt/conda/envs/parlenv/lib/python3.9/site-packages/torch/amp/autocast_mode.py", line 14, in decorate_autocast
    return func(*args, **kwargs)
  File "/home/neta_glazer_aiola_com/PycharmProjects/TTS/parler-tts/parler_tts/modeling_parler_tts.py", line 1995, in forward
    encoder_outputs = self.text_encoder(
  File "/opt/conda/envs/parlenv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/conda/envs/parlenv/lib/python3.9/site-packages/transformers/models/t5/modeling_t5.py", line 1974, in forward
    encoder_outputs = self.encoder(
  File "/opt/conda/envs/parlenv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/conda/envs/parlenv/lib/python3.9/site-packages/transformers/models/t5/modeling_t5.py", line 1015, in forward
    inputs_embeds = self.embed_tokens(input_ids)
  File "/opt/conda/envs/parlenv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/conda/envs/parlenv/lib/python3.9/site-packages/torch/nn/modules/sparse.py", line 160, in forward
    return F.embedding(
  File "/opt/conda/envs/parlenv/lib/python3.9/site-packages/torch/nn/functional.py", line 2210, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
IndexError: index out of range in self
```
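From what I understand, `IndexError: index out of range in self` raised inside `F.embedding` usually means some token id is outside the embedding table, e.g. because the ids were produced by a different tokenizer than the text encoder expects. A quick sanity check I tried (the numbers here are made up; `32128` is just the usual T5 vocab size, check `model.config` for the real value):

```python
# Hypothetical sanity check: do all token ids fit the text encoder's embedding table?
vocab_size = 32128                     # assumed T5 vocab size, not verified against my model
batch_input_ids = [101, 2009, 45000]   # made-up row; 45000 would crash F.embedding

# Any id < 0 or >= vocab_size triggers "index out of range in self".
out_of_range = [i for i in batch_input_ids if i < 0 or i >= vocab_size]
print(out_of_range)  # [45000]
```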
I'd be happy to get some help. Thanks!
Hey @netagl, thanks for opening the issue!
This is odd: I'm pretty sure your fix shouldn't be needed in most cases. Could you expand on the dataset you're using and the modifications you've made? That might have something to do with the error!
Also, feel free to send a dummy config that reproduces the error if you can, thanks!