SpeechGPT
loading cross-modal data
Hello,
I get the following error when loading the dataset for stage 2. Is there a solution?
6: Traceback (most recent call last):
6: File "SpeechGPT/speechgpt/src/train/cm_sft.py", line 346, in <module>
6: train()
6: File "SpeechGPT/speechgpt/src/train/cm_sft.py", line 252, in train
6: data = load_dataset("json", data_files=data_args.data_path)
6: File "/usr/local/lib/python3.10/dist-packages/datasets/load.py", line 2549, in load_dataset
6: builder_instance.download_and_prepare(
6: File "/usr/local/lib/python3.10/dist-packages/datasets/builder.py", line 1005, in download_and_prepare
6: self._download_and_prepare(
6: File "/usr/local/lib/python3.10/dist-packages/datasets/builder.py", line 1100, in _download_and_prepare
6: self._prepare_split(split_generator, **prepare_split_kwargs)
6: File "/usr/local/lib/python3.10/dist-packages/datasets/builder.py", line 1860, in _prepare_split
6: for job_id, done, content in self._prepare_split_single(
6: File "/usr/local/lib/python3.10/dist-packages/datasets/builder.py", line 2016, in _prepare_split_single
6: raise DatasetGenerationError("An error occurred while generating the dataset") from e
6: datasets.exceptions.DatasetGenerationError: An error occurred while generating the dataset
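
A note for anyone hitting the same thing: the last frame raises `DatasetGenerationError` with `from e`, so the underlying exception should be printed above this traceback, and that is usually where the real cause is. One way to narrow it down is to scan the data file directly. The snippet below is only a rough sketch and assumes the file passed as `data_args.data_path` is JSON Lines (one JSON object per line), which may not match your setup. The helper name `find_bad_json_lines` is made up for illustration and is not part of SpeechGPT or `datasets`.

```python
import json
from pathlib import Path


def find_bad_json_lines(path, max_report=10):
    """Return (line_number, reason) pairs for lines that look problematic.

    Hypothetical helper: assumes one JSON object per line (JSON Lines).
    """
    problems = []
    expected_keys = None  # keys of the first valid record, used as a reference
    with Path(path).open("r", encoding="utf-8") as fh:
        for lineno, line in enumerate(fh, start=1):
            if not line.strip():
                continue  # blank lines are skipped by this check
            try:
                record = json.loads(line)
            except json.JSONDecodeError as exc:
                problems.append((lineno, f"invalid JSON: {exc.msg}"))
            else:
                if not isinstance(record, dict):
                    problems.append((lineno, "record is not a JSON object"))
                else:
                    keys = set(record)
                    if expected_keys is None:
                        expected_keys = keys
                    elif keys != expected_keys:
                        diff = sorted(keys ^ expected_keys)
                        problems.append((lineno, f"keys differ from first record: {diff}"))
            if len(problems) >= max_report:
                break  # stop early once enough problems were collected
    return problems
```

Calling `find_bad_json_lines(...)` on your own data file and printing the result should point at the first few suspicious lines. An empty list does not prove the file is fine (for example, type mismatches between records are not checked here), so the chained exception above the traceback is still the best clue.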