
train a decision transformer: KeyError: 0 in trainer.train()

Sergiodmp opened this issue on May 23, 2023 · 0 comments

Environment:

- accelerate 0.19.0
- gym 0.21.0
- huggingface-hub 0.14.1
- numpy 1.24.3
- packaging 23.1
- pandas 2.0.1
- transformers 4.29.2
- Platform: tried on both Linux and Windows
- Python version: 3.8.10

I am trying to run the "train a decision transformer" colab notebook in VS Code, but when it comes to training the model I get the following error:
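For reference, here is the training cell as reconstructed from the traceback below; the arguments hidden behind `(...)` in the traceback are omitted here too, and `model`, `trajectories`, and `collator` are defined in earlier notebook cells:

```python
from transformers import Trainer, TrainingArguments

training_args = TrainingArguments(
    output_dir="output/",
    remove_unused_columns=False,
    # ... further arguments elided by the traceback ...
    max_grad_norm=0.25,
)

trainer = Trainer(
    model=model,                 # the Decision Transformer model from the notebook
    args=training_args,
    train_dataset=trajectories,  # the offline trajectory data
    data_collator=collator,      # the notebook's data collator
)

trainer.train()  # <-- raises KeyError: 0
```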

```
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[11], line 20
      1 training_args = TrainingArguments(
      2     output_dir="output/",
      3     remove_unused_columns=False,
   (...)
     10     max_grad_norm=0.25,
     11 )
     13 trainer = Trainer(
     14     model=model,
     15     args=training_args,
     16     train_dataset=trajectories,
     17     data_collator=collator,
     18 )
---> 20 trainer.train()

File ~/dt/dt_transformers_scrum/env_name/lib/python3.8/site-packages/transformers/trainer.py:1664, in Trainer.train(self, resume_from_checkpoint, trial, ignore_keys_for_eval, **kwargs)
   1659 self.model_wrapped = self.model
   1661 inner_training_loop = find_executable_batch_size(
   1662     self._inner_training_loop, self._train_batch_size, args.auto_find_batch_size
   1663 )
-> 1664 return inner_training_loop(
   1665     args=args,
   1666     resume_from_checkpoint=resume_from_checkpoint,
   1667     trial=trial,
   1668     ignore_keys_for_eval=ignore_keys_for_eval,
   1669 )

File ~/dt/dt_transformers_scrum/env_name/lib/python3.8/site-packages/transformers/trainer.py:1909, in Trainer._inner_training_loop(self, batch_size, args, resume_from_checkpoint, trial, ignore_keys_for_eval)
   1906 rng_to_sync = True
   1908 step = -1
-> 1909 for step, inputs in enumerate(epoch_iterator):
   1910     total_batched_samples += 1
   1911     if rng_to_sync:

File ~/dt/dt_transformers_scrum/env_name/lib/python3.8/site-packages/torch/utils/data/dataloader.py:633, in _BaseDataLoaderIter.__next__(self)
    630 if self._sampler_iter is None:
    631     # TODO(https://github.com/pytorch/pytorch/issues/76750)
    632     self._reset()  # type: ignore[call-arg]
--> 633 data = self._next_data()
    634 self._num_yielded += 1
    635 if self._dataset_kind == _DatasetKind.Iterable and \
    636         self._IterableDataset_len_called is not None and \
    637         self._num_yielded > self._IterableDataset_len_called:

File ~/dt/dt_transformers_scrum/env_name/lib/python3.8/site-packages/torch/utils/data/dataloader.py:677, in _SingleProcessDataLoaderIter._next_data(self)
    675 def _next_data(self):
    676     index = self._next_index()  # may raise StopIteration
--> 677     data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
    678     if self._pin_memory:
    679         data = _utils.pin_memory.pin_memory(data, self._pin_memory_device)

File ~/dt/dt_transformers_scrum/env_name/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py:51, in _MapDatasetFetcher.fetch(self, possibly_batched_index)
     49         data = self.dataset.__getitems__(possibly_batched_index)
     50     else:
---> 51         data = [self.dataset[idx] for idx in possibly_batched_index]
     52 else:
     53     data = self.dataset[possibly_batched_index]

File ~/dt/dt_transformers_scrum/env_name/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py:51, in <listcomp>(.0)
     49         data = self.dataset.__getitems__(possibly_batched_index)
     50     else:
---> 51         data = [self.dataset[idx] for idx in possibly_batched_index]
     52 else:
     53     data = self.dataset[possibly_batched_index]

KeyError: 0
```

Does anyone know what is happening here? Thanks in advance.
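For what it's worth, the last two frames show the DataLoader fetching samples by integer position (`self.dataset[idx]` with `idx == 0`), so `KeyError: 0` means `trajectories[0]` itself fails — i.e. whatever was passed as `train_dataset` is keyed by strings rather than supporting integer indexing. A minimal sketch of how that can happen with the dataset from the notebook (dataset name as in the blog post; this is my guess at the cause, not a confirmed fix):

```python
from datasets import load_dataset

# load_dataset returns a DatasetDict keyed by split names, not a Dataset
dataset = load_dataset("edbeeching/decision_transformer_gym_replay",
                       "halfcheetah-expert-v2")

# dataset[0] raises KeyError: 0 -- a DatasetDict only accepts split names
trajectories = dataset["train"]  # a map-style Dataset

trajectories[0]  # works: returns the first trajectory as a dict
```

If `train_dataset=trajectories` receives the full `DatasetDict` (or a plain dict of columns) instead of the `"train"` split, `trainer.train()` fails with a `KeyError: 0` like the one above.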
