imitation
Clean PyTorch implementations of imitation and reward learning algorithms
## Bug description I'm following along with the GAIL example, and am trying to log to tensorboard. If I understand the API correctly, I should be able to log directly...
## Bug description Hi everyone, I am trying to play with the DAgger example from `docs/tutorials/2_train_dagger.ipynb` but get the following error when executing the line `dagger_trainer.train(2000)`: ```bash Exception has occurred:...
## Bug description https://github.com/HumanCompatibleAI/imitation/blob/a8b079c469bb145d1954814f22488adff944aa0d/src/imitation/data/types.py#L473 While `TransitionsMinimal` doesn't have the field `next_obs`, the collate function called during data loading expects there to be `next_obs`, making `TransitionsMinimal` unusable with e.g. `BC`. ##...
## Bug description KeyError: The observation_space and action_space were not given, can't verify new environments. ## Steps to reproduce The code to reproduce is from imitation's official documentation. (https://imitation.readthedocs.io/en/latest/tutorials/1_train_bc.html) ```python...
## Description HuggingFace `save_to_disk` takes a `PathLike` type, which is defined as `str`, `bytes` or `os.PathLike`. `imitation.util.parse_path` always returned `pathlib.Path`, which is not one of these types. This commit converts pathlib.Path...
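For readers who need a plain `str` path where a `pathlib.Path` is currently passed, the standard library's `os.fspath` performs that conversion. The sketch below is only an illustration and is not the change made in the commit above; `to_plain_path` is a hypothetical helper name.

```python
import os
import pathlib


def to_plain_path(path):
    """Hypothetical helper: return the str (or bytes) form of a path.

    os.fspath returns str and bytes inputs unchanged and calls
    __fspath__() on any os.PathLike object such as pathlib.Path.
    """
    return os.fspath(path)


example = to_plain_path(pathlib.Path("output") / "dataset")
```

Here `example` is a `str`, so it can be handed to an API that only accepts string paths.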
## Question Hello, I am trying to implement a callback during GAIL training that retrieves the mean/gen/train/loss value and creates a checkpoint whenever this loss reaches a new minimum. However,...
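The "checkpoint whenever the loss reaches a new minimum" logic in the question above can be sketched independently of any training library. The class below is a hypothetical illustration, not part of imitation's API: `BestLossCheckpointer`, `save_fn`, and `update` are assumed names, and how the loss value is obtained from the GAIL logger is left open.

```python
from typing import Callable, Optional


class BestLossCheckpointer:
    """Hypothetical tracker that saves whenever the loss hits a new minimum."""

    def __init__(self, save_fn: Callable[[float], None]) -> None:
        # save_fn is whatever writes the checkpoint; it receives the new best loss.
        self.save_fn = save_fn
        self.best: Optional[float] = None

    def update(self, loss: float) -> bool:
        """Record a loss value; return True if a checkpoint was saved."""
        if self.best is None or loss < self.best:
            self.best = loss
            self.save_fn(loss)
            return True
        return False


# Usage with a stand-in save function that just records the losses it saw.
saved = []
checkpointer = BestLossCheckpointer(saved.append)
for loss in [0.9, 0.7, 0.8, 0.5]:
    checkpointer.update(loss)
```

In a real training loop, `update` would be called once per round with the logged loss, and `save_fn` would write the policy to disk.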
## Bug description An `imitation.policies.base.FeedForward32Policy` that is saved using `policy.save()` cannot be loaded with `imitation.policies.base.FeedForward32Policy.load()`, raising the following error: ## Steps to reproduce Train a policy using `imitation.algorithms.bc.BC`, then save...
The temporary/stupid fix to the bug described in #857.
## Bug description Currently, for some reason, the termination condition that equalizes the horizon length across rollouts does not work properly and thus raises a variable-horizon error. ## Steps...
## Bug description In `predict_th`, there's an assert statement that says the following: `assert rew_th.shape == state.shape[:1]`. This will fail, even if you've modified `state_th` using `self.preprocess` to be a...