imitation
Clean PyTorch implementations of imitation and reward learning algorithms
## Bug description ``` Traceback (most recent call last): File "/home/kavin/Documents/PycharmProjects/RL/Imitation/example.py", line 150, in bc_trainer.train(n_epochs=1) File "/home/kavin/anaconda3/envs/PythonEnv/lib/python3.8/site-packages/imitation/algorithms/bc.py", line 495, in train training_metrics = self.loss_calculator(self.policy, obs_tensor, acts) File "/home/kavin/anaconda3/envs/PythonEnv/lib/python3.8/site-packages/imitation/algorithms/bc.py", line 130,...
## Bug description The current implementation of the `SyntheticGatherer` in the preference comparisons module often chooses the trajectory with the higher reward nearly deterministically. This is because the Boltzmann-rational policy...
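To illustrate the saturation effect this report describes, here is a minimal sketch of a generic Boltzmann-rational choice between two trajectories. It is not the library's `SyntheticGatherer` code, and it does not reproduce the issue's own (truncated) explanation. The function name `boltzmann_choice_prob` and the `temperature` parameter are hypothetical and introduced only for this example. It assumes the preference probability is a softmax over the two trajectory returns.

```python
import math


def boltzmann_choice_prob(return_a: float, return_b: float, temperature: float = 1.0) -> float:
    """Probability of preferring trajectory A under a generic Boltzmann-rational model.

    Assumed model (not taken from the library):
        P(A) = exp(R_a / T) / (exp(R_a / T) + exp(R_b / T))
    which is a sigmoid of (R_a - R_b) / T.
    """
    z = (return_a - return_b) / temperature
    # Numerically stable sigmoid: never exponentiate a large positive number.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


if __name__ == "__main__":
    # As the gap between the two returns grows, the choice approaches determinism.
    for gap in (0.0, 0.1, 1.0, 10.0, 100.0):
        print(f"return gap {gap:>6}: P(prefer A) = {boltzmann_choice_prob(gap, 0.0):.6f}")
```

Under this assumed model, a larger `temperature` divides the return gap and pulls the probability back toward 0.5, which is one way such a sampler could be made less deterministic.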
When running examples/quickstart.py, I get the following error: RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for...
Is it possible to use the collected trajectories (.npz) as expert data that is compatible with SB3 IRL models? I made an env that takes user input to move...
## Problem If the observations for a given task are images and stored using the `TrajectoryDatasetSequence` class, indexing is extremely slow. For instance, indexing one trajectory can take upwards...
## Problem In the last month we doubled our median pipeline runtime, and we also spend a lot of CircleCI credits on failed pipelines. ## Solution 1. Make the pipeline...
This adds some instructions on running the benchmarking suite. It is still missing the baseline benchmark values and instructions to update them, which I'll include in a later PR. I'm...
## Description Add MCE IRL training script for #392
## Description 1. Add an environment wrapper that keeps the original observation and its RGB version together for the interactive policy 2. Remove the RGB observation and its space in the bc...
## Problem This issue builds on the work in #776 and is the last step for #701. For more detail, please see the discussion [here](https://github.com/HumanCompatibleAI/imitation/issues/701#issuecomment-1712411254) ## Solution -...