imitation

Clean PyTorch implementations of imitation and reward learning algorithms

Results 146 imitation issues

## Bug description
```
Traceback (most recent call last):
  File "/home/kavin/Documents/PycharmProjects/RL/Imitation/example.py", line 150, in <module>
    bc_trainer.train(n_epochs=1)
  File "/home/kavin/anaconda3/envs/PythonEnv/lib/python3.8/site-packages/imitation/algorithms/bc.py", line 495, in train
    training_metrics = self.loss_calculator(self.policy, obs_tensor, acts)
  File "/home/kavin/anaconda3/envs/PythonEnv/lib/python3.8/site-packages/imitation/algorithms/bc.py", line 130,...
```

bug

## Bug description
The current implementation of the `SyntheticGatherer` in the preference comparisons module often chooses the trajectory with the higher reward nearly deterministically. This is because the Boltzmann-rational policy...
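The issue text is truncated, so the following is only a generic sketch of a Boltzmann-rational preference model, not the library's `SyntheticGatherer` code. The function name `preference_probability` and the `temperature` parameter are hypothetical. It illustrates why choices can become nearly deterministic: the probability of preferring trajectory A is a logistic function of the return difference, which saturates quickly when that difference is large relative to the temperature.

```python
import math


def preference_probability(
    return_a: float, return_b: float, temperature: float = 1.0
) -> float:
    """Probability that A is preferred over B under a Boltzmann-rational model.

    Computes sigmoid((return_a - return_b) / temperature).
    `temperature` is a hypothetical knob for this sketch.
    """
    z = (return_a - return_b) / temperature
    # Numerically stable logistic: avoid exponentiating a large positive number.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


# Equal returns give a coin flip.
p_equal = preference_probability(1.0, 1.0)
# A return gap of 10 at temperature 1 is already close to deterministic.
p_sharp = preference_probability(110.0, 100.0)
# The same gap at temperature 10 leaves noticeable randomness.
p_soft = preference_probability(110.0, 100.0, temperature=10.0)
```

In this sketch, rescaling the returns or raising the temperature is what keeps the sampled preferences from collapsing to "always pick the higher return".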

bug

When running `examples/quickstart.py`, I get a `RuntimeError`: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for...
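This error generally means a module's parameters and its input tensors live on different devices. The snippet below is a minimal, generic PyTorch sketch of the usual remedy (moving both to one device); it is not taken from `examples/quickstart.py`, and the layer sizes and variable names are made up for illustration.

```python
import torch

# Pick one device and use it consistently; falls back to CPU without CUDA.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical stand-in for a policy network.
model = torch.nn.Linear(4, 2).to(device)

# Tensors are created on the CPU by default.
obs = torch.zeros(8, 4)

# Move the input to the model's device before the forward pass.
out = model(obs.to(device))
```

Passing `obs` directly to a model that sits on `cuda:0` would raise the device-mismatch error quoted above.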

bug

Is it possible to use collected trajectories (.npz) as expert demonstrations in a format compatible with SB3 IRL models? I made an env which takes user input to move...

enhancement

## Problem
If the observations for a given task are images and are stored using the `TrajectoryDatasetSequence` class, indexing is extremely slow. For instance, indexing one trajectory can take upwards...

enhancement

## Problem
In the last month our median pipeline runtime doubled, and we also spend a lot of CircleCI credits on failed pipelines.
## Solution
1. Make the pipeline...

enhancement

This adds some instructions on running the benchmarking suite. It is still missing the baseline benchmark values and instructions to update them, which I'll include in a later PR. I'm...

## Description
Add MCE IRL training script for #392

## Description
1. Add an environment wrapper to keep the original observation and the RGB version together for the interactive policy
2. Remove the RGB observation and its space in the bc...

## Problem
This issue builds on the work in #776 and serves as the last step for #701. For more detail, please see the discussion [here](https://github.com/HumanCompatibleAI/imitation/issues/701#issuecomment-1712411254)
## Solution
-...

enhancement