
Clean PyTorch implementations of imitation and reward learning algorithms

Results: 146 imitation issues

## Description Support for stable_baselines3-style callbacks in adversarial training. This feature was partly addressed in #626, but the author seems to have lost interest there. ## Testing Tests in...
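The feature request above can be sketched as a minimal callback hook invoked once per adversarial training round. This is a hypothetical illustration of the stable_baselines3 callback pattern applied to an adversarial loop; the names (`RoundCallback`, `on_round_end`, `adversarial_train`) are assumptions for the sketch, not imitation's or SB3's actual API:

```python
class RoundCallback:
    """Hypothetical SB3-style callback: collects state across rounds."""

    def __init__(self):
        self.rounds_seen = 0

    def on_round_end(self, round_idx: int) -> None:
        # A real SB3 BaseCallback would also expose on_step/on_training_end.
        self.rounds_seen += 1


def adversarial_train(n_rounds: int, callback: RoundCallback) -> None:
    """Stand-in for an adversarial trainer that fires the callback each round."""
    for i in range(n_rounds):
        # ... discriminator/generator updates would happen here ...
        callback.on_round_end(i)


cb = RoundCallback()
adversarial_train(3, cb)
```

The point of the pattern is that the trainer only knows the hook's interface, so users can log metrics, checkpoint, or stop early without modifying the training loop.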

## Description This PR updates the adversarial algorithm to train the discriminator between collecting the generator's rollouts and training the generator. This matches the reference implementation provided in...
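The ordering the PR describes can be sketched with stub functions: rollouts are collected first, then the discriminator is updated on them, then the generator is trained. The function names here are placeholders for illustration, not imitation's actual API:

```python
# Record the order of phases to make the loop structure explicit.
log = []


def collect_rollouts():
    """Stub: sample trajectories from the current generator policy."""
    log.append("rollout")


def train_discriminator():
    """Stub: update the discriminator on fresh rollouts vs. expert data."""
    log.append("disc")


def train_generator():
    """Stub: update the generator against the just-trained discriminator."""
    log.append("gen")


for _ in range(2):  # two adversarial rounds
    collect_rollouts()
    train_discriminator()   # discriminator sees the new rollouts first...
    train_generator()       # ...so the generator trains against an updated critic
```

Training the discriminator before the generator within each round means the generator's update uses reward signals from a discriminator that has already seen the latest rollouts.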

## Problem Robotic environments such as [SurRoL](https://github.com/med-air/SurRoL) and [Fetch](https://robotics.farama.org/envs/fetch/) use a dictionary observation space, with `observation`, `desired_goal`, and `achieved_goal` as keys. ## Query Is there a quick fix...
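One common workaround for algorithms that expect flat vector observations is to concatenate the dictionary's entries into a single array. The sketch below assumes the three keys named in the issue; in practice one would use Gymnasium's observation wrappers rather than this hand-rolled helper:

```python
import numpy as np


def flatten_dict_obs(obs: dict) -> np.ndarray:
    """Concatenate a goal-conditioned dict observation into one flat vector.

    Key names and order are assumptions taken from the issue above;
    a fixed order matters so the policy always sees the same layout.
    """
    keys = ("observation", "achieved_goal", "desired_goal")
    return np.concatenate(
        [np.asarray(obs[k], dtype=np.float64).ravel() for k in keys]
    )


# Example dict observation with shapes loosely resembling a Fetch task.
obs = {
    "observation": np.zeros(10),
    "achieved_goal": np.ones(3),
    "desired_goal": np.full(3, 2.0),
}
flat = flatten_dict_obs(obs)  # shape (16,)
```

Note this discards the dict structure entirely; goal-relabeling methods such as HER need the keys kept separate, so flattening is only appropriate when the downstream algorithm treats the observation as an opaque vector.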

enhancement

I am testing `imitation` with proprietary environments for the robot in our MuJoCo-based lab. For testing, I am generating Pygame environments based on Gym. I am creating user...
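A custom environment like the ones described above only needs to expose the Gym step/reset interface for rollout collection to work. The class below is a dependency-free sketch of that interface shape; a real environment would subclass `gym.Env` and declare `observation_space` and `action_space`:

```python
import numpy as np


class MinimalLabEnv:
    """Hypothetical stand-in for a custom Pygame/MuJoCo environment.

    Shows only the classic Gym API shape (reset -> obs,
    step -> (obs, reward, done, info)); everything else is omitted.
    """

    def __init__(self, horizon: int = 5):
        self.horizon = horizon
        self._steps = 0

    def reset(self):
        self._steps = 0
        return np.zeros(3, dtype=np.float32)

    def step(self, action):
        self._steps += 1
        obs = np.full(3, self._steps, dtype=np.float32)
        reward = 0.0
        done = self._steps >= self.horizon  # episode ends at the horizon
        return obs, reward, done, {}


# A rollout loop of the kind an imitation-learning data collector runs:
env = MinimalLabEnv()
obs = env.reset()
done = False
while not done:
    obs, reward, done, info = env.step(0)
```

Newer Gymnasium versions split `done` into `terminated`/`truncated` and have `reset` return `(obs, info)`, so the exact signatures depend on which API version the library expects.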

enhancement

## Problem Due to [this validation](https://github.com/HumanCompatibleAI/imitation/blob/5c85ebf02a591dad171946710d80617cfcca108e/src/imitation/data/types.py#L131), environments returning integer rewards will throw an exception, e.g. when I try to collect rollouts from an expert policy. This seems a bit overzealous...
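Until the validation is relaxed, one workaround is a thin wrapper that casts rewards to `float` before they reach the dtype check. This is a hypothetical sketch with a stub environment, not imitation's or Gym's actual wrapper classes:

```python
import numpy as np


class IntRewardEnv:
    """Stub environment that returns integer rewards, as in the issue."""

    def step(self, action):
        return np.zeros(2), 1, False, {}  # reward is a Python int


class FloatRewardWrapper:
    """Hypothetical wrapper casting integer rewards to float.

    A real implementation would subclass gym.RewardWrapper; here we
    just delegate step() and convert the reward on the way out.
    """

    def __init__(self, env):
        self.env = env

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        return obs, float(reward), done, info


env = FloatRewardWrapper(IntRewardEnv())
obs, reward, done, info = env.step(0)
```

Casting at the wrapper boundary keeps the underlying environment untouched while satisfying a floating-point dtype check downstream.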

enhancement

## Problem Currently, only synthetic preferences are supported. It would be great to support real human preferences. ## Solution Requirements: - record videos of trajectories - ideally, extensible so we...

enhancement

## Description See #711 ## Testing TODO: add notebook and experiment config that use this feature, and screenshots of behavior. (I've tested myself but not in a clean way.)

## Description Here are two scripts I used to check for type errors in documentation and notebooks. I don't know whether they are of use to anyone, so I figured...

Right now we use Sacred to run experiments/algorithms. This PR is about exploring whether Hydra would be a good option for running experiments and constructing/configuring the CLI interface of `imitation`....

I wouldn't mind seeing something discussing whether these trajectory objects can only be used for imitation algorithms, or whether they can also be used with stable_baselines3 or offline RL algorithms, and...