imitation

Clean PyTorch implementations of imitation and reward learning algorithms

Results 146 imitation issues

See https://github.com/HumanCompatibleAI/imitation/pull/657 - replace hardcoded paper results with a CSV of results that we can compare and update. Also, include instructions for updating the CSV, and a comparison between paper...

## Description This PR changes the adversarial algorithm such that, at any iteration, the rollouts are collected first, and then the discriminator is trained, followed by training the generator....
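The ordering described in this PR summary can be illustrated with a minimal sketch. It is not the `imitation` library's code: `train_adversarial` and its three callables are hypothetical names introduced here, and the assumption that the generator step takes no arguments is ours, since the truncated description does not say what it consumes.

```python
from typing import Callable, List


def train_adversarial(
    n_iterations: int,
    collect_rollouts: Callable[[], list],
    train_discriminator: Callable[[list], None],
    train_generator: Callable[[], None],
) -> None:
    """Run the phases in the order described: rollouts, discriminator, generator.

    All three callables are hypothetical placeholders, not imitation APIs.
    """
    for _ in range(n_iterations):
        rollouts = collect_rollouts()   # 1. gather fresh rollouts first
        train_discriminator(rollouts)   # 2. then update the discriminator on them
        train_generator()               # 3. finally update the generator


# Stub phases that only record the order in which they are called.
calls: List[str] = []
train_adversarial(
    n_iterations=2,
    collect_rollouts=lambda: calls.append("rollouts") or [],
    train_discriminator=lambda batch: calls.append("disc"),
    train_generator=lambda: calls.append("gen"),
)
```

Passing the phases in as callables keeps the sequencing separate from any particular learner, which makes the order easy to check with stubs like the ones above.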

I think we should refactor the way we handle demonstrations inside of `imitation`. Skimming over the code, it looks like we spend way too many LOC on supporting and converting...

enhancement

## Problem The discriminator in AIRL here is just a regular one, not corresponding to the one in Fu's paper which can deal with robust dynamics. ## Solution As stated...

enhancement

Hello 👋 This is a question, not a feature request - I hope that's alright. I understand that this repo doesn't support infinite horizon episodes. The gridworld environment I want...

enhancement

## Problem Hi, `imitation` is a great project! Currently, I am training the GAIL algorithm, and the learner network is PPO in SB3. I have questions about the training process...

enhancement

Our team [KABasalt](https://github.com/BASALT-2022-Karlsruhe) participated in last year's BASALT competition and we noticed that RLHP currently lacks support for human preferences. ## Problem: Only synchronous, synthetic preference gathering is supported by...

enhancement

## Problem The current DAgger implementation is split into a `DAggerTrainer` and a `SimpleDAggerTrainer`. The split is mostly arbitrary. Also, the dependency on BC is far too deep. ##...

enhancement

## Description This addresses [Issue #523](https://github.com/HumanCompatibleAI/imitation/issues#:~:text=Add%20support%20for%20saving%20videos%20of%20policies%20on%20a%20environment%20for%20evaluation%20during%20and%20after%20training) to automatically save videos during training time. This builds off of the following, earlier [PR](https://github.com/HumanCompatibleAI/imitation/pull/524/files#diff-cc891c802ce6c8a2e1fc96fc67e50e08a5e7f3158f6b35cd41d783b0744b26dd). Known Limitations: (1) Will not necessarily save a...

## Problem https://github.com/HumanCompatibleAI/imitation/pull/603#discussion_r1011673467 > I guess if we had code examples that are embedded directly from other files this would (a) solve the issue of testing docs separately, (b) solve...

enhancement