imitation
Clean PyTorch implementations of imitation and reward learning algorithms
See https://github.com/HumanCompatibleAI/imitation/pull/657 - replace hardcoded paper results with a CSV of results that we can compare and update. Also, include instructions for updating the CSV, and a comparison between paper...
## Description This PR changes the adversarial algorithm such that, at any iteration, the rollouts are collected first, then the discriminator is trained, followed by training the generator....
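The ordering described in this PR can be sketched as follows. This is only an illustration under stated assumptions: `run_iteration` and the three callbacks are hypothetical names, not part of the `imitation` API, and passing the same rollouts to the generator step is an assumption rather than something the PR text confirms.

```python
from typing import Any, Callable, List


def run_iteration(
    collect_rollouts: Callable[[], Any],
    train_discriminator: Callable[[Any], None],
    train_generator: Callable[[Any], None],
) -> None:
    """One adversarial iteration in the order described above (hypothetical helper)."""
    rollouts = collect_rollouts()   # 1. gather generator rollouts first
    train_discriminator(rollouts)   # 2. update the discriminator on them
    train_generator(rollouts)       # 3. then update the generator (assumed to reuse the rollouts)


# Stand-in callbacks that only record the order in which they are called.
calls: List[str] = []
run_iteration(
    collect_rollouts=lambda: calls.append("collect") or ["fake-rollout"],
    train_discriminator=lambda rollouts: calls.append("discriminator"),
    train_generator=lambda rollouts: calls.append("generator"),
)
print(calls)
```

Keeping the three steps as injected callbacks makes the ordering itself easy to unit-test without running any real training.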
I think we should refactor the way we handle demonstrations inside of `imitation`. Skimming over the code, it looks like we spend far too many LOC on supporting and converting...
## Problem The AIRL discriminator here is just a regular one; it does not correspond to the one in Fu's paper, which is designed to be robust to changes in dynamics. ## Solution As stated...
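For reference, the discriminator in Fu et al.'s AIRL paper has the form D = exp(f) / (exp(f) + π(a|s)), with f(s, a, s') = g(s) + γ·h(s') − h(s). A minimal sketch in plain Python, assuming scalar inputs; the function names and the default `gamma` are illustrative choices and are not taken from `imitation`:

```python
import math


def airl_f(g_s: float, h_s: float, h_next: float, gamma: float = 0.99) -> float:
    """Shaped reward term f(s, a, s') = g(s) + gamma * h(s') - h(s)."""
    return g_s + gamma * h_next - h_s


def airl_discriminator(
    g_s: float, h_s: float, h_next: float, log_pi: float, gamma: float = 0.99
) -> float:
    """D = exp(f) / (exp(f) + pi(a|s)), computed as sigmoid(f - log pi(a|s))."""
    logit = airl_f(g_s, h_s, h_next, gamma) - log_pi
    return 1.0 / (1.0 + math.exp(-logit))


# Example: zero reward and shaping terms, action probability 1 (log_pi = 0).
d = airl_discriminator(g_s=0.0, h_s=0.0, h_next=0.0, log_pi=0.0)
print(d)
```

Writing D as a sigmoid of `f - log_pi` is algebraically the same as the ratio form and avoids exponentiating `f` directly.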
Hello 👋 This is a question, not a feature request - I hope that's alright. I understand that this repo doesn't support infinite horizon episodes. The gridworld environment I want...
## Problem Hi, `imitation` is a great project! Currently, I am training the GAIL algorithm, and the learner network is PPO from SB3. I have questions about the training process...
Our team [KABasalt](https://github.com/BASALT-2022-Karlsruhe) participated in last year's BASALT competition and we noticed that RLHP currently lacks support for human preferences. ## Problem: Only synchronous, synthetic preference gathering is supported by...
## Problem The current DAgger implementation is split into a `DAggerTrainer` and a `SimpleDAggerTrainer`. The split is mostly arbitrary. Also, the dependency on BC is far too deep. ##...
## Description This addresses [Issue #523](https://github.com/HumanCompatibleAI/imitation/issues#:~:text=Add%20support%20for%20saving%20videos%20of%20policies%20on%20a%20environment%20for%20evaluation%20during%20and%20after%20training) to automatically save videos during training. This builds on an earlier [PR](https://github.com/HumanCompatibleAI/imitation/pull/524/files#diff-cc891c802ce6c8a2e1fc96fc67e50e08a5e7f3158f6b35cd41d783b0744b26dd). Known Limitations: (1) Will not necessarily save a...
## Problem https://github.com/HumanCompatibleAI/imitation/pull/603#discussion_r1011673467 > I guess if we had code examples that are embedded directly from other files this would (a) solve the issue of testing docs separately, (b) solve...