imitation
Clean PyTorch implementations of imitation and reward learning algorithms
(apologies for the bad branch name) Includes:
- A new class of CNN reward functions
- A reward function wrapper to ensure inputs are always channel-first
- Modifications to normalization...
## The current behavior
It appears that `test_run_example_notebooks` is sometimes timing out when executing during CI checks.
## Expected behavior
The tests should not be timing out. This should be...
Currently we have a GitHub workflow that uploads to Test PyPI on every commit, and to the real PyPI on each release. The idea of the Test PyPI automation is to...
### 🚀 Feature When the reward function is trained by the `CrossEntropyRewardTrainer` in `preference_comparisons.py`, it currently takes a batch of trajectory fragments and preferences, calculates its reward for each pair...
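As a rough illustration of the kind of loss described here, the sketch below computes a cross-entropy loss over pairs of trajectory fragments in plain Python. It assumes a Bradley-Terry style preference model over summed fragment rewards, which the text above does not spell out. The function names and the list-based data layout are hypothetical and are not the `CrossEntropyRewardTrainer` API.

```python
import math
from typing import Sequence, Tuple

# Hypothetical layout: a fragment is the list of per-step predicted rewards.
Fragment = Sequence[float]


def _sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def preference_cross_entropy(
    pairs: Sequence[Tuple[Fragment, Fragment]],
    preferences: Sequence[float],
    eps: float = 1e-12,
) -> float:
    """Mean cross-entropy between predicted and labelled preferences.

    preferences[i] is the labelled probability that the first fragment
    of pairs[i] is preferred (1.0, 0.0, or 0.5 for a tie).
    """
    if len(pairs) != len(preferences):
        raise ValueError("pairs and preferences must have the same length")
    if not pairs:
        raise ValueError("need at least one fragment pair")
    total = 0.0
    for (frag1, frag2), pref in zip(pairs, preferences):
        # Assumed model: P(frag1 preferred) = sigmoid(return1 - return2).
        prob1 = _sigmoid(sum(frag1) - sum(frag2))
        # Clip so that log() never sees exactly 0.
        prob1 = min(max(prob1, eps), 1.0 - eps)
        total -= pref * math.log(prob1) + (1.0 - pref) * math.log(1.0 - prob1)
    return total / len(pairs)
```

Two fragments with equal summed reward give a predicted preference of 0.5, so the loss for that pair is log 2 whichever way the label points.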
https://github.com/HumanCompatibleAI/imitation/pull/484 removes the import of `imitation.envs.examples` from `src/imitation/scripts/__init__.py` to work around https://github.com/sphinx-doc/sphinx/issues/9069
Some options:
1. Move this example env code out of the repo. It was never a great fit for...
This is a draft until https://github.com/DLR-RM/rl-baselines3-zoo/pull/257 is resolved.
## Description Add a basic developer guide to describe the implementation of the library.
## Description
Closes #523.
### Problem
- I personally find that logging videos during training is really useful as another dimension of explaining experiment results.
- Concretely, this issue advocates for...
Add support for saving videos of policies on an environment for evaluation during and after training
## Problem
- I personally find that logging videos during training is really useful as another dimension of explaining experiment results.
- Concretely, this issue advocates for adding support for saving...
## Description Kept the incremental version of `EMANorm` in `update_stats_incremental`. Implemented the exact algorithm for computing EMA without any for loops. The runtime of the function has decreased by a...
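To illustrate the general idea of replacing a per-sample EMA loop with a closed-form expression, here is a minimal sketch using NumPy. It covers only an exponential moving average of the mean with a fixed smoothing factor; it is not the library's `EMANorm` code, and the function names and the `alpha` parameter are assumptions made for this example.

```python
import numpy as np


def ema_loop(mean0, batch, alpha):
    """Reference: apply the EMA update one sample at a time."""
    mean = np.asarray(mean0, dtype=float)
    for x in np.asarray(batch, dtype=float):
        mean = (1.0 - alpha) * mean + alpha * x
    return mean


def ema_vectorized(mean0, batch, alpha):
    """Same result as ema_loop, computed without a Python loop.

    Unrolling the recurrence gives
        mean_n = (1 - alpha)^n * mean_0
                 + sum_i alpha * (1 - alpha)^(n - 1 - i) * x_i
    for samples x_0 ... x_{n-1}.
    """
    batch = np.asarray(batch, dtype=float)
    n = batch.shape[0]
    # Weight of sample i after all n updates; the newest sample gets alpha.
    weights = alpha * (1.0 - alpha) ** np.arange(n - 1, -1, -1)
    return (1.0 - alpha) ** n * np.asarray(mean0, dtype=float) + weights @ batch
```

For a batch of shape `(n, d)`, both functions return an array of shape `(d,)`, and the two agree up to floating-point error.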