imitation
Clean PyTorch implementations of imitation and reward learning algorithms
## Bug description My goal is to pre-train a policy with BC and then fine-tune it with RL, e.g., PPO. The problem is that I cannot find an example for...
Hello, I am working on GAIL and would like to visualize its convergence over iterations. This visualization will help in understanding the performance and stability of the algorithm. I am...
## Feature description Hi, what is the reason for doing the discriminator optimizer step in [`common.py`](https://github.com/HumanCompatibleAI/imitation/blob/master/src/imitation/algorithms/adversarial/common.py) outside of the batch loop (see the [line 372](https://github.com/HumanCompatibleAI/imitation/blob/e5ef18806c449ca47153b494a02471c5e2ae3a14/src/imitation/algorithms/adversarial/common.py#L372) with `self._disc_opt.step()` or the attached...
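The excerpt above does not say why the optimizer step sits outside the batch loop. One common pattern that produces this structure is gradient accumulation: gradients from several minibatches are summed and a single update is applied afterwards. The sketch below illustrates that idea in plain Python on a toy one-parameter least-squares model. It is an assumption about the general technique, not the library's code; the functions `grad`, `step_full_batch`, and `step_accumulated` and the data are hypothetical.

```python
# Toy model (hypothetical): loss_i = 0.5 * (w * x_i - y_i) ** 2


def grad(w, xs, ys):
    """Mean gradient of the loss with respect to w over the given samples."""
    return sum((w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def step_full_batch(w, xs, ys, lr):
    """One gradient step computed on the whole batch at once."""
    return w - lr * grad(w, xs, ys)


def step_accumulated(w, xs, ys, lr, minibatch_size):
    """Accumulate minibatch gradients, then take one step after the loop."""
    accumulated = 0.0
    for start in range(0, len(xs), minibatch_size):
        mb_x = xs[start:start + minibatch_size]
        mb_y = ys[start:start + minibatch_size]
        # Rescale each minibatch mean so the sum equals the full-batch mean.
        accumulated += grad(w, mb_x, mb_y) * len(mb_x) / len(xs)
    # Single parameter update, outside the minibatch loop.
    return w - lr * accumulated


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
w_full = step_full_batch(0.0, xs, ys, lr=0.1)
w_acc = step_accumulated(0.0, xs, ys, lr=0.1, minibatch_size=2)
```

Under these assumptions both functions produce the same updated parameter, while the accumulated version only ever processes `minibatch_size` samples at a time, which is the usual motivation for stepping once after the loop.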
## Bug description I am new to the imitation learning package. I want to train an agent using GAIL. However, I keep getting...
## Bug description AssertionError: Tuple observations are not supported. ## Steps to reproduce import...
## Description Fix typos in `reward_network.rst`. ## Testing NA