imitation

Clean PyTorch implementations of imitation and reward learning algorithms

Results 146 imitation issues

Tune hyperparameters / match implementation details / fix bugs until we replicate the performance of reference implementations of algorithms. I'm not concerned about an exact match -- if we do...

We commented out some type annotations to work around https://github.com/google/pytype/issues/1108 in https://github.com/HumanCompatibleAI/imitation/pull/393

The maximum causal entropy IRL algorithm is implemented in https://github.com/HumanCompatibleAI/imitation/blob/master/src/imitation/algorithms/mce_irl.py, but there is currently no training script for it in https://github.com/HumanCompatibleAI/imitation/tree/master/src/imitation/scripts.

Currently only state-based rewards are supported. Ideally we'd allow state-action based rewards as well. This would be easy from the RewardNet side, but would also require support to calculate state-action...

enhancement
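
To make the distinction in the state-based versus state-action reward issue above concrete, here is a minimal sketch of the two reward shapes in a tabular setting. It is an illustration only, not code from the repository: the array sizes, the `policy` table, and all variable names are assumptions chosen for the example.

```python
import numpy as np

n_states, n_actions = 3, 2

# State-based reward: one value per state, shape (n_states,).
state_reward = np.array([0.0, 1.0, -1.0])

# Lifting it to a state-action table of shape (n_states, n_actions)
# simply repeats each state's reward across every action.
state_action_reward = np.broadcast_to(
    state_reward[:, None], (n_states, n_actions)
)

# A hypothetical stochastic policy: each row is a distribution over actions.
policy = np.array([[0.5, 0.5], [1.0, 0.0], [0.2, 0.8]])

# Expected one-step reward per state under the policy.
expected_reward = (policy * state_action_reward).sum(axis=1)
print(expected_reward)
```

Because the lifted table is constant across actions, the expected reward per state equals the original state reward. A genuinely state-action reward would let the columns differ.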

Currently we don't have any CLI script for behavioral cloning or the density baseline. I envisage this codebase as being particularly useful in being able to rapidly benchmark against a...

enhancement

At the moment, GAIL and BC don't interoperate well with SB3 in environments with image-based observation spaces. The main problem is the channels axis: many environments put channels last, but...
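
The preview is cut off, so the exact failure is not stated here. As a general illustration of the channels-axis mismatch, PyTorch convolution layers take channels-first input, so channels-last images need their axes reordered first. The helper below is a minimal sketch; `to_channels_first` is a hypothetical name, not a function from imitation or SB3.

```python
import numpy as np


def to_channels_first(obs: np.ndarray) -> np.ndarray:
    """Move a trailing channels axis in front of height and width.

    Accepts a single image (H, W, C) or a batch of images (N, H, W, C).
    """
    if obs.ndim == 3:
        return np.transpose(obs, (2, 0, 1))
    if obs.ndim == 4:
        return np.transpose(obs, (0, 3, 1, 2))
    raise ValueError(f"expected 3 or 4 dimensions, got {obs.ndim}")


# Example: a batch of eight 84x84 RGB frames stored channels-last.
batch = np.zeros((8, 84, 84, 3), dtype=np.uint8)
converted = to_channels_first(batch)
print(converted.shape)
```

The transpose returns a view rather than a copy, so call `np.ascontiguousarray` on the result if downstream code needs contiguous memory.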

h/t @qxcv `AdversarialTrainer.train()` will repeatedly call `PPO.learn(total_timesteps=gen_batch_size, reset_num_timesteps=False)` where `gen_batch_size` is usually a small number compared to conventional RL training. Whether or not `reset_num_timesteps=False`, `PPO` doesn't know the actual number...
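
The preview is truncated, so the exact concern is not spelled out here. One way to picture the general problem is the toy simulation below, which assumes the learner derives its schedule progress from the budget of the current call only. It does not call SB3 and does not claim to reproduce SB3's internal bookkeeping; `simulate_progress` and its arguments are hypothetical names.

```python
from typing import List, Tuple


def simulate_progress(
    gen_batch_size: int, n_rounds: int
) -> List[Tuple[int, float, float]]:
    """Return (global_step, per_call_progress, whole_run_progress) per step.

    Toy model only: each "call" stands in for one short learn() invocation
    that knows its own budget but not the budget of the whole run.
    """
    run_budget = gen_batch_size * n_rounds
    rows = []
    global_step = 0
    for _ in range(n_rounds):
        for step_in_call in range(1, gen_batch_size + 1):
            global_step += 1
            # Remaining progress as seen from inside a single short call.
            per_call = 1.0 - step_in_call / gen_batch_size
            # Remaining progress relative to the full training run.
            whole_run = 1.0 - global_step / run_budget
            rows.append((global_step, per_call, whole_run))
    return rows


rows = simulate_progress(gen_batch_size=4, n_rounds=3)
for global_step, per_call, whole_run in rows:
    print(f"{global_step:2d}  per-call={per_call:.2f}  whole-run={whole_run:.2f}")
```

In this toy model the per-call value drops to zero at the end of every short call, while the whole-run value decays once over all rounds. Any schedule keyed to the per-call value would therefore restart each round.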

Currently `RewardVecEnvWrapper` replaces the reward directly, and internally keeps track of episode return that it logs using the `log_callback`. However, we often apply subsequent wrappers such as `VecNormalize` that change...
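
The interaction described above can be sketched with a toy model: one component replaces rewards and logs episode returns, and a later component rescales the rewards it passes on. This is not imitation's or SB3's code. `ToyRewardReplacer` and `scale_rewards` are hypothetical stand-ins, and the constant scale factor is a simplification of what a normalizing wrapper would do.

```python
from typing import Callable, List, Sequence


class ToyRewardReplacer:
    """Replaces rewards with reward_fn(obs) and logs per-episode returns."""

    def __init__(
        self,
        reward_fn: Callable[[float], float],
        n_envs: int,
        log_callback: Callable[[float], None],
    ) -> None:
        self.reward_fn = reward_fn
        self.returns = [0.0] * n_envs
        self.log_callback = log_callback

    def step(self, observations: Sequence[float], dones: Sequence[bool]) -> List[float]:
        rewards = [self.reward_fn(obs) for obs in observations]
        for i, (reward, done) in enumerate(zip(rewards, dones)):
            self.returns[i] += reward
            if done:
                # Log the return measured at this wrapper, then reset it.
                self.log_callback(self.returns[i])
                self.returns[i] = 0.0
        return rewards


def scale_rewards(rewards: Sequence[float], scale: float) -> List[float]:
    """Stand-in for a later wrapper that changes the rewards it passes on."""
    return [reward * scale for reward in rewards]


logged: List[float] = []
wrapper = ToyRewardReplacer(lambda obs: 2.0 * obs, n_envs=1, log_callback=logged.append)
seen_by_learner = 0.0
for obs, done in [(1.0, False), (2.0, False), (3.0, True)]:
    rewards = wrapper.step([obs], [done])
    seen_by_learner += scale_rewards(rewards, 0.5)[0]

print(logged, seen_by_learner)
```

The logged return is computed before the rescaling step, so it differs from the total reward the learner actually receives.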

## Description
Fixes #560.
## Testing
Ran all tests. Results: 3543 passed, 595 skipped, 4234 warnings.

## Bug description
Attempting to load SB3 models from Hugging Face in `serialize.py` often raises a `FileExistsError` that tells us "Outdated policy format: we do not support restoring normalization statistics from...

bug