imitation
Image-based IL example
At the moment, all of our test environments have tabular observation spaces. It would be nice to include an example with a more complex observation space, like stacked images in Atari. We can put this in the documentation, or in an example notebook.
(I anticipate doing this as part of the project with Cynthia, Scott, et al.)
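For readers unfamiliar with the "stacked images" observation space mentioned above, here is a minimal standard-library sketch of frame stacking: the observation is the last k frames rather than a single frame. The names `FrameStack`, `make_frame`, and `shape` are hypothetical helpers invented for illustration and are not part of imitation's API; a real setup would more likely use NumPy arrays and an existing wrapper such as Stable-Baselines3's `VecFrameStack`.

```python
from collections import deque


class FrameStack:
    """Hypothetical helper: keeps the last `k` frames as one stacked observation."""

    def __init__(self, k):
        self.k = k
        self.frames = deque(maxlen=k)  # oldest frame is dropped automatically

    def reset(self, first_frame):
        # Pad the buffer with the first frame so the stack always holds k frames.
        self.frames.clear()
        for _ in range(self.k):
            self.frames.append(first_frame)
        return self.observation()

    def step(self, new_frame):
        # Append the newest frame; the deque evicts the oldest one.
        self.frames.append(new_frame)
        return self.observation()

    def observation(self):
        # Ordered oldest to newest.
        return list(self.frames)


def make_frame(value, height=3, width=5):
    """Hypothetical stand-in for a grayscale image: a height x width nested list."""
    return [[value] * width for _ in range(height)]


def shape(obs):
    """Return (k, height, width) for a stack of 2-D frames stored as nested lists."""
    return (len(obs), len(obs[0]), len(obs[0][0]))


stack = FrameStack(k=4)
obs = stack.reset(make_frame(0))
for t in range(1, 3):
    obs = stack.step(make_frame(t))

print(shape(obs))                          # (4, 3, 5)
print([frame[0][0] for frame in obs])      # [0, 0, 1, 2]
```

The point of the sketch is only that the observation gains a leading "stack" dimension, which is why a tabular-observation policy cannot be reused unchanged and an image-aware (for example, convolutional) policy is typically needed.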
This is done for BC, albeit in the py3.6 branch: https://colab.research.google.com/github/qxcv/magical/blob/pyglet1.5/demo-notebook.ipynb
Will probably have GAIL working with image-based observations this week.
Would be resolved by https://github.com/HumanCompatibleAI/imitation/pull/221
Does the current version support image observation spaces for BC and GAIL?
Not on master, but we do have it on the image-env-changes branch. See this code in another repo for a BC example: https://github.com/HumanCompatibleAI/eirli/blob/master/src/il_representations/scripts/il_train.py#L281-L313 (although be warned that it's more complex than usual because it interfaces with the dataset/environment abstractions that were needed for a paper).
I want to use GAIL to train on an Atari env. Can I do this at present? I can't follow this issue and I'm a little lost. Could anybody help?
Nice examples, but they seem to be implemented against an old version of the imitation library. Are there more up-to-date examples similar to the Colab one? I tried to follow the steps, but some parameters have changed, such as policy_class and policy_kwargs. I'd really appreciate more advanced examples for GAIL and BC.
I don't think we have any examples beyond those in the examples/ folder, although we'd be happy to see a PR adding these if you wanted to port qxcv's notebook.
I haven't touched this code in a while, so I'm not sure what would have to change in the example above. That said, this fresh PR contains a helpful example notebook that trains a preference comparison reward model on an image-based task. You might be able to adapt it to GAIL or BC, since most of the components are the same, and I think the interfaces are similar (if you do, consider sharing a Colab so we can use it as an example).
(edit: posted this before seeing Adam's comment above. I agree that a PR for a notebook port would be useful. The existing examples/ folder has BC and GAIL examples, so you might be able to adapt them without much effort by copy-pasting from the notebook in the PR linked above.)