imitation Support for Dictionary Observation Space

Problem

Robotic environments such as SurRoL and Fetch use a dictionary observation space with the following keys:

observation
desired_goal
achieved_goal

Query

Is there a quick fix to train an agent using BC in the current release for these kinds of environments?

Feb 14 '23 05:02 damaruga

It is not supported right now: https://github.com/HumanCompatibleAI/imitation/issues/651#issuecomment-1420219620

We are currently reworking the way data is handled in general. @AdamGleave do you think there is a quick fix? Should we aim to support this in the long term?

Maybe you can use the space utils to flatten/unflatten the space? https://gymnasium.farama.org/api/spaces/utils/

Feb 15 '23 18:02 ernestum

Hello! I have a similar issue, as I need to use dictionary observations as input. If one were to implement this, where would be a good starting point?

Mar 20 '23 06:03 tudorjnu

As long as you don't plan on treating different parts of the observation space differently (e.g. applying a CNN to image data but taking in accelerometer data directly), you should be fine with a gym.wrappers.FlattenObservation wrapper around your environment.

Mar 20 '23 09:03 ernestum

Personally that is roughly what I need as I have a custom environment generating different types of observations (i.e. image, joint pos, etc.).

On a tangential note, is it possible to generate transitions using an environment (i.e. a dict-based env), store only the image, for example, and then use imitation to learn from those transitions (image -> action)?

I have been thinking of creating a custom rollout wrapper or something along those lines, so that the trajectories can be filtered when the expert generates them. What do you think?

I asked here as it seems to be related to the issue. Thank you!

Mar 20 '23 09:03 tudorjnu

PR #785 partially addresses this. Once it's merged, dictionary observations should be supported in:

Core functionality

[x] Collecting rollouts
[ ] Saving / writing trajectories to disk
[ ] Buffers

Algorithms:

[x] Behavioral Cloning
[ ] DAgger
[x] Density-based reward modelling
[ ] MCEIRL
[ ] Adversarial AIRL / GAIL
[ ] SQIL
[ ] Preference Comparisons

Sep 16 '23 00:09 NixGD
