Metaworld Users are confused about goal conditioning

Open krzentner opened this issue 1 year ago • 3 comments

Meta-World was designed to be both a Meta-RL and a Multi-Task RL benchmark. One awkward consequence is that goal conditioning in Meta-World is handled in a complicated way. Specifically, every environment in Meta-World is goal conditioned, in every benchmark. However, goals are hidden in Meta-RL and visible in Multi-Task RL. Hiding the goal is intended to make "goal inference" part of the Meta-RL objective, which allows ML1 to be used much like older Meta-RL benchmark tasks (such as HalfCheetahVelEnv or Ant Direction). However, Meta-RL requires that each task be a fully-observable MDP. This requires each "goal" to be considered a different task, and the API reflects this (an ML1 benchmark object contains 50 train task objects; ML10 contains 500 train task objects).
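To make the task counting concrete, here is a toy sketch of the "one goal = one task" idea. It does not import or reproduce Meta-World: `ToyTask`, `make_train_tasks`, and `observe` are hypothetical names, the goals are random placeholders, and zeroing the goal out of the observation is only an assumption about how "hidden" could be implemented.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyTask:
    """Hypothetical task record: one environment paired with one goal."""
    env_name: str
    goal: tuple  # placeholder 3-D goal position


def make_train_tasks(env_names, goals_per_env=50, seed=0):
    """Enumerate one task per (environment, goal) pair."""
    rng = random.Random(seed)
    return [
        ToyTask(name, tuple(rng.uniform(-1.0, 1.0) for _ in range(3)))
        for name in env_names
        for _ in range(goals_per_env)
    ]


def observe(state, task, goal_visible):
    """Append the goal to the observation, or zeros when it is hidden.

    goal_visible=True mimics the Multi-Task RL setting; False mimics the
    Meta-RL setting, where the agent has to infer the goal.
    """
    goal = task.goal if goal_visible else (0.0, 0.0, 0.0)
    return tuple(state) + goal


# One environment with 50 goals gives 50 tasks (ML1-like);
# ten environments with 50 goals each give 500 tasks (ML10-like).
ml1_tasks = make_train_tasks(["env-0"])
ml10_tasks = make_train_tasks([f"env-{i}" for i in range(10)])
print(len(ml1_tasks), len(ml10_tasks))  # 50 500
```

Under this view, the Multi-Task reading "10 tasks with 50 goals each" and the Meta-RL reading "500 tasks" describe the same list of (environment, goal) pairs; only the goal's visibility in the observation differs.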

However, Meta-World uses the same API for both Meta-RL and Multi-Task RL. Consequently, with the Benchmark API, the goal is changed by passing one of the task objects to the set_task function, in both settings. The problem is that many users don't use the Benchmark API, and don't set the seeded_rand_vec flag either (which randomizes the goals on reset using the seed passed to the environment on init). This leads users to believe the environments are not goal conditioned, even though they definitely are supposed to be (50 goals per task, set by the seed). I don't know how many inconsistent results have been published because of this confusion, but there are at least a few.
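The pitfall can be illustrated with a toy stand-in environment. This is a sketch only: `ToyGoalEnv`, its `randomize_goal_on_reset` argument, and the default goal value are hypothetical and are not the Meta-World classes or the real seeded_rand_vec implementation. Letting `set_task` switch off randomization is likewise a choice made for this toy.

```python
import random


class ToyGoalEnv:
    """Hypothetical goal-conditioned environment, not the Meta-World one."""

    DEFAULT_GOAL = (0.0, 0.6, 0.2)  # placeholder value

    def __init__(self, seed=None, randomize_goal_on_reset=False):
        self._rng = random.Random(seed)
        self._randomize = randomize_goal_on_reset
        self.goal = self.DEFAULT_GOAL

    def set_task(self, goal):
        """Fix the goal explicitly, as a benchmark-style API would."""
        self.goal = tuple(goal)
        self._randomize = False  # toy choice: an explicit task wins

    def reset(self):
        """Resample the goal only if randomization was requested."""
        if self._randomize:
            self.goal = tuple(
                round(self._rng.uniform(-0.5, 0.5), 3) for _ in range(3)
            )
        return self.goal


# Constructed directly, with no task and no flag: every reset yields the
# same goal, so the environment looks like it has no goal conditioning.
naive = ToyGoalEnv()
naive_goals = {naive.reset() for _ in range(5)}

# With the flag set, goals vary across resets and are reproducible from the seed.
seeded = ToyGoalEnv(seed=0, randomize_goal_on_reset=True)
seeded_goals = [seeded.reset() for _ in range(5)]

# With an explicit task, the goal is whatever the task says.
tasked = ToyGoalEnv()
tasked.set_task((0.1, 0.2, 0.3))

print(len(naive_goals), len(set(seeded_goals)), tasked.reset())
```

A user who only ever takes the first path sees a single fixed goal and has no indication that other goals exist.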

TL;DR: Meta-RL requires ML10 to have 500 tasks, while Multi-Task RL wants MT10 to have 10 tasks with 50 goals each. This confuses users.

We should make the documentation and API clearer and harder to misuse. A good first step would be renaming the seeded_rand_vec flag and setting it to True by default in all of the environment constructors when not using the Benchmark API. Unfortunately, this is a breaking change, and we haven't published any versioned package yet, so we should make sure we have published at least one version of the package before we do this.

Mar 22 '23 01:03 krzentner