Unable to test the learned model because custom_goal_sampler is not loaded

Open ZiwenZhuang opened this issue 5 years ago • 5 comments

Hi, when I ran run_goal_conditioned_policy.py, an error occurred saying custom_goal_sampler is None. I checked and found that VAEWrappedEnv.__getstate__ ignores its _custom_goal_sampler, so the sampler is not restored when the object is loaded from the .pkl file.

Is there a way to make goal sampling work when testing the trained model?

ZiwenZhuang avatar Jul 23 '19 20:07 ZiwenZhuang

Good catch. You'd have to set the custom_goal_sampler after creating the environment. If you search for how it's set during training, you'll see how to do it (look in the launcher.py file). A PR to add this to the run_goal_conditioned_policy.py file would be welcomed!
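A minimal sketch of the idea, using a toy stand-in class rather than the real environment: ToyWrappedEnv and zero_goal_sampler are hypothetical names, and the attribute you need to set on the real VAEWrappedEnv should be checked against launcher.py. It only shows how a __getstate__ that skips the sampler leaves it as None after unpickling, and how re-attaching it afterwards restores goal sampling.

```python
import pickle

import numpy as np


class ToyWrappedEnv:
    """Hypothetical stand-in for an env wrapper that does not pickle its sampler."""

    def __init__(self, latent_dim):
        self.latent_dim = latent_dim
        self._custom_goal_sampler = None

    def __getstate__(self):
        # Mimic the behavior described in this thread: the sampler is not saved.
        state = self.__dict__.copy()
        state['_custom_goal_sampler'] = None
        return state

    def sample_goals(self, batch_size):
        if self._custom_goal_sampler is None:
            raise RuntimeError("custom_goal_sampler is None")
        return self._custom_goal_sampler(batch_size)


def zero_goal_sampler(batch_size):
    # Placeholder sampler: every goal is the zero vector in a 4-dim latent space.
    return {'desired_goals': np.zeros((batch_size, 4))}


# "Training": the sampler is attached, then the env is saved.
env = ToyWrappedEnv(latent_dim=4)
env._custom_goal_sampler = zero_goal_sampler
loaded_env = pickle.loads(pickle.dumps(env))

# After loading, the sampler is gone and must be set again by hand.
sampler_after_load = loaded_env._custom_goal_sampler
loaded_env._custom_goal_sampler = zero_goal_sampler
goals = loaded_env.sample_goals(3)
```

In the real script, the equivalent step would go right after the environment is unpickled and before any rollouts are run.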

vitchyr avatar Jul 23 '19 20:07 vitchyr

Something confuses me. During training, the VAEWrappedEnv object's __goal_sampling_mode never seems to be set to 'custom_goal_sampler'. Why does it suddenly change to 'custom_goal_sampler' when loading?

BTW, I'm running using sawyer_door.py.

Edit: It also seems that the sampler uses a function from a replay_buffer, which is only available after enough rollouts have been collected. This creates a chicken-and-egg problem. I'm also not sure I have fully understood the repository's control flow.

ZiwenZhuang avatar Jul 23 '19 21:07 ZiwenZhuang

Yes, so in that experiment, during autonomous exploration, we sample goals from the replay buffer since the policy doesn't have access to any oracle sampler. However, for testing you might want to have some "oracle goal sampler" and you can implement that using the custom goal sampler.

vitchyr avatar Jul 24 '19 15:07 vitchyr

So, there are some 'software engineering' problems. If I implement an "oracle goal sampler", what output is required to match the repository's interface?

ZiwenZhuang avatar Jul 27 '19 02:07 ZiwenZhuang

Good question! Yeah, this part of the code is a bit "untyped"/undocumented. Based on VAEWrappedEnv.sample_goals, the signature should be

def goal_sampler_function(batch_size):
  return dict_of_goals

where dict_of_goals is something like:

{
  'desired_goals': NUMPY_ARRAY,  # shape (batch_size x latent_dim)
  'other_desired_goal': NUMPY_ARRAY,  # shape (batch_size x feature_dim)
}
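As a rough sketch of an "oracle goal sampler" matching that signature: the factory make_oracle_goal_sampler, the goal bounds, and encode_fn are all hypothetical, and the fixed linear projection below is only a placeholder for whatever encoder maps goals into latent space. The dictionary keys are the ones from the example above; the keys your environment actually expects may differ.

```python
import numpy as np


def make_oracle_goal_sampler(goal_low, goal_high, encode_fn, seed=0):
    """Build a goal_sampler_function(batch_size) -> dict_of_goals (hypothetical helper)."""
    goal_low = np.asarray(goal_low, dtype=np.float64)
    goal_high = np.asarray(goal_high, dtype=np.float64)
    rng = np.random.default_rng(seed)

    def goal_sampler_function(batch_size):
        # Draw ground-truth goals uniformly from the box [goal_low, goal_high].
        feature_goals = rng.uniform(
            goal_low, goal_high, size=(batch_size, goal_low.size)
        )
        # Map them to latent space with the (assumed) encoder.
        latent_goals = encode_fn(feature_goals)
        return {
            'desired_goals': latent_goals,        # (batch_size, latent_dim)
            'other_desired_goal': feature_goals,  # (batch_size, feature_dim)
        }

    return goal_sampler_function


# Placeholder encoder: a fixed linear map from 3 features to 2 latent dims.
projection = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
sampler = make_oracle_goal_sampler(
    goal_low=[-1.0, -1.0, 0.0],
    goal_high=[1.0, 1.0, 2.0],
    encode_fn=lambda x: x @ projection,
)
goals = sampler(5)
```

The returned function takes only batch_size, so it can be assigned as the custom goal sampler once the environment has been loaded.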

vitchyr avatar Aug 06 '19 17:08 vitchyr