
Humanoid_CMU stand, walk, run with 2020 parameter values

Open dyth opened this issue 2 years ago • 4 comments

How could I use the 2020 humanoid_CMU torques, kp and damping values for the Humanoid_CMU stand, walk and run tasks?

Is there any way to do this that wouldn't require me manually changing the parameters in humanoid_CMU.xml?

dyth avatar Aug 15 '22 18:08 dyth

Sorry for the late response. Can you explain why you want to do this? The 2020 version of the CMU Humanoid is only intended for use with the motion capture dataset for the Catch & Carry SIGGRAPH paper.

saran-t avatar Sep 05 '22 11:09 saran-t

I was hoping to do RL from scratch on a 56D humanoid from states. I saw that the 2019 parameters weren't intended for training from scratch, so I thought I'd use the 2020 parameters instead. My understanding was that the 2020 version was used for RL from scratch on humanoid-gaps and humanoid-parkour with V-MPO (Figure 4b, page 8) and Sampled MuZero (Figure 6, page 8). The stand, walk and run tasks seemed simpler than the gaps and parkour tasks, so I was hoping to test my current algorithm on those first.

dyth avatar Sep 05 '22 20:09 dyth

@yuvaltassa's comment about the difficulty of training the CMU humanoid using RL from scratch applies to all variations of the model. The point is that even if you can get high RL reward by training from scratch with this model, you'd probably end up with strange-looking gaits (if you can consider them gaits at all).

I'd say if you're interested in training stand/walk/run from state then just use the stock dm_control.suite task. The main difference, however, is that the "v2019" and "v2020" CMU humanoids expose rescaled actuators, so that the agent sends control signals in the range [-1, 1] and the walker automatically scales them up to the full actuation range. In the Control Suite, this isn't done for you, so you might need to put an action preprocessor in between your agent and the environment.

saran-t avatar Sep 05 '22 21:09 saran-t
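
For readers following along, the rescaling described in the comment above can be sketched roughly as follows. This is only an illustration of a linear map from [-1, 1] to per-actuator bounds, not the walker's actual implementation. The function name `rescale_action` and the explicit clipping step are assumptions made for this example. With a Control Suite environment, the bounds would presumably come from `env.action_spec().minimum` and `env.action_spec().maximum`. dm_control also appears to ship a ready-made wrapper for this in `dm_control.suite.wrappers.action_scale`, which is worth checking before writing your own.

```python
def rescale_action(action, low, high):
    """Linearly map each component of `action` from [-1, 1] to [low[i], high[i]].

    Hypothetical helper for illustration only; `low` and `high` are the
    per-actuator control bounds reported by the environment.
    """
    if not (len(action) == len(low) == len(high)):
        raise ValueError("action, low and high must have the same length")
    scaled = []
    for a, lo, hi in zip(action, low, high):
        # Assumption: out-of-range agent outputs are clipped to [-1, 1] first.
        a = max(-1.0, min(1.0, a))
        # -1 maps to lo, 0 maps to the midpoint, +1 maps to hi.
        scaled.append(lo + 0.5 * (a + 1.0) * (hi - lo))
    return scaled


# Example with made-up bounds of [-2, 4] on three actuators.
print(rescale_action([-1.0, 0.0, 1.0], [-2.0, -2.0, -2.0], [4.0, 4.0, 4.0]))
```

The agent's output would be passed through such a function before each `env.step(...)` call, so the agent itself only ever sees a [-1, 1] action space.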

Thanks for the information -- is this action preprocessor different from action normalization (i.e. just linearly rescaling the action space to [-1, 1])?

dyth avatar Sep 06 '22 13:09 dyth