legged_gym
how does the privileged observation work?
Thanks for your great contribution!
I notice that you use the privileged observations as critic observations for asymmetric training in PPO, but you haven't mentioned this in the paper. Could you please explain this part more clearly?
Also, I notice that in other works by your team the privileged observations are used for distillation, where they can be reconstructed in the student policy. Are the two kinds of privileged observations the same? If so, how does it work?
Hi,
The privileged observations feature is implemented, but we are not using it in the paper. These privileged observations are not used in a teacher-student distillation. Instead, they are used in asymmetric actor-critic training, where the critic receives more information than the actor. This allows giving the critic information that won't be available on a real robot. The teacher-student distillation is a second step that has to be done after the RL training. It is not implemented in this code.
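To illustrate the asymmetric setup, here is a minimal sketch in plain Python. It is not the code from this repository: the class name, the dimensions, the linear "networks", and the choice to build the privileged vector as the actor's observation plus two simulation-only values are all assumptions made for illustration.

```python
import random

# Hypothetical dimensions, chosen only for this sketch.
NUM_OBS = 4             # what the real robot could measure
NUM_PRIVILEGED_OBS = 6  # the same values plus 2 simulation-only quantities
NUM_ACTIONS = 2


def make_linear(in_dim, out_dim, rng):
    """Return a random weight matrix: out_dim rows of in_dim weights."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
            for _ in range(out_dim)]


def apply_linear(weights, x):
    """Multiply the weight matrix by the input vector."""
    if len(x) != len(weights[0]):
        raise ValueError(f"expected {len(weights[0])} inputs, got {len(x)}")
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


class AsymmetricActorCritic:
    """Toy actor-critic whose two heads read different inputs."""

    def __init__(self, num_obs, num_privileged_obs, num_actions, seed=0):
        rng = random.Random(seed)
        # The actor is sized for the deployable observations only.
        self.actor = make_linear(num_obs, num_actions, rng)
        # The critic is sized for the larger privileged vector.
        self.critic = make_linear(num_privileged_obs, 1, rng)

    def act(self, obs):
        """Compute actions; this is the only part needed on the robot."""
        return apply_linear(self.actor, obs)

    def evaluate(self, privileged_obs):
        """Compute the value estimate; only needed during training."""
        return apply_linear(self.critic, privileged_obs)[0]


rng = random.Random(1)
obs = [rng.uniform(-1.0, 1.0) for _ in range(NUM_OBS)]
# Stand-ins for quantities that only the simulator knows.
sim_only = [rng.uniform(-1.0, 1.0) for _ in range(NUM_PRIVILEGED_OBS - NUM_OBS)]
privileged_obs = obs + sim_only

policy = AsymmetricActorCritic(NUM_OBS, NUM_PRIVILEGED_OBS, NUM_ACTIONS)
actions = policy.act(obs)                # actor: regular observations
value = policy.evaluate(privileged_obs)  # critic: privileged observations
```

Since only `act` is needed at deployment, the extra critic inputs never have to be measured on hardware, and no distillation step is involved.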
Hopefully, this clarifies the distinction.