ManiSkill-Learn
ManiSkill-Learn copied to clipboard
Action space meanings for LiftCube
What are the physical meanings of each of the 7 action outputs in the LiftCube example you all gave here: https://colab.research.google.com/github/haosulab/ManiSkill2/blob/main/examples/tutorials/3_imitation_learning.ipynb ?
I did find some info on the arm controllers here: https://haosulab.github.io/ManiSkill2/concepts/controllers.html but I am still confused and would appreciate clarification on the action space. Thanks!
Hi, I'd recommend using ManiSkill2-Learn for ManiSkill2 experiments. The wrappers have been updated to accommodate ManiSkill2.
As for your question, the action space is
>>> env=gym.make('PickCube-v0', control_mode='pd_ee_delta_pose')
>>> env.agent.controller.controllers
OrderedDict([('arm', <mani_skill2.agents.controllers.pd_ee_pose.PDEEPoseController object at 0x7f1469869d60>), ('gripper', <mani_skill2.agents.controllers.pd_joint_pos.PDJointPosMimicController object at 0x7f1469869d90>)])
>>> env.agent.controller.controllers['arm'].action_space
Box(-1.0, 1.0, (6,), float32)
>>> env.agent.controller.controllers['gripper'].action_space
Box(-1.0, 1.0, (1,), float32)
i.e., 6d ee delta pose (3d delta position + 3d delta axis-angle rotation) + 1d gripper
Thanks for your reply. Please let me know if this interpretation is correct: 3d delta position = displacement of the joint in 3d space using units of, say centimeters 3d delta axis-angle rotation = rotation angles of the joint, scaled from 0 to 360 degrees to (-1,1) 1d gripper = some kind of force or tension measured in newtons that tells the gripper how hard to press? again scaled to (-1,1)
3d delta position = normalized displacement of the joint in 3d space. Normalization is in controller config, default (-1,1)->(-0.1m, 0.1m) 3d delta axis-angle rotation = delta rotation in axis-angle format (that is, r/||r||_2 = rotation axis, ||r|| * scaling = rotation degree, scaling is 0.1 by default in controller config) 1d gripper = open-close of gripper position. -1 = close; 1 = open; 0 = middle between open and close