
About generating a dataset and evaluating in the environment

Open · AaronChuh opened this issue 1 year ago

Hi, thanks for this excellent work!

I want to generate some data to train an imitation learning model and evaluate it in this environment. I read the code and noticed there is a script named dataset_generator.py that can generate data. I gave it a try and it works. However, I'm a little confused about some of the configurations.

First, the action mode is set to JointVelocity by default, and most of the state information, like gripper_pose, is saved in low_dim_obs.pkl. However, I want to train a model that predicts the pose of the end effector, and I found another action mode class named EndEffectorPoseViaPlanning. So if I train the model using the gripper pose as the end-effector pose, can I just substitute the action mode with EndEffectorPoseViaPlanning during evaluation?

Second, I find the gripper pose has a dimension of 7. Is it 3 for position and 4 for rotation?

Looking forward to your reply, thanks!

AaronChuh avatar Dec 15 '24 08:12 AaronChuh

Any updates on this? Is the gripper pose 3 for position and 4 for rotation?

EDIT: I found the answer. The gripper pose is obtained with the function get_pose() in the PyRep repository (specifically pyrep/objects/object.py), and it returns an array containing the (x, y, z, qx, qy, qz, qw) pose of the object.

The function also takes a relative_to parameter, in case that helps anyone.
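So splitting it up looks like this (a small illustrative snippet; obs here stands for any RLBench Observation, e.g. from task_env.reset() or a stored demo):

    import numpy as np

    # obs.gripper_pose comes from PyRep's Object.get_pose() and is laid out as
    # (x, y, z, qx, qy, qz, qw): 3 position values followed by a quaternion.
    pose = np.asarray(obs.gripper_pose)
    position = pose[:3]    # x, y, z (world frame, unless relative_to was passed to get_pose)
    quaternion = pose[3:]  # qx, qy, qz, qw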

olaghattas avatar Mar 12 '25 03:03 olaghattas

@AaronChuh updates on this?

alesspalma avatar Jun 26 '25 22:06 alesspalma

Yes, I found the answer last year.

For the questions I mentioned in this issue:

  1. Although the action mode is set to JointVelocity in dataset_generator.py, you can still get the gripper pose and gripper open state from obs.gripper_pose and obs.gripper_open in the save_demo function. And if you want to evaluate an imitation learning model that predicts the end-effector pose, you only need to set

    rlbench_env = Environment(
        action_mode=MoveArmThenGripper(EndEffectorPoseViaPlanning(True), Discrete()),
        obs_config=obs_config,
        headless=True)

then launch the env and call obs, _, done = task_env.step(pose), where pose is an 8-dim vector that concatenates position, quaternion and gripper openness (see the fuller sketch after this list).

  2. Yes, the gripper pose is 3 values for position and 4 for the rotation quaternion.
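Putting item 1 together, a minimal evaluation-time sketch (the import paths and the ReachTarget task follow the current RLBench layout and are only examples, and model.predict is a placeholder for your own policy, so adjust to your setup):

    import numpy as np
    from rlbench.environment import Environment
    from rlbench.action_modes.action_mode import MoveArmThenGripper
    from rlbench.action_modes.arm_action_modes import EndEffectorPoseViaPlanning
    from rlbench.action_modes.gripper_action_modes import Discrete
    from rlbench.observation_config import ObservationConfig
    from rlbench.tasks import ReachTarget  # any task works; ReachTarget is only an example

    obs_config = ObservationConfig()
    obs_config.set_all(True)

    rlbench_env = Environment(
        action_mode=MoveArmThenGripper(EndEffectorPoseViaPlanning(True), Discrete()),
        obs_config=obs_config,
        headless=True)
    rlbench_env.launch()

    task_env = rlbench_env.get_task(ReachTarget)
    descriptions, obs = task_env.reset()

    done = False
    while not done:
        # placeholder policy: returns (x, y, z, qx, qy, qz, qw, gripper_open)
        pose = np.asarray(model.predict(obs), dtype=np.float64)
        pose[3:7] /= np.linalg.norm(pose[3:7])  # the target quaternion should be unit-norm
        obs, _, done = task_env.step(pose)

    rlbench_env.shutdown()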

AaronChuh avatar Jun 28 '25 10:06 AaronChuh

@AaronChuh awesome, thanks. Could you please take a look at https://github.com/stepjam/RLBench/issues/279? I am trying to reproduce a demo step by step via

rlbench_env = Environment(
        action_mode=MoveArmThenGripper(JointVelocity(), Discrete()),
        obs_config=obs_config,
        headless=False)

by extracting the observations from the saved demo and doing task_env.step(pose) for each observation, where each pose is the concatenation of extracted_obs.joint_velocities and extracted_obs.gripper_open. However, the resulting episodes are quite bad and don't look like the saved frames; the arm often doesn't even complete the task. Can you help me?
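Concretely, my replay loop looks roughly like this (just a sketch; loading the saved episode with get_demos(1, live_demos=False) assumes dataset_root was passed to Environment so the stored data can be found):

    import numpy as np

    # load one stored episode from disk; a Demo behaves like a list of Observations
    demo = task_env.get_demos(1, live_demos=False)[0]
    descriptions, obs = task_env.reset()

    for extracted_obs in demo:
        # "pose" here is really joint velocities + gripper open state, matching JointVelocity
        pose = np.concatenate(
            [extracted_obs.joint_velocities, [extracted_obs.gripper_open]])
        obs, _, done = task_env.step(pose)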

alesspalma avatar Jun 28 '25 11:06 alesspalma


I'm afraid I can't help much because I didn't try the JointVelocity() mode when evaluating my model. But I think it's due to the gap between the env in which you try to reproduce an episode and the env in which you recorded that episode. I checked the RLBench codebase and found a method reset_to_demo(self, demo: Demo) in task_environment.py. This should reset the env to the initial state from when the episode was recorded.

Or maybe not... In #123 it's reported that there seems to be some unavoidable randomness in env resets, but a command mentioned there may alleviate the issue. You can check it and give it a try. Hope you can solve your problem!
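Something like this, roughly (untested sketch; I'm assuming reset_to_demo returns the same (descriptions, obs) pair as reset):

    # reset the scene to the recorded episode's initial state before replaying it
    descriptions, obs = task_env.reset_to_demo(demo)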

AaronChuh avatar Jun 29 '25 04:06 AaronChuh

Even when using reset_to_demo(self, demo: Demo), the arm almost never replays the loaded episode correctly.

alesspalma avatar Jun 29 '25 11:06 alesspalma