About generating a dataset and evaluating in the environment
Hi, thanks for this excellent work!
I want to generate some data to train an imitation learning model and evaluate it in this environment. I read the code and noticed there is a script named dataset_generator.py that can generate data. I gave it a try and it works. However, I'm a little confused about some of the configurations.
First, the action mode is set to JointVelocity by default, and most of the state information is saved in low_dim_obs.pkl, e.g. gripper_pose. However, I want to train a model that predicts the pose of the end effector, and I found another action mode class named EndEffectorPoseViaPlanning. So if I train the model using the gripper pose as the end effector pose, can I just substitute the action mode with EndEffectorPoseViaPlanning during evaluation?
Second, I see the gripper pose has a dimension of 7. Is it 3 for position and 4 for rotation?
Looking forward to your reply, thanks!
Any updates on this? Is the gripper pose 3 for position and 4 for rotation?
EDIT: I found the answer. The gripper pose is obtained with the function get_pose() in the PyRep repository (specifically pyrep/objects/object.py), and it returns an array containing the (x, y, z, qx, qy, qz, qw) pose of the object.
The function also takes a relative_to parameter, in case that helps anyone.
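For anyone else parsing the saved observations, here is a minimal sketch of how that 7-dim pose splits up (the helper name split_gripper_pose is mine, purely for illustration):

import numpy as np

def split_gripper_pose(pose: np.ndarray):
    """Split a 7-dim RLBench/PyRep pose into position and quaternion.

    The layout follows PyRep's Object.get_pose(): (x, y, z, qx, qy, qz, qw),
    i.e. the quaternion is stored scalar-last.
    """
    assert pose.shape == (7,)
    position = pose[:3]      # x, y, z (world frame unless relative_to was given)
    quaternion = pose[3:]    # qx, qy, qz, qw
    return position, quaternion

# e.g. position, quaternion = split_gripper_pose(obs.gripper_pose)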
@AaronChuh updates on this?
Yes, I found the answer last year.
For the questions I mentioned in this issue:
1. Although the action mode is set to JointVelocity in dataset_generator.py, you can get the gripper pose or gripper open state via obs.gripper_pose or obs.gripper_open in the function save_demo. And if you want to evaluate an imitation learning model that predicts the pose of the end effector, you only need to set
rlbench_env = Environment(
    action_mode=MoveArmThenGripper(EndEffectorPoseViaPlanning(True), Discrete()),
    obs_config=obs_config,
    headless=True)
then launch the env and call obs, _, done = task_env.step(pose), where pose is an 8-dim vector that concatenates position, quaternion and gripper openness (a fuller end-to-end sketch follows after this list).
2. Yes.
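For completeness, here is a rough end-to-end sketch of the evaluation loop I described. It is not a drop-in script: policy is only a stand-in for your trained model, ReachTarget is just an example task, and the import paths match recent RLBench versions (older releases organise the action modes differently).

import numpy as np
from rlbench.environment import Environment
from rlbench.observation_config import ObservationConfig
from rlbench.action_modes.action_mode import MoveArmThenGripper
from rlbench.action_modes.arm_action_modes import EndEffectorPoseViaPlanning
from rlbench.action_modes.gripper_action_modes import Discrete
from rlbench.tasks import ReachTarget

def policy(obs):
    # Stand-in for a trained model: command the current gripper pose and keep
    # the gripper open, just so the sketch runs. Replace with your own model.
    return np.concatenate([obs.gripper_pose, [1.0]])

obs_config = ObservationConfig()
obs_config.set_all(True)

rlbench_env = Environment(
    action_mode=MoveArmThenGripper(EndEffectorPoseViaPlanning(True), Discrete()),
    obs_config=obs_config,
    headless=True)
rlbench_env.launch()

task_env = rlbench_env.get_task(ReachTarget)
descriptions, obs = task_env.reset()

for _ in range(200):  # arbitrary step cap for the sketch
    # Action is 8-dim: x, y, z, qx, qy, qz, qw, gripper_open.
    # The quaternion must be normalised, otherwise the action mode rejects it.
    obs, _, done = task_env.step(policy(obs))
    if done:
        break

rlbench_env.shutdown()

One thing to keep in mind: if the planner cannot reach a predicted pose, step() will raise an exception, so you may want to catch that during evaluation.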
@AaronChuh awesome, thanks. Could you please take a look at https://github.com/stepjam/RLBench/issues/279 ? I am trying to reproduce a demo step by step via
rlbench_env = Environment(
    action_mode=MoveArmThenGripper(JointVelocity(), Discrete()),
    obs_config=obs_config,
    headless=False)
by extracting the observations from the saved demo and calling task_env.step(pose) for each observation, where each pose is the concatenation of extracted_obs.joint_velocities and extracted_obs.gripper_open. But the episodes turn out badly and don't look like the saved frames; the arm doesn't even complete the task most of the time. Can you help me?
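For reference, the replay loop I'm running boils down to the following (the helper name is mine, just for illustration; it assumes the Environment above and a demo already loaded from the saved dataset):

import numpy as np

def replay_demo_with_joint_velocities(task_env, demo):
    """Open-loop replay of a saved demo through the JointVelocity action mode.

    demo is an rlbench.demo.Demo, e.g. loaded with
    task_env.get_demos(amount=1, live_demos=False) when the Environment was
    given a dataset_root. Each recorded observation's joint velocities and
    gripper state are fed straight back in as the action.
    """
    obs = None
    for recorded_obs in demo:
        action = np.concatenate(
            [recorded_obs.joint_velocities, [recorded_obs.gripper_open]])
        obs, reward, done = task_env.step(action)
    return obs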
I'm afraid I can't help much because I didn't try the JointVelocity() mode when evaluating my model. But I think it's due to the gap between the env in which you try to reproduce an episode and the env in which you recorded that episode. I checked the RLBench codebase and found a method reset_to_demo(self, demo: Demo) in task_environment.py. This should reset the env to the initial state from when the episode was recorded.
Or maybe not... In #123 it's reported that there seems to be unavoidable randomness in the env reset, but a command mentioned there may alleviate the issue. You can check it and give it a try. Hope you solve your problem!
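Something along these lines, untested on my side (it assumes the Environment was created with dataset_root pointing at your generated dataset, so get_demos can read the saved episodes from disk):

def reset_to_recorded_start(task_env):
    """Restore the task to the recorded initial state of a saved demo.

    After this, replay the demo's joint velocities / gripper_open exactly as
    in your loop above.
    """
    demo = task_env.get_demos(amount=1, live_demos=False)[0]
    task_env.reset_to_demo(demo)  # restore arm/objects to the recorded start
    return demo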
Even when using reset_to_demo(self, demo: Demo), the arm almost never replays the loaded episode correctly.