Bugs on "ignore_collision" when generating replay buffer

Open LPY1219 opened this issue 7 months ago • 0 comments

Hi RVT team,

First, thank you for open-sourcing this impressive work! I'm attempting to reproduce RVT-2's RLBench results and encountered an inconsistency in the "slide block to color target" task that I'd like to discuss.

Issue Description

Reported vs Reproduced Results: While the paper reports a 92% success rate, my implementation achieves only 48%.

Failure Analysis: From videos of the failed cases, I observed that the model frequently mispredicts the ignore_collision flag. When this flag is manually forced to True, the success rate matches the paper's results (92-100%).
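
For reference, the manual override described above can be sketched as follows. This is a minimal illustration, not the code used in RVT: it assumes the predicted action is a flat vector whose last entry is the ignore_collision flag (position, quaternion, gripper open, ignore_collision), and `force_ignore_collision` is a hypothetical helper name.

```python
from typing import List, Sequence


def force_ignore_collision(action: Sequence[float], value: float = 1.0) -> List[float]:
    """Return a copy of `action` with its last entry overwritten.

    Assumption: the last entry of the action vector is the
    ignore_collision flag. The input is left unmodified.
    """
    if len(action) == 0:
        raise ValueError("action must not be empty")
    patched = list(action)
    patched[-1] = float(value)
    return patched


# Hypothetical predicted action: xyz, quaternion, gripper_open, ignore_collision.
predicted = [0.25, 0.10, 0.80, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0]
forced = force_ignore_collision(predicted)
```

Applying such an override only at evaluation time isolates the effect of the flag: the pose and gripper predictions are untouched, so any change in success rate comes from the collision setting alone.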

Key Insight: Since collision prediction should be a simpler subtask, I suspect a potential discrepancy in the data generation logic.

In dataset.py (specifically the _add_keypoints_to_replay function), I noticed the following implementation:

(
    trans_indicies,
    rot_grip_indicies,
    ignore_collisions,  # this ignore_collisions flag corresponds to the keypoint but is not used
    action,
    attention_coordinates,
) = _get_action(
    obs_tp1,
    obs_tm1,
    rlbench_scene_bounds,
    voxel_sizes,
    rotation_resolution,
    crop_augmentation,
)
terminal = k == len(episode_keypoints) - 1
reward = float(terminal) * 1.0 if terminal else 0

obs_dict = extract_obs(
    obs,
    CAMERAS,
    t=k - next_keypoint_idx,
    prev_action=prev_action,
    episode_length=25,
)  # the ignore_collisions in the dict corresponds to current observation

Request

Could you please clarify:

Whether this is a known implementation-paper discrepancy?

Whether taking ignore_collision from the keypoint observation instead of the current observation would align with your original design?
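
To make the second question concrete, here is a minimal sketch of the change I have in mind. It is not the actual dataset.py code: `FakeObs`, `label_from_current_obs`, and `use_keypoint_flag` are hypothetical stand-ins, and it assumes the observation dict stores the flag under the key "ignore_collisions" as a one-element list.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class FakeObs:
    """Hypothetical stand-in for an observation; only the relevant field."""
    ignore_collisions: bool


def label_from_current_obs(obs: FakeObs) -> List[float]:
    # Mirrors the behavior described above: the label stored in the
    # observation dict is read from the *current* observation.
    return [float(obs.ignore_collisions)]


def use_keypoint_flag(obs_dict: Dict[str, List[float]], keypoint_flag: int) -> Dict[str, List[float]]:
    """Return a copy of obs_dict whose label comes from the keypoint flag."""
    patched = dict(obs_dict)
    patched["ignore_collisions"] = [float(keypoint_flag)]
    return patched


# Current observation says "avoid collisions", keypoint says "ignore them".
current_obs = FakeObs(ignore_collisions=False)
keypoint_flag = 1  # stands in for the otherwise unused value returned by _get_action

obs_dict = {"ignore_collisions": label_from_current_obs(current_obs)}
patched = use_keypoint_flag(obs_dict, keypoint_flag)
```

With this change, the training label would describe the motion toward the next keypoint rather than the state at the current timestep, which is where the two sources disagree in the example.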

Thank you for your time and insights!

https://github.com/user-attachments/assets/aeadada8-9e39-4a7b-901d-b1b74e7b91d0

LPY1219, Apr 05 '25 04:04