RVT
Bug in "ignore_collision" when generating the replay buffer
Hi RVT team,

First, thank you for open-sourcing this impressive work! I'm attempting to reproduce RVT-2's RLBench results and found a discrepancy on the "slide block to color target" task that I'd like to discuss.

Issue Description

Reported vs Reproduced Results: The paper reports a 92% success rate, but my reproduction reaches only 48%.
Failure Analysis: Reviewing videos of the failed episodes, I observed that the model frequently mispredicts the ignore_collision flag. When I manually force this flag to True, the success rate matches the paper's results (92-100%).
Key Insight: Since collision prediction should be a simpler subtask, I suspect a potential discrepancy in the data generation logic.
In dataset.py (specifically the _add_keypoints_to_replay function), I noticed the following implementation:

    (
        trans_indicies,
        rot_grip_indicies,
        ignore_collisions,  # this ignore_collisions flag corresponds to the keypoint but is not used
        action,
        attention_coordinates,
    ) = _get_action(
        obs_tp1,
        obs_tm1,
        rlbench_scene_bounds,
        voxel_sizes,
        rotation_resolution,
        crop_augmentation,
    )
    terminal = k == len(episode_keypoints) - 1
    reward = float(terminal) * 1.0 if terminal else 0
    obs_dict = extract_obs(
        obs,
        CAMERAS,
        t=k - next_keypoint_idx,
        prev_action=prev_action,
        episode_length=25,
    )  # the ignore_collisions in this dict corresponds to the current observation

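To make the suspected mismatch concrete, here is a toy sketch in plain Python. It does not use RVT or RLBench code: `collision_labels`, the `flags` list, and the keypoint indices are all hypothetical, and it simplifies the replay logic to one sample per keypoint. It only shows how a label read from the current observation can differ from the label at the keypoint the action targets.

```python
def collision_labels(flags, keypoints):
    """Pair each sample's flag at the current observation with the flag at its target keypoint.

    flags: per-timestep ignore_collisions values of one (made-up) demo.
    keypoints: timestep indices of the keypoints, in order.
    """
    samples = []
    current = 0  # the first sample starts from the initial observation
    for kp in keypoints:
        samples.append(
            {
                "obs_index": current,
                "keypoint_index": kp,
                "from_current_obs": flags[current],  # what the excerpt appears to store
                "from_keypoint": flags[kp],          # what the predicted action refers to
            }
        )
        current = kp  # the next sample starts from this keypoint
    return samples


# Hypothetical demo: collisions are ignored only around timesteps 3 and 4.
flags = [False, False, False, True, True, False]
samples = collision_labels(flags, keypoints=[3, 5])

for s in samples:
    mismatch = s["from_current_obs"] != s["from_keypoint"]
    print(s["obs_index"], s["keypoint_index"], "mismatch" if mismatch else "match")
```

In this made-up demo both samples get a different label depending on which observation the flag is read from, which is the kind of shift that could teach the model the wrong collision behavior.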
Request

Could you please clarify:

1. Whether this is a known discrepancy between the implementation and the paper?
2. Whether taking ignore_collision from the keypoint observation, rather than from the current observation, would match your original design?
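
A minimal sketch of the change in question, under stated assumptions: that the observation dict stores the flag under the key "ignore_collisions", and that the keypoint flag returned by _get_action is a scalar or boolean. The helper name `with_keypoint_flag` is hypothetical and not part of RVT. The real pipeline may well expect an array of a specific dtype here, so the plain list below is only a stand-in.

```python
def with_keypoint_flag(obs_dict, keypoint_ignore_collisions):
    """Return a copy of obs_dict whose collision flag comes from the keypoint.

    Assumes the flag lives under the "ignore_collisions" key (an assumption
    about the dict layout) and leaves every other entry untouched.
    """
    patched = dict(obs_dict)  # shallow copy so the caller's dict is not mutated
    patched["ignore_collisions"] = [float(keypoint_ignore_collisions)]
    return patched


# Hypothetical values: the current observation says 0.0, the keypoint says True.
obs_dict = {"ignore_collisions": [0.0], "low_dim_state": [1.0, 0.0, 0.0, 0.0]}
patched = with_keypoint_flag(obs_dict, True)
```

The copy keeps the original dict intact, so the two labelings can be compared side by side before committing to either one.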
Thank you for your time and insights!
https://github.com/user-attachments/assets/aeadada8-9e39-4a7b-901d-b1b74e7b91d0