openpi

State/Action representation in LIBERO dataset (EEF vs Joint)

Open lyfeng001 opened this issue 2 months ago • 13 comments

Hello, and first of all, thanks for your great work!

I recently downloaded the LIBERO dataset, and I noticed something confusing regarding the representation of state and action in the feature files.

  • The documentation suggests that both state and action are represented in EEF (end-effector) space.
  • Since the Panda is a 7-DOF arm, it makes sense that the state could be an 8D vector: a 6D end-effector pose (position + orientation) plus 2D gripper values.
  • However, when I inspected the data, I found that the state values are numerically very far from the corresponding actions. Intuitively, since actions are control inputs and states are feedback signals, their values should be in a similar range.
  • I also noticed that some dimensions of state are very close to 3.14, which makes me suspect that the state vector might actually be in joint space.
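
The range comparison described above can be reproduced with a minimal sketch like the one below. The arrays here are random placeholders, not real LIBERO values, and the shapes (8D state, 7D action) and the helper name `summarize_ranges` are assumptions for illustration; swap in the arrays loaded from the dataset.

```python
import numpy as np

def summarize_ranges(name, values):
    """Print and return the per-dimension (min, max) of a (T, D) array."""
    values = np.asarray(values, dtype=np.float64)
    lo, hi = values.min(axis=0), values.max(axis=0)
    for d, (a, b) in enumerate(zip(lo, hi)):
        print(f"{name}[{d}]: min={a:+.3f} max={b:+.3f}")
    return lo, hi

# Placeholder data (NOT real LIBERO values): replace with dataset arrays.
rng = np.random.default_rng(0)
state = rng.uniform(-3.2, 3.2, size=(100, 8))    # assumed (T, 8) state
actions = rng.uniform(-1.0, 1.0, size=(100, 7))  # assumed (T, 7) action

state_lo, state_hi = summarize_ranges("state", state)
action_lo, action_hi = summarize_ranges("action", actions)
```

Printing both side by side makes it easy to see which dimensions differ in scale and which state dimensions sit near ±3.14.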

Additionally:

  • In Issue #95, it was mentioned that the state uses a joint-space representation.
  • But if this is the case, then applying delta-action would essentially compute EEF - Joint (an end-effector action minus a joint-space state), which does not have a clear physical meaning.
  • On the other hand, during inference the model input is constructed as:
np.concatenate(
    (
        obs["robot0_eef_pos"],
        _quat2axisangle(obs["robot0_eef_quat"]),
        obs["robot0_gripper_qpos"],
    )
)

This suggests that training should also use the EEF representation.
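
For reference, here is a self-contained sketch of that state construction. The helper `quat_to_axisangle` is my own stand-in for `_quat2axisangle`, written under the assumption that the quaternion is ordered (x, y, z, w). The observation values are made up.

```python
import math
import numpy as np

def quat_to_axisangle(quat_xyzw):
    """Convert a unit quaternion (x, y, z, w) to an axis-angle vector (axis * angle)."""
    q = np.asarray(quat_xyzw, dtype=np.float64)
    w = float(np.clip(q[3], -1.0, 1.0))
    den = math.sqrt(1.0 - w * w)
    if math.isclose(den, 0.0):
        # No rotation: the axis is undefined, so return the zero vector.
        return np.zeros(3)
    return q[:3] * (2.0 * math.acos(w) / den)

# Hypothetical observation: end effector rotated 180 degrees about the x axis.
obs = {
    "robot0_eef_pos": np.array([0.0, 0.0, 1.0]),
    "robot0_eef_quat": np.array([1.0, 0.0, 0.0, 0.0]),  # assumed (x, y, z, w)
    "robot0_gripper_qpos": np.array([0.04, -0.04]),
}

state = np.concatenate(
    (
        obs["robot0_eef_pos"],
        quat_to_axisangle(obs["robot0_eef_quat"]),
        obs["robot0_gripper_qpos"],
    )
)
print(state.shape)  # 3 position + 3 axis-angle + 2 gripper values
print(state[3:6])   # orientation part of the state
```

In this sketch a 180-degree rotation gives an orientation component equal to π, so a value near 3.14 on its own may not be enough to tell an EEF state from a joint-space state.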

These points seem contradictory. Could you please clarify:

  1. What is the exact representation used for state and action in the dataset?
  2. Does delta-action have a meaningful interpretation in the current setup?
  3. What representation should be assumed during inference and training?
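
To make question 2 concrete, this is what I understand by delta-action, as a generic sketch. The function name `to_delta_actions`, the mask, and all numbers are hypothetical, and I am not claiming this is how openpi implements it.

```python
import numpy as np

def to_delta_actions(actions, state, mask):
    """Subtract the current state from the masked action dimensions.

    actions: (H, D) chunk of absolute actions
    state:   (S,) current state, with S >= D
    mask:    (D,) booleans, True where a dimension becomes a delta
    """
    actions = np.asarray(actions, dtype=np.float64).copy()
    mask = np.asarray(mask, dtype=bool)
    d = mask.shape[0]
    # Use the state only on masked dimensions, zero elsewhere.
    offset = np.where(mask, np.asarray(state, dtype=np.float64)[:d], 0.0)
    actions[:, :d] -= offset
    return actions

# Hypothetical values: 8D state, two 7D absolute actions.
state = np.array([0.1, 0.2, 0.3, 0.0, 0.0, 0.0, 0.04, -0.04])
actions = np.array([
    [0.15, 0.2, 0.3, 0.0, 0.0, 0.0, 1.0],
    [0.20, 0.2, 0.3, 0.0, 0.0, 0.0, 1.0],
])
mask = np.array([True] * 6 + [False])  # pose dims become deltas, gripper stays absolute

delta = to_delta_actions(actions, state, mask)
print(delta)
```

The subtraction is only meaningful if `actions[:, i]` and `state[i]` describe the same quantity in the same space, which is exactly what I am unsure about here.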

Thanks a lot for your help!

lyfeng001, Sep 09 '25 07:09