OXE RLDS data format to Hugging Face LeRobot format
What this does
A generic implementation that converts existing OXE (RTX) datasets, which are in the standard RLDS format, to the LeRobot data format.
The current open-x-embodiment RLDS datasets are still "messy". Here, standardization across all datasets is done by applying datasets/push_dataset_to_hub/oxe/transforms.py and datasets/push_dataset_to_hub/oxe/configs.py (similar to octo and openvla). Despite this, users still have the flexibility to convert existing "non-oxe" RLDS datasets to the LeRobot format.
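As a rough illustration of per-dataset standardization, the sketch below maps a raw step to a shared schema through a registry keyed by dataset name, and passes unknown ("non-oxe") datasets through unchanged. The registry, the function names, and the raw keys (`ee_delta`, `gripper`) are hypothetical and are not taken from transforms.py or configs.py.

```python
from typing import Callable, Dict

Step = Dict[str, object]


def example_transform(step: Step) -> Step:
    """Hypothetical transform: append the gripper command to the end-effector delta."""
    action = list(step["ee_delta"]) + [step["gripper"]]
    return {"observation": step["observation"], "action": action}


# Hypothetical registry: dataset name -> function producing the shared schema.
STANDARDIZATION_TRANSFORMS: Dict[str, Callable[[Step], Step]] = {
    "example_dataset": example_transform,
}


def standardize(dataset_name: str, step: Step) -> Step:
    """Apply the registered transform, or return the step unchanged if none exists."""
    transform = STANDARDIZATION_TRANSFORMS.get(dataset_name)
    return transform(step) if transform is not None else step


raw_step = {
    "observation": {"state": [0.1, 0.2]},
    "ee_delta": [0.0, 0.1, -0.1],
    "gripper": 1.0,
}
standardized = standardize("example_dataset", raw_step)
untouched = standardize("my_custom_rlds_dataset", raw_step)
```

Keeping the per-dataset logic in a lookup table means a new dataset only needs one new entry, while datasets without an entry still flow through the same conversion path.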
Example
To upload the sample dataset austin_buds_dataset_converted_externally_to_rlds, run:
python lerobot/scripts/push_dataset_to_hub.py \
--raw-dir /hdd/tensorflow_datasets/austin_buds_dataset_converted_externally_to_rlds/0.1.0/ \
--repo-id youliangtan/austin_buds_dataset_converted_externally_to_rlds_to_hg \
--raw-format oxe_rlds.austin_buds_dataset_converted_externally_to_rlds # format is in: "oxe_rlds.<OXE_DATASET_NAME>"
The official <OXE_DATASET_NAME> is listed here.
The uploaded data is available at: https://huggingface.co/datasets/youliangtan/austin_buds_dataset_converted_externally_to_rlds_to_hg
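The `--raw-format` value follows the pattern "oxe_rlds.<OXE_DATASET_NAME>". A minimal sketch of how such a string could be split into its two parts is shown below; `split_raw_format` is a hypothetical helper for illustration and not necessarily how push_dataset_to_hub.py parses the argument.

```python
from typing import Optional, Tuple


def split_raw_format(raw_format: str) -> Tuple[str, Optional[str]]:
    """Split "oxe_rlds.<NAME>" into ("oxe_rlds", "<NAME>").

    The dataset name is None when nothing follows the format prefix.
    """
    prefix, sep, name = raw_format.partition(".")
    return prefix, (name if sep and name else None)


fmt, dataset_name = split_raw_format(
    "oxe_rlds.austin_buds_dataset_converted_externally_to_rlds"
)
bare_fmt, bare_name = split_raw_format("oxe_rlds")
```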
Sample OXE datasets on the Hugging Face Hub:
- youliangtan/sampled_bridge_data_v2 (only 5 trajectories as example)
- youliangtan/austin_buds_dataset_converted_externally_to_rlds_to_hg
- youliangtan/nyu_franka_play_dataset_converted_externally_to_rlds_to_hg
- youliangtan/cmu_stretch_to_hg
- youliangtan/droid_100_to_hg (first 100 trajectories from droid data)
- youliangtan/sampled_google_robot_ds (only 5 trajectories as example)
To visualize the data locally:
python lerobot/scripts/visualize_dataset.py --repo-id youliangtan/sampled_bridge_data_v2 --episode-index 2
TODO:
- test with more datasets
- clean up impl
Open questions
- The current impl uses the default T5 language encoder, with some default kwargs
- make MultiLeRobotDataset support custom sampling weights.
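For the second question, one possible approach is to first draw a dataset according to its weight and then draw a frame uniformly inside it. The sketch below uses only the Python standard library; `sample_indices` and its arguments are hypothetical and do not reflect any existing MultiLeRobotDataset API.

```python
import random
from typing import List, Sequence, Tuple


def sample_indices(
    dataset_sizes: Sequence[int],
    weights: Sequence[float],
    num_samples: int,
    seed: int = 0,
) -> List[Tuple[int, int]]:
    """Return (dataset_index, frame_index) pairs drawn with per-dataset weights."""
    if len(dataset_sizes) != len(weights):
        raise ValueError("dataset_sizes and weights must have the same length")
    rng = random.Random(seed)
    # Pick which dataset each sample comes from, proportionally to the weights.
    picks = rng.choices(range(len(dataset_sizes)), weights=weights, k=num_samples)
    # Within the chosen dataset, pick a frame uniformly at random.
    return [(d, rng.randrange(dataset_sizes[d])) for d in picks]


sizes = [100, 50, 10]
samples = sample_indices(sizes, weights=[7, 3, 0], num_samples=1000)
```

With these weights, the third dataset is never drawn and the first is drawn roughly 70% of the time, regardless of how many frames each dataset contains.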