lerobot
Is there any documentation to create a custom dataset?
lerobot/examples explains how to load and train on the existing datasets in the Hugging Face repos.
Instead, I'd like to know how to turn self-collected data into a dataset.
Is there any documentation for that?
Same here. I have my data ready, but the dataset class seems rather complex to instantiate, so one or more examples (depending on the number of cameras, for instance) would be very nice.
Same here! I really need an example.
same question!
@TheArtificialOutsider @zwbx @RochMollero Got it! We will address this issue very soon, and simplify stuff ;)
Any chance you could provide a very short sample of the datasets in the comment?
In the meantime, a few pointers and resources:
README:
- https://github.com/huggingface/lerobot?tab=readme-ov-file#the-lerobotdataset-format
See how we use from_preloaded:
- https://github.com/huggingface/lerobot/blob/01f8cede0b5f1c16330205b35f4391939e11cb3e/lerobot/scripts/control_robot.py#L336-L344 (see `dataset.stats = stats` to use it directly after)
- https://github.com/huggingface/lerobot/blob/main/lerobot/scripts/push_dataset_to_hub.py#L247-L258
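The linked snippet assigns `dataset.stats` after building the dataset. As a rough illustration of what such statistics could look like, here is a minimal sketch in plain Python. It assumes the stats hold a per-dimension mean, standard deviation, minimum, and maximum for each feature; the helper name `compute_column_stats` is hypothetical and is not the library's `compute_stats`.

```python
import statistics


def compute_column_stats(rows):
    """Per-dimension statistics for a list of equal-length numeric vectors.

    Hypothetical helper: the layout (mean/std/min/max) is an assumption
    about what normalization statistics for one feature might contain.
    """
    dims = list(zip(*rows))  # transpose: one tuple per dimension
    return {
        "mean": [statistics.fmean(d) for d in dims],
        "std": [statistics.pstdev(d) for d in dims],  # population std
        "min": [min(d) for d in dims],
        "max": [max(d) for d in dims],
    }


# Toy "action" column: three frames, two action dimensions.
actions = [[0.0, 1.0], [2.0, 1.0], [4.0, 1.0]]
action_stats = compute_column_stats(actions)
print(action_stats["mean"], action_stats["min"], action_stats["max"])
```

For real data, the library's own `compute_stats` call shown in the linked lines is the one to rely on; this sketch only illustrates the idea.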
See the content of these files to instantiate the hf_dataset, encode the videos, or store frames, etc.
- from hdf5: https://github.com/huggingface/lerobot/blob/main/lerobot/common/datasets/push_dataset_to_hub/aloha_hdf5_format.py#L209-L211
- from zarr: https://github.com/huggingface/lerobot/blob/main/lerobot/common/datasets/push_dataset_to_hub/pusht_zarr_format.py#L257-L259
- from parquet: https://github.com/huggingface/lerobot/blob/main/lerobot/common/datasets/push_dataset_to_hub/dora_parquet_format.py#L213-L215
- from pickle: https://github.com/huggingface/lerobot/blob/main/lerobot/common/datasets/push_dataset_to_hub/xarm_pkl_format.py#L176-L178
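The `from_preloaded` call linked above also takes an `episode_data_index` argument. A minimal sketch of how such an index could be derived from a per-frame episode column, assuming it maps "from" and "to" to the first frame (inclusive) and the end frame (exclusive) of each episode. The function name is hypothetical, and plain lists stand in for whatever array type the library actually stores.

```python
def build_episode_data_index(episode_indices):
    """Return {"from": [...], "to": [...]} for a flat list of episode ids.

    Assumes frames are already grouped by episode. "from" is the first
    frame of each episode, "to" is one past its last frame (assumption).
    """
    starts, ends = [], []
    previous = None
    for frame_idx, episode_id in enumerate(episode_indices):
        if episode_id != previous:
            if previous is not None:
                ends.append(frame_idx)  # close the previous episode
            starts.append(frame_idx)    # open the new episode
            previous = episode_id
    if previous is not None:
        ends.append(len(episode_indices))  # close the last episode
    return {"from": starts, "to": ends}


episode_index_column = [0, 0, 0, 1, 1, 2, 2, 2, 2]
episode_data_index = build_episode_data_index(episode_index_column)
print(episode_data_index)
# {'from': [0, 3, 5], 'to': [3, 5, 9]}
```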
cc @michel-aractingi for visibility ;)
Hi, thanks for your attention to this matter. I am using the RLBench dataset. I have raw data containing image observations and actions. How can I organize it and convert it to the hf dataset?
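One possible starting point for raw data of this kind is to flatten the episodes into one row per frame. The sketch below is plain Python and makes several assumptions: the column names loosely follow the LeRobotDataset format section of the README linked earlier, the input layout (`"images"` and `"actions"` per episode) is hypothetical, and images are represented by file paths for brevity.

```python
def flatten_episodes(episodes, fps):
    """Turn a list of episodes into per-frame columns.

    Each episode is a dict with "images" and "actions" of equal length
    (hypothetical input layout). Column names are assumptions.
    """
    columns = {
        "observation.image": [], "action": [], "episode_index": [],
        "frame_index": [], "timestamp": [], "next.done": [], "index": [],
    }
    global_index = 0
    for ep_idx, episode in enumerate(episodes):
        images, actions = episode["images"], episode["actions"]
        if len(images) != len(actions):
            raise ValueError(f"episode {ep_idx}: images/actions length mismatch")
        num_frames = len(actions)
        for frame_idx in range(num_frames):
            columns["observation.image"].append(images[frame_idx])
            columns["action"].append(actions[frame_idx])
            columns["episode_index"].append(ep_idx)
            columns["frame_index"].append(frame_idx)
            columns["timestamp"].append(frame_idx / fps)  # seconds since episode start
            columns["next.done"].append(frame_idx == num_frames - 1)
            columns["index"].append(global_index)  # position in the whole dataset
            global_index += 1
    return columns


raw_episodes = [
    {"images": ["ep0_000.png", "ep0_001.png", "ep0_002.png"],
     "actions": [[0.0, 0.1], [0.1, 0.1], [0.2, 0.0]]},
    {"images": ["ep1_000.png", "ep1_001.png"],
     "actions": [[0.5, 0.5], [0.4, 0.6]]},
]
frame_columns = flatten_episodes(raw_episodes, fps=10)
print(frame_columns["episode_index"], frame_columns["frame_index"])
```

A column dictionary like this can then be handed to something such as `datasets.Dataset.from_dict` to obtain an `hf_dataset`; the conversion scripts linked above show how the project itself handles images, videos, and feature types.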
It would be even better if there were tutorials on how to train using custom data from the gym simulation environment.
Hey there, we are still working on a simplification of the dataset class + upload to hub + tutorial! :) Sorry if it's taking some time!
Unfortunately, the only option now is to get familiar with: https://github.com/huggingface/lerobot/blob/main/lerobot/scripts/push_dataset_to_hub.py
See some example commands in the header. You may be able to adapt one of them to your dataset format.
If you have issues understanding the code, reach out to us in the #help channel on Discord.
Closing this as we now have a tutorial to easily record and push your own datasets. Feel free to reopen if need be ;)
The tutorial says “If you don't want to push to hub, use --push-to-hub 0.” Where should "--push-to-hub 0" be used? Should it replace "--repo-id ${HF_USER}/koch_test"?
Also, after reading the tutorial I am still puzzled about how to make a dataset with image observations and actions, and how to use it to run lerobot.
It seems to me that a useful example would be as ACT, where the dataset is saved in a clear way.
Thanks and best regards!
--push-to-hub 0 is an option of the lerobot/scripts/control_robot.py script. This is simply to deactivate uploading your dataset to the hub when using the record function.
As for the tutorial, it teaches you — amongst other things — how to record a LeRobotDataset with the Koch v1.1 arm (although it can be adapted to other robots, we are working on it).
It seems to me that a useful example would be as ACT, where the dataset is saved in a clear way.
Could you elaborate? What other scenario do you have in mind?
Hi, have you figured out how to combine RLBench and LeRobot and organize the data into a LeRobot dataset? Thanks
Closing this as we now have a tutorial to easily record and push your own datasets. Feel free to reopen if need be ;)
Hi! It seems that this file is missing. I'm still looking for a tutorial on how to record a custom dataset with robots other than the officially supported ones (Franka Panda, to be specific). Could you please suggest some material?
Hello @FANG-Zhiwei,
I'm not sure if the documents Record a dataset and Bring Your Own Hardware are what you're looking for, but you can check them out.
Hope this helps!
I haven't tried it yet, but this openpi code may help: https://github.com/Physical-Intelligence/openpi/blob/main/examples/libero/convert_libero_data_to_lerobot.py#L46-L93
See how we use from_preloaded in lerobot/lerobot/scripts/control_robot.py, lines 336 to 344 in [01f8ced](https://github.com/huggingface/lerobot/commit/01f8cede0b5f1c16330205b35f4391939e11cb3e):

```python
lerobot_dataset = LeRobotDataset.from_preloaded(
    repo_id=repo_id,
    hf_dataset=hf_dataset,
    episode_data_index=episode_data_index,
    info=info,
    videos_dir=videos_dir,
)
stats = compute_stats(lerobot_dataset) if run_compute_stats else {}
lerobot_dataset.stats = stats
```

(see `dataset.stats = stats` to use it directly after)
- https://github.com/huggingface/lerobot/blob/main/lerobot/scripts/push_dataset_to_hub.py#L247-L258
See the content of these files to instantiate the hf_dataset, encode the videos, or store frames, etc.
- from hdf5: https://github.com/huggingface/lerobot/blob/main/lerobot/common/datasets/push_dataset_to_hub/aloha_hdf5_format.py#L209-L211
- from zarr: https://github.com/huggingface/lerobot/blob/main/lerobot/common/datasets/push_dataset_to_hub/pusht_zarr_format.py#L257-L259
- from parquet: https://github.com/huggingface/lerobot/blob/main/lerobot/common/datasets/push_dataset_to_hub/dora_parquet_format.py#L213-L215
- from pickle: https://github.com/huggingface/lerobot/blob/main/lerobot/common/datasets/push_dataset_to_hub/xarm_pkl_format.py#L176-L178
These data format conversion scripts are no longer available. Can you give a new URL for them?