make --dataset.episodes work

Open paszea opened this issue 10 months ago • 2 comments

What this does

Running train.py with --dataset.episodes specified raised an "index out of range" error. This change fixes it.

Details: The original code shortens the episode_data_index array (computed by get_episode_data_index()) to the number of episodes specified, but the episode number is still used to index into it. For example, specifying --dataset.episodes='[10]' makes episode_data_index contain only one element, yet later code accesses episode_data_index[10] to fetch the data for episode 10, which raises the index-out-of-range error.
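The failure mode can be reproduced outside lerobot with a small sketch. Everything below is a simplified stand-in written for illustration: the function names build_index_truncated and build_index_by_episode, the plain-list data layout, and the episode lengths are all assumptions, not lerobot's actual code, and the second function shows only one possible way to avoid the error (the change in this PR may differ).

```python
from itertools import accumulate


def build_index_truncated(episode_lengths, episodes):
    """Hypothetical stand-in for the problematic behavior.

    Only the selected episodes are kept, so the result has
    len(episodes) entries at positions 0..len(episodes)-1.
    """
    selected = [episode_lengths[ep] for ep in episodes]
    ends = list(accumulate(selected))
    starts = [0] + ends[:-1]
    return {"from": starts, "to": ends}


def build_index_by_episode(episode_lengths, episodes):
    """One possible remedy (an assumption, not necessarily the PR's fix):
    key the frame ranges by the original episode number."""
    index = {}
    start = 0
    for ep in episodes:
        end = start + episode_lengths[ep]
        index[ep] = (start, end)
        start = end
    return index


# Hypothetical dataset: 50 episodes of 5 frames each.
episode_lengths = {ep: 5 for ep in range(50)}

# Selecting only episode 10 yields a one-element index ...
truncated = build_index_truncated(episode_lengths, [10])

# ... so looking it up by episode number fails.
try:
    truncated["from"][10]
    lookup_failed = False
except IndexError:
    lookup_failed = True

# Keying by episode number keeps the lookup valid.
fixed = build_index_by_episode(episode_lengths, [10])
start, end = fixed[10]
```

The same mismatch appears with any selection that does not start at episode 0 and run contiguously, for example '[40,41,42,43]'.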

How it was tested

Tested by training a model with the change applied.

Examples: python lerobot/scripts/train.py
... --dataset.episodes='[40,41,42,43]'
...

How to checkout & try? (for the reviewer)

Try training with --dataset.episodes specified.

paszea avatar May 02 '25 02:05 paszea

The previous CL broke some tests. I made a fix. Can you take another look and also approve the workflows to see if it's fine now?

paszea avatar May 03 '25 20:05 paszea

I was facing this issue. Thanks for the patch. Any updates on merging this to main?

mohitydv09 avatar Jun 18 '25 20:06 mohitydv09

This was a lifesaver when I was filtering out some bad episodes during training. Thank you, and I hope someone resolves/approves this PR!

jloganolson avatar Jul 14 '25 17:07 jloganolson