
Add torchcodec cpu

Open jadechoghari opened this issue 9 months ago • 1 comments

What this does

This PR replaces torchvision CPU decoding with torchcodec CPU decoding. It also adds a decode_video_frames function that wraps multiple backends, so callers no longer invoke decode_video_frames_BACKENDNAME separately. This keeps backend selection in one place and allows us to add more decoders later on!

The decoder is selected by the dataset.video_backend key and defaults to torchcodec.
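A minimal sketch of the dispatch idea described above, not the code merged in this PR. The registry, the `register_backend` helper, and the string-returning stub decoders are all hypothetical and exist only to make the example self-contained; the real decode_video_frames signature in lerobot may differ.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical type: a decoder takes a video path and timestamps (seconds)
# and returns one decoded item per timestamp. Real decoders return tensors.
Decoder = Callable[[str, List[float]], List[str]]

# Hypothetical registry mapping a backend name to its decoder function.
_BACKENDS: Dict[str, Decoder] = {}

DEFAULT_BACKEND = "torchcodec"  # the PR states torchcodec is the default


def register_backend(name: str) -> Callable[[Decoder], Decoder]:
    """Register a decoder under a backend name (illustrative helper)."""
    def wrap(fn: Decoder) -> Decoder:
        _BACKENDS[name] = fn
        return fn
    return wrap


@register_backend("torchcodec")
def _decode_torchcodec(video_path: str, timestamps: List[float]) -> List[str]:
    # Stub: a real implementation would decode frames with torchcodec here.
    return [f"torchcodec:{video_path}@{t:.2f}" for t in timestamps]


@register_backend("pyav")
def _decode_pyav(video_path: str, timestamps: List[float]) -> List[str]:
    # Stub: a real implementation would decode frames with PyAV here.
    return [f"pyav:{video_path}@{t:.2f}" for t in timestamps]


def decode_video_frames(
    video_path: str,
    timestamps: List[float],
    backend: Optional[str] = None,
) -> List[str]:
    """Single entry point that dispatches to the selected backend."""
    name = backend or DEFAULT_BACKEND
    try:
        decoder = _BACKENDS[name]
    except KeyError:
        raise ValueError(f"Unsupported video backend: {name!r}") from None
    return decoder(video_path, timestamps)
```

With a layout like this, adding a decoder means registering one more function, and callers keep using decode_video_frames unchanged.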

How it was tested

Tested and benchmarked the decoders on different datasets/policies.

How to checkout & try? (for the reviewer)

Just run the training script with a dataset containing videos to decode. Example:

python lerobot/scripts/train.py \
    --output_dir=outputs/train/act_aloha_insertion \
    --policy.type=act \
    --dataset.repo_id=lerobot/aloha_sim_insertion_human \
    --env.type=aloha \
    --env.task=AlohaInsertion-v0

Benchmarks

Ran one benchmark on the lerobot/aloha_sim_insertion_human_image dataset.

Comparison: PyAV vs TorchCodec (CPU)

| Metric | PyAV | TorchCodec-CPU |
| --- | --- | --- |
| Video to Images Load Time Ratio | 1.87 | 1.25 |
| Avg MSE | 5.14e-05 | 4.88e-05 |
| Avg PSNR | 43.17 | 43.37 |
| Avg SSIM | 0.995 | 0.995 |
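For readers unfamiliar with the MSE and PSNR rows, here is a minimal sketch of the standard definitions. It is not the benchmark script used for this PR. It assumes pixel values normalized to [0, 1] (the PR does not state the range) and takes frames as flat sequences of floats to stay dependency-free; SSIM is omitted because it needs a windowed implementation.

```python
import math
from typing import Sequence


def mse(reference: Sequence[float], decoded: Sequence[float]) -> float:
    """Mean squared error between two equally sized flattened frames."""
    if len(reference) != len(decoded) or len(reference) == 0:
        raise ValueError("frames must be non-empty and the same size")
    total = sum((r - d) ** 2 for r, d in zip(reference, decoded))
    return total / len(reference)


def psnr(
    reference: Sequence[float],
    decoded: Sequence[float],
    max_value: float = 1.0,  # assumption: pixels are normalized to [0, 1]
) -> float:
    """Peak signal-to-noise ratio in dB: 10 * log10(max_value^2 / MSE)."""
    err = mse(reference, decoded)
    if err == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(max_value ** 2 / err)
```

Lower MSE and higher PSNR both mean the decoded frame is closer to the reference image. An average of per-frame PSNR values is not, in general, equal to the PSNR computed from the average MSE.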

What's left

~~Remove/suppress libdav1d logs (they're noisy) -> there's no env variable to disable those for now but they'll be deactivated in the next version of torchcodec.~~

PR is in a good state ✅

jadechoghari, Mar 03 '25 06:03

Torchcodec consistently outperforms PyAV across all datasets and video codecs (encoders): it achieves lower MSE (better accuracy), higher PSNR (better quality), and higher SSIM (better perceptual similarity). This trend is evident across libsvtav1, libx264, and libx265, and it makes torchcodec the superior choice for both efficiency and quality. To reproduce the full results, check this link

jadechoghari, Mar 08 '25 08:03

Great! I guess cc @imstevenpmwork

jadechoghari, Mar 14 '25 14:03

Hello @jadechoghari, thanks for your contribution! This LGTM 😄

imstevenpmwork, Mar 14 '25 15:03