lerobot fix(deps): constrain PyAV version to resolve OpenCV-python ffmpeg version conflict

During the installation of lerobot, we resolve to the following dependency versions:

opencv-python == 4.11.0.86
av == 14.2.0
torchvision == 0.21.0

PyAV resolves to version 14.2.0, which relies on the latest version of ffmpeg.
OpenCV-python, however, ships with its own bundled build of ffmpeg.
Other dependencies in lerobot, such as torchvision, depend on PyAV. When PyAV and OpenCV-python are loaded in the same process, their incompatible ffmpeg versions lead to a runtime error.
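
To check which versions an environment actually resolved, a minimal sketch using only the standard library could look like the following. The helper name installed_versions is hypothetical and not part of lerobot; it only reads package metadata and does not import cv2 or av.

```python
from importlib import metadata


def installed_versions(names):
    """Return {distribution name: installed version, or None if not installed}."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Distribution names as listed in the resolved versions above.
    packages = ["opencv-python", "av", "torchvision"]
    for name, version in installed_versions(packages).items():
        print(f"{name} == {version}")
```

Running it before and after changing the constraint shows whether the resolver picked a different PyAV release.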

In summary, the current dependency setup makes it impossible for the resolved versions of PyAV and OpenCV-python to work together. This has caused multiple runtime errors, resulting in several GitHub issues and community-proposed workarounds. However, these solutions often introduce new issues elsewhere in the codebase. Relevant discussions and issues can be found here:

https://github.com/huggingface/lerobot/pull/757
https://github.com/huggingface/lerobot/issues/679
https://github.com/huggingface/lerobot/issues/742
https://github.com/huggingface/lerobot/pull/519

Additionally, some related discussions are ongoing in the respective projects:

https://github.com/pytorch/vision/issues/5940
https://github.com/PyAV-Org/PyAV/issues/978
https://github.com/opencv/opencv/issues/21952

After analyzing the situation, I’ve identified three potential solutions:

1. Use OpenCV-headless: This avoids the ffmpeg dependency and resolves the conflict. However, since we rely on imshow() and other GUI functionalities for debugging and examples, this option is not feasible.
2. Find compatible versions of OpenCV-python and PyAV: Identify versions of both libraries that use the same ffmpeg version, eliminating the conflict. This is the goal of the current PR.
3. Manually manage dependencies:
   - Allow the dependency manager to resolve the initial versions.
   - Install PyAV's ffmpeg dependencies (libavcodec-dev libavformat-dev libavdevice-dev libavfilter-dev libavutil-dev libswresample-dev libswscale-dev).
   - Reinstall PyAV using: pip install av --no-binary av --no-cache --force-reinstall, forcing it to build against the ffmpeg version already in the environment (e.g., the one bundled with OpenCV-python).
   - However, reinstalling this way attempts to build the latest PyAV version, which fails because the environment does not provide the ffmpeg version it requires.
   - If we want to build PyAV, we need to specify a version that can be compiled against the ffmpeg version from OpenCV. As of today, this version is av>=12.3.0,<13.0.0.
   - But if this version can be built successfully, we can skip the manual build process and simply specify this requirement in pyproject.toml, which effectively brings us back to solution 2. This is the change proposed in this PR.
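
To illustrate which PyAV releases fall inside the proposed range, here is a small sketch of the check av>=12.3.0,<13.0.0 written with the standard library only. The function names are hypothetical, and the parsing is a simplification: it looks only at the numeric release and ignores pre-release or local suffixes, so it is not a full PEP 440 implementation.

```python
import re


def release_tuple(version):
    """Parse the leading numeric release, e.g. '12.3.0' -> (12, 3, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version: {version!r}")
    parts = tuple(int(part) for part in match.group(0).split("."))
    # Pad short versions such as '12.3' so tuple comparison behaves as expected.
    return parts + (0,) * (3 - len(parts))


def satisfies_av_pin(version):
    """True if `version` lies in the range av>=12.3.0,<13.0.0 proposed here."""
    return (12, 3, 0) <= release_tuple(version) < (13, 0, 0)
```

With this sketch, the previously resolved 14.2.0 is rejected while 12.3.0 is accepted.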

The changes in this PR have passed the nightly and test CI workflows and have been verified on Linux using both conda and uv as package managers. However, this solution will only remain viable as long as the OpenCV-python version resolved during installation continues to use an ffmpeg version compatible with PyAV >=12.3.0,<13.0.0.

When OpenCV-python eventually updates its bundled ffmpeg version, we can relax the PyAV dependency constraint and specify a version range that aligns with the new ffmpeg version used by OpenCV.

Mar 20 '25 17:03 imstevenpmwork

Regarding testing these changes: installation is not the issue; the problem is OpenCV's imshow() calls. Did you test that with real hardware? In general, I really prefer not to cap dependency versions. It could be okay for now, but the cap must not remain in the release.

So I think the better solution to this issue is:

1. Remove pyav completely once torchcodec is distributed more easily and we're confident it's as easy to install.
2. Most importantly, remove cv2.imshow() calls. This is used in the teleop/dataset recording use case. Instead, we should either:
   - build a simple flask app and stream the images there for display. It would look a bit like the current visualize_dataset_html script
   - build a rerun app for real-time visualization (might be prettier)
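
For the streaming option, one common pattern is to serve frames as a multipart/x-mixed-replace (MJPEG) response. The sketch below shows only the framing step, under the assumption that frames are already JPEG-encoded bytes; the generator name mjpeg_parts and the boundary name are hypothetical, and nothing here is taken from the lerobot codebase.

```python
BOUNDARY = b"frame"  # arbitrary boundary name chosen for this sketch


def mjpeg_parts(jpeg_frames):
    """Wrap already-encoded JPEG frames as multipart/x-mixed-replace parts."""
    for jpeg in jpeg_frames:
        header = (
            b"--%s\r\nContent-Type: image/jpeg\r\nContent-Length: %d\r\n\r\n"
            % (BOUNDARY, len(jpeg))
        )
        # Each part is: boundary line, headers, blank line, payload, CRLF.
        yield header + jpeg + b"\r\n"
```

In a Flask view, a generator like this would typically be passed to Response(..., mimetype="multipart/x-mixed-replace; boundary=frame"), with each frame encoded beforehand (for example via cv2.imencode(".jpg", image), which does not need a GUI backend). The wiring is left out here because it depends on how the recording loop exposes its frames.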

I'm happy with both, as long as they fit our needs. Wanna start working on it?

Mar 21 '25 08:03 aliberts

To investigate the issue, I conducted tests using:

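1. A simple dummy app designed to reproduce the error easily:

import numpy as np
import cv2
import av  # unused; imported so PyAV's ffmpeg libraries load alongside OpenCV's

# Show a black 128x128 image and wait for a key press.
cv2.imshow("debug", np.zeros((128, 128, 3), dtype=np.uint8))
cv2.waitKey(0)

2. The testing scripts (which also use imshow()).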

In both cases, the process would hang, and the CPU usage would spike to 100%. However, after applying the fix, both scenarios ran smoothly without any issues.
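
Since the failure mode is a hang rather than an exception, one way to check for it automatically is to run the snippet in a child process with a timeout. The sketch below is an assumption about how that could be done, not the procedure used for this PR: run_with_timeout and REPRO are hypothetical names, the timeout value is arbitrary, and the snippet uses waitKey(1) instead of waitKey(0) so that a healthy environment exits without a key press.

```python
import subprocess
import sys

# Variant of the dummy app; waitKey(1) returns after about 1 ms, so a working
# environment finishes on its own instead of waiting for input.
REPRO = """
import numpy as np
import cv2
import av
cv2.imshow("debug", np.zeros((128, 128, 3), dtype=np.uint8))
cv2.waitKey(1)
cv2.destroyAllWindows()
"""


def run_with_timeout(code, timeout_s):
    """Run `code` in a fresh interpreter; return 'ok', 'failed', or 'hung'."""
    try:
        result = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so nothing is left spinning.
        return "hung"
    return "ok" if result.returncode == 0 else "failed"
```

Calling run_with_timeout(REPRO, 20.0) would then report "hung" in an affected environment and "ok" in a fixed one; "failed" covers unrelated errors such as a missing display or package.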

I completely agree that having imshow() in our codebase isn’t ideal. As I mentioned in my initially proposed solutions, it’s primarily used for debugging and examples. Moving forward, I’d like to transition to using OpenCV in headless mode to avoid such dependencies.

For now, I believe capping this dependency is a reasonable solution for our current needs. This approach is not only recommended by OpenCV developers (as seen in this GitHub comment), but it also aligns with the fact that these two projects operate independently. We can work towards transitioning to headless mode and removing imshow() when it becomes a priority closer to the release. However, I think this fix will provide a better experience for users today—especially since we receive a new issue related to this problem almost every week.

@Cadene What are your thoughts?

Mar 21 '25 09:03 imstevenpmwork

Just tested with a fresh conda environment; unfortunately, cv2.imshow() still hangs for me:

Ubuntu 24.04.2
opencv-python == 4.11.0.86
av == 12.3.0
torchvision == 0.21.0

Doing it the old way

conda install -y -c conda-forge ffmpeg
pip uninstall -y opencv-python
conda install -y -c conda-forge "opencv>=4.10.0"

followed by

conda install -c conda-forge jpeg libtiff

works: the imshow window appears and python lerobot/scripts/control_robot.py --robot.type=so100 --control.type=teleoperate runs as expected.

Mar 26 '25 05:03 kuz

Hello @kuz,

Thanks for reporting this! The documentation needed an update, and your report helped us spot that.

Could you do me a favor and test the following in a fresh conda environment and report if it runs as expected?

conda create -y -n lerobot python=3.10
conda activate lerobot
conda install ffmpeg
pip install --no-binary=av -e .
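
Before running the pip step, it can help to confirm that the conda-installed ffmpeg is the one visible in the active environment. The following is a rough sketch with a hypothetical helper name; it only inspects the ffmpeg command-line tool on PATH, which is a proxy and does not prove that the development libraries PyAV builds against are the same version.

```python
import shutil
import subprocess


def ffmpeg_version_line(executable="ffmpeg"):
    """Return the first line of `<executable> -version`, or None if not on PATH."""
    path = shutil.which(executable)
    if path is None:
        return None
    result = subprocess.run([path, "-version"], capture_output=True, text=True)
    lines = result.stdout.splitlines()
    return lines[0] if lines else None


if __name__ == "__main__":
    print(ffmpeg_version_line() or "ffmpeg not found on PATH")
```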

I’ll open a PR to get this updated 😄 Here it is: https://github.com/huggingface/lerobot/pull/907

Mar 26 '25 09:03 imstevenpmwork
