torchvision's `ffmpeg` build has version incompatibility with `openh264`

🚀 The feature

I want pytorch/torchvision's ffmpeg build to support H.264 encoding.

This could perhaps be achieved by bumping the openh264 version (assuming that fixes the error, see below) or by compiling ffmpeg with libx264.

Motivation, pitch

I work with videos on a daily basis and use ffmpeg a lot. I like using conda environments and installing torch/torchvision in them. torchvision comes with an ffmpeg that I would like to use from the CLI as well. This would allow sharing a single environment across projects that require ffmpeg for pre-processing.

Alternatives

Transcoding can be done with another ffmpeg installation on the machine, after deactivating the environment that contains torchvision.

Additional context

Since torchvision's ffmpeg is built with libopenh264 (--enable-libopenh264), one would expect transcoding to h264 to be supported. However, when I try to transcode a video with the ffmpeg CLI, it fails. Here is an MWE (a video):

ffmpeg -y -i hammer.mp4 -vcodec h264 new_hammer.mp4
[...]
[libopenh264 @ 0x55923c8f86c0] Incorrect library version loaded
Error initializing output stream 0:0 -- Error while opening encoder for output stream #0:0 - maybe incorrect parameters such as bit_rate, rate, width or height
[...]
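
To see which H.264 encoders a given ffmpeg binary exposes, one can parse the output of `ffmpeg -encoders`. The following is a minimal sketch, not part of the original report: the helper names `parse_h264_encoders` and `list_h264_encoders` are hypothetical, and the parsing assumes the usual listing layout in which each encoder row starts with a six-character flags column and wrapper encoders end with `(codec h264)`.

```python
import shutil
import subprocess


def parse_h264_encoders(encoders_text):
    """Return encoder names whose row in `ffmpeg -encoders` ends with '(codec h264)'."""
    names = []
    for line in encoders_text.splitlines():
        parts = line.split()
        # Assumed row layout: " V..... libopenh264   OpenH264 H.264 ... (codec h264)"
        if len(parts) < 2 or len(parts[0]) != 6 or parts[0][0] not in "VAS":
            continue
        if line.rstrip().endswith("(codec h264)"):
            names.append(parts[1])
    return names


def list_h264_encoders(ffmpeg="ffmpeg"):
    """Run `<ffmpeg> -encoders` and return the H.264 encoder names, or [] if not found."""
    exe = shutil.which(ffmpeg)
    if exe is None:
        return []
    result = subprocess.run(
        [exe, "-hide_banner", "-encoders"],
        capture_output=True,
        text=True,
        check=False,
    )
    return parse_h264_encoders(result.stdout)


if __name__ == "__main__":
    print(list_h264_encoders())
```

Note that an encoder being listed does not guarantee that it initializes: in the log above libopenh264 is compiled in, yet it fails at runtime with "Incorrect library version loaded".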

I tried to install libx264 in the conda environment but it didn't work (because ffmpeg was not compiled with it?).

I also tried read_video() --> write_video(..., video_codec='libx264') and, unexpectedly, it did not fail:

# pip install av==8.1.0
import torchvision
rgb, audio, meta = torchvision.io.read_video('./hammer.mp4')
vfps = meta['video_fps']
afps = meta['audio_fps']
video_codec = 'libx264'
torchvision.io.write_video('./tv_saved_hammer.mp4', rgb, vfps, video_codec,
                           audio_array=audio, audio_fps=afps, audio_codec='aac')
# './tv_saved_hammer.mp4' is `h264`.

Interesting, why?
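
One way to investigate is to ask PyAV which FFmpeg libraries it is linked against and which encoders it can see. If PyAV was installed from a pip wheel, it may be using its own bundled FFmpeg rather than the conda one; that is an assumption and has not been verified for this environment. A minimal sketch, where `h264_encoders_in` and `pyav_report` are hypothetical helper names and the list of encoder names is only a small hand-picked selection:

```python
# Hand-picked H.264 encoder names to look for (not exhaustive).
H264_ENCODER_NAMES = ("libx264", "libx264rgb", "libopenh264", "h264_nvenc")


def h264_encoders_in(codec_names):
    """Filter an iterable of codec names down to the known H.264 encoders."""
    available = set(codec_names)
    return [name for name in H264_ENCODER_NAMES if name in available]


def pyav_report():
    """Return (library_versions, h264_encoders) for PyAV, or None if PyAV is missing."""
    try:
        import av
    except ImportError:
        return None
    # av.library_versions maps library names (e.g. libavcodec) to version tuples;
    # av.codecs_available is the set of codec names PyAV's FFmpeg build knows.
    return dict(av.library_versions), h264_encoders_in(av.codecs_available)
```

Calling `pyav_report()` inside the environment and comparing the libavcodec version with the one in the ffmpeg CLI banner would show whether the two are the same build.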

I also tried to update openh264 to the latest version with conda install openh264=2.3.1 -c conda-forge, but conda then downgrades torchvision to 0.13.1 from the main conda channel and removes ffmpeg. Doing this with --no-deps makes ffmpeg fail with libopenh264.so.5: cannot open shared object file: No such file or directory.
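
To check which libopenh264 shared objects are actually present in the environment after such a change, the lib directory can be listed. This is a minimal sketch under the assumption that the libraries live in `<prefix>/lib`, as is usual for conda environments on Linux; `find_openh264_libs` is a hypothetical helper name.

```python
import os
from pathlib import Path


def find_openh264_libs(prefix):
    """Return sorted file names matching libopenh264.so* under <prefix>/lib."""
    lib_dir = Path(prefix) / "lib"
    if not lib_dir.is_dir():
        return []
    return sorted(p.name for p in lib_dir.glob("libopenh264.so*"))


if __name__ == "__main__":
    # CONDA_PREFIX is set by `conda activate`; fall back to the current directory.
    print(find_openh264_libs(os.environ.get("CONDA_PREFIX", ".")))
```

If libopenh264.so.5 is missing from the result, that matches the "cannot open shared object file" error above.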

Versions:

PyTorch version: 2.0.0
Is debug build: False
CUDA used to build PyTorch: 11.8
ROCM used to build PyTorch: N/A

OS: Ubuntu 18.04.6 LTS (x86_64)
GCC version: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Clang version: Could not collect
CMake version: version 3.10.2
Libc version: glibc-2.27

Python version: 3.8.16 (default, Mar  2 2023, 03:21:46)  [GCC 11.2.0] (64-bit runtime)
Python platform: Linux-5.4.0-144-generic-x86_64-with-glibc2.17
Is CUDA available: True
CUDA runtime version: Could not collect
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: 
GPU 0: NVIDIA GeForce RTX 2080 Ti
GPU 1: NVIDIA GeForce RTX 2080 Ti
GPU 2: NVIDIA GeForce RTX 2080 Ti

Nvidia driver version: 525.89.02
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              32
On-line CPU(s) list: 0-31
Thread(s) per core:  2
Core(s) per socket:  16
Socket(s):           1
NUMA node(s):        1
Vendor ID:           AuthenticAMD
CPU family:          23
Model:               8
Model name:          AMD Ryzen Threadripper 2950X 16-Core Processor
Stepping:            2
CPU MHz:             1790.173
CPU max MHz:         3500.0000
CPU min MHz:         2200.0000
BogoMIPS:            6985.72
Virtualization:      AMD-V
L1d cache:           32K
L1i cache:           64K
L2 cache:            512K
L3 cache:            8192K
NUMA node0 CPU(s):   0-31
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 
clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc 
rep_good nopl nonstop_tsc cpuid extd_apicid amd_dcm aperfmperf pni pclmulqdq monitor ssse3 fma 
cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic 
cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit wdt tce topoext perfctr_core 
perfctr_nb bpext perfctr_llc mwaitx cpb hw_pstate sme ssbd sev ibpb vmmcall fsgsbase bmi1 avx2 
smep bmi2 rdseed adx smap clflushopt sha_ni xsaveopt xsavec xgetbv1 xsaves clzero irperf 
xsaveerptr arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists 
pausefilter pfthreshold avic v_vmsave_vmload vgif overflow_recov succor smca

Versions of relevant libraries:
[pip3] numpy==1.23.5
[pip3] torch==2.0.0
[pip3] torchaudio==2.0.0
[pip3] torchvision==0.15.0
[pip3] triton==2.0.0
[conda] blas                      1.0                         mkl  
[conda] ffmpeg                    4.3                  hf484d3e_0    pytorch
[conda] mkl                       2021.4.0           h06a4308_640  
[conda] mkl-service               2.4.0            py38h7f8727e_0  
[conda] mkl_fft                   1.3.1            py38hd3c417c_0  
[conda] mkl_random                1.2.2            py38h51133e4_0  
[conda] numpy                     1.23.5           py38h14f4228_0  
[conda] numpy-base                1.23.5           py38h31eccc5_0  
[conda] pytorch                   2.0.0           py3.8_cuda11.8_cudnn8.7.0_0    pytorch
[conda] pytorch-cuda              11.8                 h7e8668a_3    pytorch
[conda] pytorch-mutex             1.0                        cuda    pytorch
[conda] torchaudio                2.0.0                py38_cu118    pytorch
[conda] torchtriton               2.0.0                      py38    pytorch
[conda] torchvision               0.15.0               py38_cu118    pytorch

Also

openh264                  2.1.1                h4ff587b_0
ffmpeg                    4.3                  hf484d3e_0    pytorch

ffmpeg header:

ffmpeg version 4.3 Copyright (c) 2000-2020 the FFmpeg developers
  built with gcc 7.3.0 (crosstool-NG 1.23.0.449-a04d0)
  configuration: --prefix=/opt/conda/conda-bld/ffmpeg_1597178665428/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placeh 
--cc=/opt/conda/conda-bld/ffmpeg_1597178665428/_build_env/bin/x86_64-conda_cos6-linux-gnu-cc 
--disable-doc --disable-openssl --enable-avresample --enable-gnutls --enable-hardcoded-tables 
--enable-libfreetype --enable-libopenh264 --enable-pic --enable-pthreads --enable-shared 
--disable-static --enable-version3 --enable-zlib --enable-libmp3lame
  libavutil      56. 51.100 / 56. 51.100
  libavcodec     58. 91.100 / 58. 91.100
  libavformat    58. 45.100 / 58. 45.100
  libavdevice    58. 10.100 / 58. 10.100
  libavfilter     7. 85.100 /  7. 85.100
  libavresample   4.  0.  0 /  4.  0.  0
  libswscale      5.  7.100 /  5.  7.100
  libswresample   3.  7.100 /  3.  7.100
Hyper fast Audio and Video encoder
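
The external libraries a build was configured with can be read off the `configuration:` line of this banner. A minimal sketch that extracts the `--enable-lib*` flags; `enabled_libraries` is a hypothetical helper name, and the regular expression assumes flags are separated by whitespace.

```python
import re


def enabled_libraries(banner):
    """Extract the names from --enable-lib* flags in an ffmpeg banner string."""
    return sorted(set(re.findall(r"--enable-(lib[\w-]+)", banner)))


if __name__ == "__main__":
    # Flags copied from the banner above.
    config = (
        "--disable-doc --disable-openssl --enable-avresample --enable-gnutls "
        "--enable-hardcoded-tables --enable-libfreetype --enable-libopenh264 "
        "--enable-pic --enable-pthreads --enable-shared --disable-static "
        "--enable-version3 --enable-zlib --enable-libmp3lame"
    )
    print(enabled_libraries(config))
```

For the banner above this yields libfreetype, libmp3lame and libopenh264. libx264 is not among them, which would be consistent with installing libx264 into the environment having no effect on this ffmpeg.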

Apr 10 '23 07:04 v-iashin

I encountered the same problem in my mamba envs. I have a hack to get everything working together, but it isn't nice...

ffmpeg sometimes seems to work if, after installing it, I install torchvision without the -y option, but it never works in Docker images, which is where I run it.

RUN conda config --add channels pytorch && conda config --set channel_priority strict \
    && mamba install -y pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia

I set channel_priority to strict because otherwise some of the PyTorch packages (I think pytorch3d) and some GitHub repos overwrite the CUDA-enabled pytorch with a pytorch without CUDA from the conda-forge channel.

I built a series of Docker images, trying to install ffmpeg in all sorts of ways.
I first installed ffmpeg from apt on Ubuntu 22.04, which provides 4.4.2-0ubuntu0.22.04.1 and works well.

I then ran
... --set channel_priority flexible && mamba install -y ffmpeg, which installs ffmpeg 5.1.2. From everything I tried, it works well.

But if I install torchvision after this, ffmpeg reverts to ffmpeg 4.3 with its broken lib, which I can't remove without removing torchvision.

I can force the install with conda config --set channel_priority flexible && mamba install ffmpeg=5.1.2, but even though it does not mention torchvision as one of the changes...

>>> import torchvision
 ModuleNotFoundError: No module named 'torchvision'

I can finally make it work by running pip install torchvision afterwards, but that makes for quite a hacky Dockerfile, installing torchvision twice and ffmpeg twice...

Apr 26 '23 18:04 xvdp

Actually... it can be made to work together quite simply by installing in this order...

RUN conda config --add channels pytorch && conda config --set channel_priority flexible \
    && source activate && mamba init && mamba install -y  ffmpeg 

RUN conda config --set channel_priority strict \
    && mamba install -y pytorch torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia

RUN pip install --upgrade pip && pip install torchvision

I just did the pytorch and torchvision steps manually on top of an image with ffmpeg.

Corrections to the comment:

ffmpeg can be made to work, but torchvision on PyPI seems to be compiled for CUDA 11.7, while the latest pytorch on conda is built for CUDA 11.8 (which I need for a downstream project).
But finally it seems to work: git clone ... and python setup.py install appears to compile against the correct CUDA version.

Apr 26 '23 19:04 xvdp

I had the same problem. I solved it by replacing torchvision's ffmpeg with the system version of ffmpeg.

Check the ffmpeg version, for me it's ffmpeg version 4.3

ffmpeg

Install the apt version of ffmpeg:

sudo apt install ffmpeg

Find the location of conda version ffmpeg, in my case it's /opt/conda/bin/ffmpeg:

which ffmpeg

Remove the conda ffmpeg:

rm /opt/conda/bin/ffmpeg

Find the apt version ffmpeg, in my case it's /usr/bin/ffmpeg:

dpkg -L ffmpeg

Link the apt version ffmpeg to conda's:

ln -s /usr/bin/ffmpeg /opt/conda/bin/ffmpeg

Check the version of ffmpeg by typing ffmpeg; for me it's now ffmpeg version 3.4.11-0ubuntu0.1.
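
Before and after the steps above, it can help to see every ffmpeg on PATH, in lookup order, together with the file each one resolves to. A minimal sketch; `ffmpeg_candidates` is a hypothetical helper name, and the paths mentioned in the comments are the ones from this comment, not something the code depends on.

```python
import os
from pathlib import Path


def ffmpeg_candidates(path_value, name="ffmpeg"):
    """Return (path, resolved_path) for every executable `name` on a PATH-style string."""
    found = []
    for entry in path_value.split(os.pathsep):
        if not entry:
            continue
        candidate = Path(entry) / name
        if candidate.is_file() and os.access(candidate, os.X_OK):
            # resolve() follows symlinks, e.g. /opt/conda/bin/ffmpeg -> /usr/bin/ffmpeg
            found.append((str(candidate), str(candidate.resolve())))
    return found


if __name__ == "__main__":
    for path, target in ffmpeg_candidates(os.environ.get("PATH", "")):
        print(path, "->", target)
```

The first entry printed is the binary the shell runs. After the symlink step, the conda entry would be expected to resolve to the apt binary.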

Nov 09 '23 03:11 kexul
