torchvision's `ffmpeg` build has version incompatibility with `openh264`
🚀 The feature
I want pytorch/torchvision's `ffmpeg` build to support `h264` encoding.
This could perhaps be achieved by bumping the `openh264` version (assuming that fixes the error, see below) or by compiling `ffmpeg` with `libx264`.
Motivation, pitch
I work with videos on a daily basis and use `ffmpeg` a lot. I like using `conda` environments and installing `torch`/`torchvision` in them. `torchvision` comes with an `ffmpeg` that I would like to use from the CLI as well. This would allow sharing a single environment across projects that require `ffmpeg` in pre-processing.
Alternatives
Transcoding can be done with another `ffmpeg` installation on the machine, after deactivating the environment that contains `torchvision`.
Additional context
One may think that because `torchvision`'s `ffmpeg` is built with `libopenh264` (`--enable-libopenh264`), transcoding to `h264` should be supported. However, when I attempt to transcode a video with `ffmpeg` from the CLI, it fails.
Here is an MWE (a video):
ffmpeg -y -i hammer.mp4 -vcodec h264 new_hammer.mp4
[...]
[libopenh264 @ 0x55923c8f86c0] Incorrect library version loaded
Error initializing output stream 0:0 -- Error while opening encoder for output stream #0:0 - maybe incorrect parameters such as bit_rate, rate, width or height
[...]
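To see which H.264 encoders a given `ffmpeg` binary advertises, its `-encoders` listing can be filtered. The following is a minimal sketch, not part of the original report: the helper names `parse_h264_encoders` and `list_h264_encoders` are hypothetical, and it assumes the usual listing layout in which the encoder name is the second whitespace-separated field. An encoder being listed only means it was compiled in. As the error above shows, it can still fail at runtime when the loaded library version does not match.

```python
import shutil
import subprocess
from typing import List


def parse_h264_encoders(listing: str) -> List[str]:
    """Return names of video encoders whose name contains '264'.

    Assumes rows shaped like "<flags> <name> <description...>",
    where the flags field of a video encoder starts with "V".
    """
    names = []
    for line in listing.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0].startswith("V") and "264" in fields[1]:
            names.append(fields[1])
    return names


def list_h264_encoders(ffmpeg: str = "ffmpeg") -> List[str]:
    """Run `ffmpeg -hide_banner -encoders` and filter the listing."""
    path = shutil.which(ffmpeg)
    if path is None:
        raise FileNotFoundError(f"{ffmpeg!r} was not found on PATH")
    result = subprocess.run(
        [path, "-hide_banner", "-encoders"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_h264_encoders(result.stdout)
```

Calling `list_h264_encoders()` inside the activated environment and again outside it shows whether the two `ffmpeg` binaries were built with different encoders.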
I tried to install `libx264` in the `conda` environment but it didn't work (because `ffmpeg` was not compiled with it?).
I also tried `read_video()` --> `write_video(..., video_codec='libx264')`, and that did not fail:
# pip install av==8.1.0
import torchvision
rgb, audio, meta = torchvision.io.read_video('./hammer.mp4')
vfps = meta['video_fps']
afps = meta['audio_fps']
video_codec = 'libx264'
torchvision.io.write_video('./tv_saved_hammer.mp4', rgb, vfps, video_codec,
audio_array=audio, audio_fps=afps, audio_codec='aac')
# './tv_saved_hammer.mp4' is `h264`.
Interesting, why?
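A plausible explanation, which is an assumption and not confirmed in this report: `torchvision.io.write_video` encodes through PyAV, and PyAV wheels installed with `pip` usually ship their own FFmpeg libraries, so they do not depend on the conda `ffmpeg` build or its `openh264`. A minimal sketch to check which codec names PyAV's own FFmpeg exposes (the helper name `pyav_codec_status` is hypothetical):

```python
from typing import Dict, Optional, Sequence


def pyav_codec_status(
    names: Sequence[str] = ("libx264", "libopenh264"),
) -> Optional[Dict[str, bool]]:
    """Report whether each codec name is known to the FFmpeg that PyAV links.

    Returns None when PyAV is not installed in the current environment.
    """
    try:
        import av  # PyAV, the backend used by torchvision.io.write_video
    except ImportError:
        return None
    # av.codecs_available is the set of codec names PyAV's FFmpeg registers.
    return {name: name in av.codecs_available for name in names}
```

If this reports `libx264` as available while the CLI `ffmpeg` in the same environment does not list it, the two are using different FFmpeg builds.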
I also tried to update `openh264` to the latest version with `conda install openh264=2.3.1 -c conda-forge`, but `conda` downgrades `torchvision` to 0.13.1 from the `main` `conda` channel and removes `ffmpeg`. Doing this with `--no-deps` makes `ffmpeg` fail with `libopenh264.so.5: cannot open shared object file: No such file or directory`.
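On Linux, the shared libraries that a binary cannot resolve can be listed with `ldd`. The following is a minimal sketch under that assumption; the helper names `parse_missing_libraries` and `missing_libraries` are hypothetical, and it relies on `ldd` printing unresolved entries in the form `libname => not found`.

```python
import shutil
import subprocess
from typing import List


def parse_missing_libraries(ldd_output: str) -> List[str]:
    """Return library names that `ldd` reported as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        # Unresolved rows look like: "libfoo.so.1 => not found"
        if "=>" in line and "not found" in line:
            missing.append(line.split("=>")[0].strip())
    return missing


def missing_libraries(binary: str = "ffmpeg") -> List[str]:
    """Run `ldd` on the binary found on PATH and list unresolved libraries."""
    path = shutil.which(binary)
    if path is None:
        raise FileNotFoundError(f"{binary!r} was not found on PATH")
    result = subprocess.run(
        ["ldd", path], capture_output=True, text=True, check=False
    )
    return parse_missing_libraries(result.stdout)
```

In the situation described above, `missing_libraries()` would be expected to include `libopenh264.so.5`, since that is the file the loader could not open.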
Versions:
PyTorch version: 2.0.0
Is debug build: False
CUDA used to build PyTorch: 11.8
ROCM used to build PyTorch: N/A
OS: Ubuntu 18.04.6 LTS (x86_64)
GCC version: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Clang version: Could not collect
CMake version: version 3.10.2
Libc version: glibc-2.27
Python version: 3.8.16 (default, Mar 2 2023, 03:21:46) [GCC 11.2.0] (64-bit runtime)
Python platform: Linux-5.4.0-144-generic-x86_64-with-glibc2.17
Is CUDA available: True
CUDA runtime version: Could not collect
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration:
GPU 0: NVIDIA GeForce RTX 2080 Ti
GPU 1: NVIDIA GeForce RTX 2080 Ti
GPU 2: NVIDIA GeForce RTX 2080 Ti
Nvidia driver version: 525.89.02
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True
CPU:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 32
On-line CPU(s) list: 0-31
Thread(s) per core: 2
Core(s) per socket: 16
Socket(s): 1
NUMA node(s): 1
Vendor ID: AuthenticAMD
CPU family: 23
Model: 8
Model name: AMD Ryzen Threadripper 2950X 16-Core Processor
Stepping: 2
CPU MHz: 1790.173
CPU max MHz: 3500.0000
CPU min MHz: 2200.0000
BogoMIPS: 6985.72
Virtualization: AMD-V
L1d cache: 32K
L1i cache: 64K
L2 cache: 512K
L3 cache: 8192K
NUMA node0 CPU(s): 0-31
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36
clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc
rep_good nopl nonstop_tsc cpuid extd_apicid amd_dcm aperfmperf pni pclmulqdq monitor ssse3 fma
cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic
cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit wdt tce topoext perfctr_core
perfctr_nb bpext perfctr_llc mwaitx cpb hw_pstate sme ssbd sev ibpb vmmcall fsgsbase bmi1 avx2
smep bmi2 rdseed adx smap clflushopt sha_ni xsaveopt xsavec xgetbv1 xsaves clzero irperf
xsaveerptr arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists
pausefilter pfthreshold avic v_vmsave_vmload vgif overflow_recov succor smca
Versions of relevant libraries:
[pip3] numpy==1.23.5
[pip3] torch==2.0.0
[pip3] torchaudio==2.0.0
[pip3] torchvision==0.15.0
[pip3] triton==2.0.0
[conda] blas 1.0 mkl
[conda] ffmpeg 4.3 hf484d3e_0 pytorch
[conda] mkl 2021.4.0 h06a4308_640
[conda] mkl-service 2.4.0 py38h7f8727e_0
[conda] mkl_fft 1.3.1 py38hd3c417c_0
[conda] mkl_random 1.2.2 py38h51133e4_0
[conda] numpy 1.23.5 py38h14f4228_0
[conda] numpy-base 1.23.5 py38h31eccc5_0
[conda] pytorch 2.0.0 py3.8_cuda11.8_cudnn8.7.0_0 pytorch
[conda] pytorch-cuda 11.8 h7e8668a_3 pytorch
[conda] pytorch-mutex 1.0 cuda pytorch
[conda] torchaudio 2.0.0 py38_cu118 pytorch
[conda] torchtriton 2.0.0 py38 pytorch
[conda] torchvision 0.15.0 py38_cu118 pytorch
Also
openh264 2.1.1 h4ff587b_0
ffmpeg 4.3 hf484d3e_0 pytorch
`ffmpeg` header:
ffmpeg version 4.3 Copyright (c) 2000-2020 the FFmpeg developers
built with gcc 7.3.0 (crosstool-NG 1.23.0.449-a04d0)
configuration: --prefix=/opt/conda/conda-bld/ffmpeg_1597178665428/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placeh
--cc=/opt/conda/conda-bld/ffmpeg_1597178665428/_build_env/bin/x86_64-conda_cos6-linux-gnu-cc
--disable-doc --disable-openssl --enable-avresample --enable-gnutls --enable-hardcoded-tables
--enable-libfreetype --enable-libopenh264 --enable-pic --enable-pthreads --enable-shared
--disable-static --enable-version3 --enable-zlib --enable-libmp3lame
libavutil 56. 51.100 / 56. 51.100
libavcodec 58. 91.100 / 58. 91.100
libavformat 58. 45.100 / 58. 45.100
libavdevice 58. 10.100 / 58. 10.100
libavfilter 7. 85.100 / 7. 85.100
libavresample 4. 0. 0 / 4. 0. 0
libswscale 5. 7.100 / 5. 7.100
libswresample 3. 7.100 / 3. 7.100
Hyper fast Audio and Video encoder
I encountered the same problem in my mamba envs. I have a hack that gets everything working together, but it isn't nice...
ffmpeg sometimes seems to work if, after installing it, I install torchvision without the -y option, but never in Docker images, where I run:
RUN conda config --add channels pytorch && conda config --set channel_priority strict \
&& mamba install -y pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
I set channel_priority strict since otherwise some of the pytorch packages (I think pytorch3d) and some GitHub repos overwrite the CUDA-enabled pytorch with pytorch without CUDA from the conda-forge channel.
I built a series of Docker images trying to install ffmpeg in all sorts of ways.
I first install ffmpeg from apt on Ubuntu 22.04, which gives 4.4.2-0ubuntu0.22.04.1 and works well.
I then run
...--set the channel_priority flexible && mamba install -y ffmpeg
which installs ffmpeg 5.1.2, and that works well in everything I tried.
But if I install torchvision after this, ffmpeg reverts to ffmpeg 4.3 with its broken lib... which I can't remove without removing torchvision.
I can force the install with conda config --set channel_priority flexible && mamba install ffmpeg=5.1.2
But even though it does not mention torchvision as one of the changes...
>>> import torchvision
ModuleNotFoundError: No module named 'torchvision'
I can finally make it work when I pip install torchvision afterwards! But that's quite hacky: a Dockerfile installing torchvision twice and ffmpeg twice...
Actually... it can be made to work together quite simply by installing in this order...
RUN conda config --add channels pytorch && conda config --set channel_priority flexible \
&& source activate && mamba init && mamba install -y ffmpeg
RUN conda config --set channel_priority strict \
&& mamba install -y pytorch torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
RUN pip install --upgrade pip && pip install torchvision
I just did the pytorch and torchvision steps manually on top of an image with ffmpeg.
Corrections to the comment:
- ffmpeg can be made to work, but torchvision on PyPI seems to be compiled for CUDA 11.7, while the latest pytorch on conda is built for CUDA 11.8 (which I need for a downstream project).
- But finally it seems to work: git clone ... and python setup.py install appears to compile against the correct CUDA version.
I get the same problem. I solved it by replacing the torchvision `ffmpeg` with the system-bundled version of `ffmpeg`.
- Check the ffmpeg version, for me it's `ffmpeg version 4.3`:
ffmpeg
- Install the apt version of ffmpeg:
sudo apt install ffmpeg
- Find the location of the conda version of ffmpeg, in my case it's `/opt/conda/bin/ffmpeg`:
which ffmpeg
- Remove the conda ffmpeg:
rm /opt/conda/bin/ffmpeg
- Find the apt version of ffmpeg, in my case it's `/usr/bin/ffmpeg`:
dpkg -L ffmpeg
- Link the apt version ffmpeg to conda's:
ln -s /usr/bin/ffmpeg /opt/conda/bin/ffmpeg
- Check the version of ffmpeg by typing `ffmpeg`, for me it's `ffmpeg version 3.4.11-0ubuntu0.1` now.
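After the steps above, it can be useful to confirm which file the `ffmpeg` name actually resolves to. The following is a minimal sketch; the helper name `resolve_executable` is hypothetical, and the paths mentioned in the note after it are the ones from this comment and will differ between machines.

```python
import os
import shutil
from typing import Optional, Tuple


def resolve_executable(
    name: str = "ffmpeg", path: Optional[str] = None
) -> Optional[Tuple[str, str]]:
    """Return (path found on PATH, path after following symlinks).

    `path` optionally overrides the PATH string to search.
    Returns None when no executable with that name is found.
    """
    found = shutil.which(name, path=path)
    if found is None:
        return None
    return found, os.path.realpath(found)
```

With the symlink from the steps above in place, `resolve_executable()` would be expected to return `/opt/conda/bin/ffmpeg` as the first element and `/usr/bin/ffmpeg` as the second, provided `/usr/bin/ffmpeg` is not itself a symlink.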