
Fails to build with ffmpeg 8

Open lfos opened this issue 3 months ago • 5 comments

🐛 Describe the bug

Attempting to rebuild our torchvision packages at Arch Linux against ffmpeg 8 fails with:

/build/torchvision/src/python-vision-0.23.0/torchvision/csrc/io/decoder/video_stream.cpp: In member function ‘virtual void ffmpeg::VideoStream::setHeader(ffmpeg::DecoderHeader*, bool)’:
/build/torchvision/src/python-vision-0.23.0/torchvision/csrc/io/decoder/video_stream.cpp:125:32: error: ‘AVFrame’ {aka ‘struct AVFrame’} has no member named ‘key_frame’
  125 |     header->keyFrame = frame_->key_frame;
      |                                ^~~~~~~~~

Versions

0.23.0, but this does not seem to be fixed on master either.

lfos avatar Sep 26 '25 18:09 lfos

From what I can tell, this doesn't seem to be addressed by the pending PR https://github.com/pytorch/vision/pull/9231.

lfos avatar Sep 26 '25 18:09 lfos

Hi @lfos, we have deprecated the C++ decoder part of torchvision, and starting with 0.24, which we'll release in a few weeks, it will no longer be built by default.

NicolasHug avatar Sep 29 '25 08:09 NicolasHug

@NicolasHug - Thanks for the update! Since the release is still a few weeks away, is there any change we can cherry-pick so 0.23.0 can be built against ffmpeg 8? For context, torchvision is one of the last remaining packages blocking our ffmpeg 8 rebuild at Arch Linux [1].

[1] https://archlinux.org/todo/ffmpeg-80-and-friends/

lfos avatar Sep 29 '25 12:09 lfos

FWIW, the following patch seems to fix the build:

--- a/torchvision/csrc/io/decoder/video_stream.cpp	2025-09-29 09:28:57.678370404 -0400
+++ b/torchvision/csrc/io/decoder/video_stream.cpp	2025-09-29 09:29:51.544541164 -0400
@@ -122,7 +122,7 @@ int VideoStream::copyFrameBytes(ByteStor
 void VideoStream::setHeader(DecoderHeader* header, bool flush) {
   Stream::setHeader(header, flush);
   if (!flush) { // no frames for video flush
-    header->keyFrame = frame_->key_frame;
+    header->keyFrame = !!(frame_->flags & AV_FRAME_FLAG_KEY);
     header->fps = av_q2d(av_guess_frame_rate(
         inputCtx_, inputCtx_->streams[format_.stream], nullptr));
   }
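
To make the change easier to reason about, here is a minimal sketch of the same bit test in isolation. It is not the torchvision code: `FakeFrame`, `isKeyFrame`, and `checkIsKeyFrame` are hypothetical names, and `AV_FRAME_FLAG_KEY` is redefined locally (with an assumed bit value) only so the snippet compiles without the FFmpeg headers.

```cpp
#include <cassert>

// Stand-in for the real macro from <libavutil/frame.h>, so the sketch
// compiles without FFmpeg installed. The bit value is an assumption here.
#ifndef AV_FRAME_FLAG_KEY
#define AV_FRAME_FLAG_KEY (1 << 1)
#endif

// Hypothetical stand-in for AVFrame; only the field used below is modeled.
struct FakeFrame {
  int flags;
};

// Hypothetical helper mirroring the patched line: test the key-frame bit and
// collapse the result to a bool, the same effect as the `!!` in the patch.
inline bool isKeyFrame(const FakeFrame& frame) {
  return (frame.flags & AV_FRAME_FLAG_KEY) != 0;
}

// Small self-check covering set, unset, and mixed flag values.
inline void checkIsKeyFrame() {
  assert(isKeyFrame(FakeFrame{AV_FRAME_FLAG_KEY}));
  assert(!isKeyFrame(FakeFrame{0}));
  assert(isKeyFrame(FakeFrame{AV_FRAME_FLAG_KEY | (1 << 2)}));
  assert(!isKeyFrame(FakeFrame{1 << 2}));
}
```

The normalization matters because the raw result of `flags & AV_FRAME_FLAG_KEY` is the bit's value rather than 1, so storing it unmodified in an integer field would change what callers see. If older FFmpeg releases still need to build, one option (a suggestion, not part of the patch above) would be to wrap the new line in `#ifdef AV_FRAME_FLAG_KEY` and fall back to `frame_->key_frame` otherwise.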

lfos avatar Sep 29 '25 14:09 lfos

Hi, I'm a student contributor and I'm interested in working on this issue. I've reviewed the current implementation of FiveCrop and TenCrop, and understand that we need to add tensor support while maintaining backward compatibility.

My plan is to:

1. Add tensor support in functional.py for five_crop and ten_crop
2. Update the FiveCrop and TenCrop transform classes to handle both PIL and tensor inputs
3. Add comprehensive tests for both backends

I notice that some similar transforms use _get_image_size and _get_image_num_channels for backend detection. Would this be the right approach here too?

Looking forward to your guidance before I start implementation.

Laughter-cx avatar Nov 10 '25 02:11 Laughter-cx