Fails to build with ffmpeg 8
🐛 Describe the bug
Attempting to rebuild our torchvision packages at Arch Linux against ffmpeg 8 fails with:
/build/torchvision/src/python-vision-0.23.0/torchvision/csrc/io/decoder/video_stream.cpp: In member function ‘virtual void ffmpeg::VideoStream::setHeader(ffmpeg::DecoderHeader*, bool)’:
/build/torchvision/src/python-vision-0.23.0/torchvision/csrc/io/decoder/video_stream.cpp:125:32: error: ‘AVFrame’ {aka ‘struct AVFrame’} has no member named ‘key_frame’
125 | header->keyFrame = frame_->key_frame;
| ^~~~~~~~~
Versions
0.23.0, but this does not seem to be fixed on master either.
From what I can tell, this doesn't seem to be addressed by the pending PR https://github.com/pytorch/vision/pull/9231.
Hi @lfos, we have deprecated the torchvision C++ decoder, and from 0.24, which we'll release in a few weeks, it will stop being built by default.
@NicolasHug - Thanks for the update! As it's still a few weeks, is there any change we can cherry-pick, so 0.23.0 can be built against ffmpeg 8? For context, torchvision is one of the last remaining packages blocking our ffmpeg 8 rebuild at Arch Linux [1].
[1] https://archlinux.org/todo/ffmpeg-80-and-friends/
FWIW, the following patch seems to fix the build:
--- a/torchvision/csrc/io/decoder/video_stream.cpp 2025-09-29 09:28:57.678370404 -0400
+++ b/torchvision/csrc/io/decoder/video_stream.cpp 2025-09-29 09:29:51.544541164 -0400
@@ -122,7 +122,7 @@ int VideoStream::copyFrameBytes(ByteStor
void VideoStream::setHeader(DecoderHeader* header, bool flush) {
Stream::setHeader(header, flush);
if (!flush) { // no frames for video flush
- header->keyFrame = frame_->key_frame;
+ header->keyFrame = !!(frame_->flags & AV_FRAME_FLAG_KEY);
header->fps = av_q2d(av_guess_frame_rate(
inputCtx_, inputCtx_->streams[format_.stream], nullptr));
}
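For anyone wondering what the replacement expression in the patch does, here is a minimal standalone sketch. It assumes nothing beyond what the patch shows: `FakeFrame` and `kFakeFrameFlagKey` are hypothetical stand-ins so the snippet compiles without the FFmpeg headers, and the bit value is picked for illustration only. Real code would use `AVFrame` and `AV_FRAME_FLAG_KEY` from `<libavutil/frame.h>`.

```cpp
// Hypothetical stand-in for FFmpeg's AVFrame, reduced to the single
// field the patch reads. Not the real struct.
struct FakeFrame {
  int flags;
};

// Hypothetical stand-in for AV_FRAME_FLAG_KEY. The bit position is
// chosen for illustration; real code takes the macro from the header.
constexpr int kFakeFrameFlagKey = 1 << 1;

// Masks out the keyframe bit and collapses the result to 0 or 1 with
// "!!", so callers still receive a plain 0/1 value as they did from
// the removed key_frame field.
int isKeyFrame(const FakeFrame& frame) {
  return !!(frame.flags & kFakeFrameFlagKey);
}
```

Without the `!!`, the masked value would be the raw bit (2 in this sketch) rather than 1, which only matters if `header->keyFrame` is compared against 1 somewhere downstream. I have not checked whether it is.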
Hi, I'm a student contributor and I'm interested in working on this issue. I've reviewed the current implementation of FiveCrop and TenCrop, and understand that we need to add tensor support while maintaining backward compatibility.
My plan is to:
Add tensor support in functional.py for five_crop and ten_crop
Update the FiveCrop and TenCrop transform classes to handle both PIL and tensor inputs
Add comprehensive tests for both backends
I notice that some similar transforms use _get_image_size and _get_image_num_channels for backend detection. Would this be the right approach here too?
Looking forward to your guidance before I start implementation.