rpi-ffmpeg

v4l2_m2m_dec: zero passed as width/height

Open valpackett opened this issue 1 month ago • 9 comments

Using the test/7.1.1/main branch (rev 857f6c0ab47578dbd4153b4ed41eefbd488fd7fe), playback does not work as the iris driver rejects zero size:

[pid    20] openat(AT_FDCWD, "/dev/video0", O_RDWR|O_NONBLOCK) = 21
[pid    20] ioctl(21, VIDIOC_QUERYCAP, {driver="iris_driver", card="Iris Decoder", bus_info="platform:aa00000.video-codec", version=KERNEL_VERSION(6, 17, 0), capabilities=V4L2_CAP_VIDEO_M2M_MPLANE|V4L2_CAP_EXT_PIX_FORMAT|V4L2_CAP_STREAMING|V4L2_CAP_DEVICE_CAPS, device_caps=V4L2_CAP_VIDEO_M2M_MPLANE|V4L2_CAP_EXT_PIX_FORMAT|V4L2_CAP_STREAMING}) = 0
[pid    20] write(2, "\33[48;5;0m\33[38;5;39m[h264_v4l2m2m"..., 55[h264_v4l2m2m @ 0xffff38067960] ) = 55
[pid    20] write(2, "driver 'iris_driver' on card 'Ir"..., 59driver 'iris_driver' on card 'Iris Decoder' in mplane mode
) = 59
[pid    20] write(2, "\33[48;5;0m\33[38;5;39m[h264_v4l2m2m"..., 55[h264_v4l2m2m @ 0xffff38067960] ) = 55
[pid    20] write(2, "requesting formats: output=H264/"..., 55requesting formats: output=H264/none capture=NV12/none
) = 55
[pid    20] ioctl(21, VIDIOC_S_FMT, {type=V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE, fmt.pix_mp={width=0, height=0, pixelformat=v4l2_fourcc('H', '2', '6', '4') /* V4L2_PIX_FMT_H264 */, field=V4L2_FIELD_NONE, colorspace=V4L2_COLORSPACE_DEFAULT, plane_fmt=[{sizeimage=128, bytesperline=0}], num_planes=1}} => {fmt.pix_mp={width=0, height=0, pixelformat=v4l2_fourcc('H', '2', '6', '4') /* V4L2_PIX_FMT_H264 */, field=V4L2_FIELD_NONE, colorspace=V4L2_COLORSPACE_DEFAULT, plane_fmt=[{sizeimage=7077888, bytesperline=0}], num_planes=1}}) = 0
[pid    20] ioctl(21, VIDIOC_S_FMT, {type=V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE, fmt.pix_mp={width=0, height=0, pixelformat=v4l2_fourcc('N', 'V', '1', '2') /* V4L2_PIX_FMT_NV12 */, field=V4L2_FIELD_NONE, colorspace=V4L2_COLORSPACE_DEFAULT, plane_fmt=[{sizeimage=0, bytesperline=0}], num_planes=1}} => {fmt.pix_mp={width=0, height=0, pixelformat=v4l2_fourcc('N', 'V', '1', '2') /* V4L2_PIX_FMT_NV12 */, field=V4L2_FIELD_NONE, colorspace=V4L2_COLORSPACE_DEFAULT, plane_fmt=[{sizeimage=0, bytesperline=0}], num_planes=1}}) = 0
[pid    20] ioctl(21, VIDIOC_G_FMT, {type=V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE, fmt.pix_mp={width=0, height=0, pixelformat=v4l2_fourcc('H', '2', '6', '4') /* V4L2_PIX_FMT_H264 */, field=V4L2_FIELD_NONE, colorspace=V4L2_COLORSPACE_DEFAULT, plane_fmt=[{sizeimage=7077888, bytesperline=0}], num_planes=1}}) = 0
[pid    20] ioctl(21, VIDIOC_QUERYCTRL, {id=V4L2_CID_MIN_BUFFERS_FOR_OUTPUT}) = -1 EINVAL (Invalid argument)
[pid    20] ioctl(21, VIDIOC_REQBUFS, {type=V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE, memory=V4L2_MEMORY_MMAP, count=16}) = -1 EINVAL (Invalid argument)
[pid    20] write(2, "\33[48;5;0m\33[38;5;39m[h264_v4l2m2m"..., 55[h264_v4l2m2m @ 0xffff38067960] ) = 55
[pid    20] write(2, "\33[48;5;0m\33[38;5;196moutput VIDIO"..., 71output VIDIOC_REQBUFS failed: Invalid argument
) = 71
[pid    20] write(2, "\33[48;5;0m\33[38;5;39m[h264_v4l2m2m"..., 55[h264_v4l2m2m @ 0xffff38067960] ) = 55
[pid    20] write(2, "\33[48;5;0m\33[38;5;196mno v4l2 outp"..., 57no v4l2 output context's buffers
) = 57
[pid    20] close(21)                   = 0
[pid    20] write(2, "\33[48;5;0m\33[38;5;39m[h264_v4l2m2m"..., 55[h264_v4l2m2m @ 0xffff38067960] ) = 55
[pid    20] write(2, "\33[48;5;0m\33[38;5;196mcan't config"..., 48can't configure decoder
) = 48

..how did it set sizeimage=7077888 at the same time as width=0, height=0 when the code sets both based on the same w/h?? 0.o

valpackett avatar Oct 11 '25 09:10 valpackett

Figured out a patch that makes it work:

diff --git a/libavcodec/v4l2_context.c b/libavcodec/v4l2_context.c
index e20e3e485c..1fdedb1686 100644
--- a/libavcodec/v4l2_context.c
+++ b/libavcodec/v4l2_context.c
@@ -283,11 +283,17 @@
 
 static inline void v4l2_save_to_context(V4L2Context* ctx, struct v4l2_format_update *fmt)
 {
+    V4L2m2mContext * const s = ctx_to_m2mctx(ctx);
     ctx->format.type = ctx->type;
 
     if (fmt->update_avfmt)
         ctx->av_pix_fmt = fmt->av_fmt;
 
+    if (ctx->height == 0 || ctx->width == 0) {
+        ctx->width = s->avctx->width;
+        ctx->height = s->avctx->height;
+    }
+
     if (V4L2_TYPE_IS_MULTIPLANAR(ctx->type)) {
         /* update the sizes to handle the reconfiguration of the capture stream at runtime */
         ctx->format.fmt.pix_mp.height = ctx->height;

Probably not the right place to do it (?) but someone should actually set the dimensions on the v4l2 ctx early..

valpackett avatar Oct 12 '25 09:10 valpackett

Well your iris driver is in the wrong here, as V4L2 very clearly says that you should be able to pass anything into S_FMT and it should set the returned values to something valid. Having said that I'll have a look.

jc-kynesim avatar Oct 12 '25 09:10 jc-kynesim

S_FMT didn't complain.. it's REQBUFS that did, as we got to the actual-buffer-allocation part still with zeroes in the context and it doesn't really make sense to allocate nothing (?)

valpackett avatar Oct 12 '25 09:10 valpackett

S_FMT should not only not complain, it should also return valid values in the structure passed to it, which will then be used in the REQBUFS. Having said that, the buffers being allocated are for the coded bitstream (remember, in V4L2 speak OUTPUT=source, CAPTURE=destination); width/height doesn't really mean a lot here, but buffer size does. As I said, I'll have a look, but this code does work on at least Pi & Cedrus.

jc-kynesim avatar Oct 12 '25 09:10 jc-kynesim

Read https://www.kernel.org/doc/html/latest/userspace-api/media/v4l/dev-decoder.html. I do pretty much what it says. The bit you are looking at is 4.5.1.5, where it states: "width, height: coded resolution of the stream; required only if it cannot be parsed from the stream for the given coded format; otherwise the decoder will use this resolution as a placeholder resolution that will likely change as soon as it can parse the actual coded resolution from the stream." They can (and should) be parsed, so I set them to 0.

jc-kynesim avatar Oct 12 '25 09:10 jc-kynesim

But having said all that, I'm happy enough to add a kludge for Iris if it all works once you've added your patch. I have quirks for meson-vdec, I can add one for iris_driver.
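
A per-driver quirk of the kind mentioned above could be sketched roughly as follows. This is only an illustration: the `demo_*` names and the `DEMO_QUIRK_NEEDS_SIZE_HINT` flag are hypothetical and are not taken from the rpi-ffmpeg sources; the only real identifier is the driver string `iris_driver`, as reported by VIDIOC_QUERYCAP in the strace output.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical quirk flag: driver wants non-zero width/height up front. */
#define DEMO_QUIRK_NEEDS_SIZE_HINT 0x1u

struct demo_quirk {
    const char *driver;   /* driver name as reported by VIDIOC_QUERYCAP */
    unsigned int flags;   /* bitmask of DEMO_QUIRK_* values */
};

/* Hypothetical table; a real one would list every driver needing a workaround. */
static const struct demo_quirk demo_quirks[] = {
    { "iris_driver", DEMO_QUIRK_NEEDS_SIZE_HINT },
};

/* Return the quirk flags for a driver name, or 0 if it has none. */
static unsigned int demo_quirks_for(const char *driver)
{
    size_t i;

    assert(driver != NULL);
    for (i = 0; i < sizeof(demo_quirks) / sizeof(demo_quirks[0]); i++) {
        if (strcmp(demo_quirks[i].driver, driver) == 0)
            return demo_quirks[i].flags;
    }
    return 0;
}

/*
 * Choose the dimension to pass in the initial S_FMT: the size hint from the
 * container for quirky drivers, 0 (let the driver parse it) for everyone else.
 */
static unsigned int demo_initial_dim(const char *driver, unsigned int hint)
{
    if (demo_quirks_for(driver) & DEMO_QUIRK_NEEDS_SIZE_HINT)
        return hint;
    return 0;
}
```

With this table, `demo_initial_dim("iris_driver", 320)` yields the hint, while any unlisted driver keeps the spec-style 0.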

jc-kynesim avatar Oct 12 '25 09:10 jc-kynesim

mm. There is some placeholder resolution being set there, with my extra logging statements in the driver and without the kludge here I see:

[117275.755166] qcom-iris aa00000.video-codec: output_mplane yes fmt
[117275.755173] qcom-iris aa00000.video-codec: DST sz 384x256 - SRC sz 320x240
[117275.755175] qcom-iris aa00000.video-codec: capture_mplane nv12
[117275.755194] qcom-iris aa00000.video-codec: output_mplane yes fmt
[117275.755196] qcom-iris aa00000.video-codec: DST sz 0x0 - SRC sz 0x0
[117275.755197] qcom-iris aa00000.video-codec: capture_mplane nv12
[117275.755203] qcom-iris aa00000.video-codec: width: 0 min: 96 max: 8192
                height: 0 min: 96 max: 8192

the logic in the driver does seem strange, and venus (older generation qcom) did also work with rpi-ffmpeg. I'll try to fix it now that I understand what's actually expected there..
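
As a rough model of the behaviour expected from S_FMT (adjust out-of-range values instead of carrying zeroes forward), a driver-side handler could clamp the requested size. This is a sketch under stated assumptions, not the fix that was proposed on the list: the `demo_*` names are hypothetical, and the 96 and 8192 limits are simply the min/max printed in the driver log above.

```c
#include <assert.h>

/* Hypothetical limits, mirroring "min: 96 max: 8192" from the log above. */
#define DEMO_MIN_DIM 96u
#define DEMO_MAX_DIM 8192u

struct demo_fmt {
    unsigned int width;
    unsigned int height;
};

/* Clamp a single dimension into the inclusive range [lo, hi]. */
static unsigned int demo_clamp(unsigned int v, unsigned int lo, unsigned int hi)
{
    assert(lo <= hi);
    if (v < lo)
        return lo;
    if (v > hi)
        return hi;
    return v;
}

/*
 * Model of an S_FMT-style handler: an out-of-range size is not an error;
 * the request is adjusted in place and success (0) is returned.
 */
static int demo_s_fmt(struct demo_fmt *f)
{
    f->width = demo_clamp(f->width, DEMO_MIN_DIM, DEMO_MAX_DIM);
    f->height = demo_clamp(f->height, DEMO_MIN_DIM, DEMO_MAX_DIM);
    return 0;
}
```

Under this model a 0x0 request comes back as 96x96, so a later REQBUFS never sees a zero size; an in-range request such as 1920x1080 is returned unchanged.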

UPD: proposed https://lore.kernel.org/all/[email protected]/

valpackett avatar Oct 12 '25 17:10 valpackett

I'm not claiming that the ffmpeg startup logic is perfect. I rewrote quite a bit of the upstream V4L2 codec but left it within the broader ffmpeg V4L2 framework, and the fit isn't always great, especially around startup, but I do believe that it is conformant to the spec.

jc-kynesim avatar Oct 13 '25 09:10 jc-kynesim

In the init phase, the codec does not alloc any CAPTURE (destination) buffers and so doesn't need to know the size (obviously it needs OUTPUT (source) buffers). It is only after the driver signals the frame size via the resolution changed event that the codec allocs some destination buffers. It is a little convoluted but makes sense if you think about it: given that this is a stateful driver, it, not the codec, is responsible for parsing the SPS/PPS, so the codec doesn't know the size of the decoded frame until after decode has already started. It gets to know this via the resolution changed event, which causes the codec to actually alloc some capture buffers now that it knows the correct size.

In many (but not all) cases in ffmpeg the size has been parsed elsewhere by init time, so it is possible to guess correctly, but given that the codec needs to cope with the case where the size isn't known beforehand, it might as well always use that route.
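
The start-up order described above can be modelled with a small state machine. This is a simplified sketch, not the rpi-ffmpeg code: all `demo_*` names are hypothetical, and `DEMO_EV_SOURCE_CHANGE` stands in for the resolution changed event (in the V4L2 API, `V4L2_EVENT_SOURCE_CHANGE` with `V4L2_EVENT_SRC_CH_RESOLUTION`).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical event kinds; SOURCE_CHANGE models the resolution changed event. */
enum demo_event { DEMO_EV_NONE, DEMO_EV_SOURCE_CHANGE };

struct demo_decoder {
    bool output_ready;    /* OUTPUT (coded bitstream) buffers allocated */
    bool capture_ready;   /* CAPTURE (decoded frame) buffers allocated */
    unsigned int width;   /* decoded frame size, 0 until the driver reports it */
    unsigned int height;
};

/* Init phase: only the OUTPUT side is set up; the frame size is still unknown. */
static void demo_init(struct demo_decoder *d)
{
    d->output_ready = true;
    d->capture_ready = false;
    d->width = 0;
    d->height = 0;
}

/*
 * Event handling: the driver has parsed the stream headers and reports the
 * real size, so the CAPTURE buffers can now be allocated at that size.
 */
static void demo_handle_event(struct demo_decoder *d, enum demo_event ev,
                              unsigned int w, unsigned int h)
{
    if (ev != DEMO_EV_SOURCE_CHANGE)
        return;
    assert(w > 0 && h > 0);
    d->width = w;
    d->height = h;
    d->capture_ready = true;
}
```

In this model the decoder can queue bitstream data as soon as `demo_init` returns, but `capture_ready` only becomes true once the driver has announced a size, which is why a zero width/height at init time is harmless.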

jc-kynesim avatar Oct 13 '25 09:10 jc-kynesim