pivid
pivid copied to clipboard
"Slow update" and "No space left on device" eventually with multiple displays
Forum thread on "slow update", including multiple-display connection: https://forums.raspberrypi.com/viewtopic.php?t=332868
Forum thread on (what's probably) a different cause of "no space": https://forums.raspberrypi.com/viewtopic.php?t=333320
Observed: When doing nontrivial (maybe even trivial?) output to multiple displays, over time pivid starts complaining about "Slow update" events (KMS commit taking more than a frame). Over time this worsens until it starts reporting "No space left on device" (kernel rejects KMS commit), and visual artifacts become apparent and numerous.
The worsening appears to persist with the Pi in question (restarting pivid doesn't reset things), though it is reset (temporarily) by rebooting.
The easiest repro case is to run a multiple-output test like https://github.com/egnor/pivid/blob/main/test_media/jellyfish_2disp.json (from @aubilenon) and let it run for a few minutes; "Slow update" messages usually crop up after a loop or two.
Here's example from the simple jellyfish_2disp.json
script, running at 60hz updates to exacerbate the problem...
At the start of this example, both displays have just had had a vsync page flip, finishing their previous updates:
19:45:49.582 4774u D [display] HDMI-2 u191 flip! 19:45:49.578 (m58237.839)
19:45:49.582 74u D [display] HDMI-1 u192 flip! 19:45:49.582 (m58237.842)
Note that the vsync events are ~3ms apart-- the two outputs generally are not synchronized even with the same video mode. Since HDMI-1 has become available, pivid immediately requests an update for the next vsync:
19:45:49.582 38u T [display] HDMI-1 status: Update done
19:45:49.582 36u D [display] UPDATE HDMI-1 1l mbw=8% cbw=18% lbm=25%
19:45:49.582 56u D [display] pl84: fb343 1920x1080 NV12:BCOM:66004 f34=3.0M:1.9K/136K,1.9K/2.9M "jellyfish-3-mbps-hd-hevc.mkv @2.369" =>0,0
19:45:49.582 44u T [display] #84 alpha = 65535
19:45:49.582 27u T [display] #84 CRTC_ID = 94
19:45:49.582 25u T [display] #84 CRTC_X = 0
19:45:49.582 25u T [display] #84 CRTC_Y = 0
19:45:49.582 52u T [display] #84 CRTC_W = 1920
19:45:49.582 77u T [display] #84 CRTC_H = 1080
19:45:49.582 74u T [display] #84 FB_ID = 343
19:45:49.582 74u T [display] #84 SRC_X = 0
19:45:49.582 72u T [display] #84 SRC_Y = 0
19:45:49.582 76u T [display] #84 SRC_W = 125829120
19:45:49.582 58u T [display] #84 SRC_H = 70778880
19:45:49.582 75u T [display] HDMI-1 u193 committing...
19:45:49.583 313u D [display] HDMI-1 u193 committed!
Note that the actual atomic commit request takes ~1ms. The hardware vsync was :49.582, it's now :49.583, we should have 15 more milliseconds before the next vsync. Similarly, on to HDMI-2:
19:45:49.583 87u T [display] HDMI-2 status: Update done
19:45:49.583 48u D [display] UPDATE HDMI-2 1l mbw=8% cbw=18% lbm=25%
19:45:49.583 34u T [display] HDMI-1 status: Update still pending
19:45:49.583 107u D [display] pl95: fb343 1920x1080 NV12:BCOM:66004 f34=3.0M:1.9K/136K,1.9K/2.9M "jellyfish-3-mbps-hd-hevc.mkv @2.369" =>0,0
19:45:49.583 112u T [display] #95 alpha = 65535
19:45:49.583 53u T [display] #95 CRTC_ID = 105
19:45:49.583 28u T [display] #95 CRTC_X = 0
19:45:49.583 26u T [display] #95 CRTC_Y = 0
19:45:49.583 29u T [display] #95 CRTC_W = 1920
19:45:49.583 63u T [display] #95 CRTC_H = 1080
19:45:49.583 29u T [display] #95 FB_ID = 343
19:45:49.583 25u T [display] #95 SRC_X = 0
19:45:49.583 24u T [display] #95 SRC_Y = 0
19:45:49.583 27u T [display] #95 SRC_W = 125829120
19:45:49.583 26u T [display] #95 SRC_H = 70778880
19:45:49.584 66u T [display] HDMI-2 u194 committing...
19:45:49.584 137u D [display] HDMI-2 u194 committed!
Again, the update request finished quickly. Now it's :49.584. HDMI-2's last hardware flip was :49.578, so we're 6ms in, but that still leaves 10ms until the next vsync. Now pivid is waiting for both updates to complete.
19:45:49.584 70u T [display] HDMI-2 status: Update still pending
19:45:49.588 4247u T [display] HDMI-1 status: Update still pending
19:45:49.589 854u T [display] HDMI-2 status: Update still pending
19:45:49.593 4231u T [display] HDMI-1 status: Update still pending
19:45:49.594 861u T [display] HDMI-2 status: Update still pending
19:45:49.598 4240u D [display] HDMI-1 u193 flip! 19:45:49.598 (m58237.859)
19:45:49.598 42u T [display] HDMI-1 status: Update done
OK, HDMI-1 flipped as expected-- at :49.582 + 16.6ms = :49.498. Note that HDMI-2 should actually have flipped first, since it's running about 3ms earlier. Anyway, now pivid has another update for HDMI-1 (but we're still waiting for the original update
19:45:49.598 38u D [display] UPDATE HDMI-1 1l mbw=8% cbw=18% lbm=25%
19:45:49.598 56u D [display] pl84: fb351 1920x1080 NV12:BCOM:66004 f19=3.0M:1.9K/136K,1.9K/2.9M "jellyfish-3-mbps-hd-hevc.mkv @2.403" =>0,0
19:45:49.598 42u T [display] #84 alpha = 65535
19:45:49.598 30u T [display] #84 CRTC_ID = 94
19:45:49.598 27u T [display] #84 CRTC_X = 0
19:45:49.598 25u T [display] #84 CRTC_Y = 0
19:45:49.599 80u T [display] #84 CRTC_W = 1920
19:45:49.599 82u T [display] #84 CRTC_H = 1080
19:45:49.599 81u T [display] #84 FB_ID = 351
19:45:49.599 121u T [display] #84 SRC_X = 0
19:45:49.599 261u T [display] #84 SRC_Y = 0
19:45:49.599 85u T [display] #84 SRC_W = 125829120
19:45:49.599 81u T [display] #84 SRC_H = 70778880
19:45:49.599 79u T [display] HDMI-1 u195 committing...
19:45:49.599 0u T [display] Loading start 1920x1080 NV12:BCOM:66004 f28=3.0M:1.9K/136K,1.9K/2.9M "jellyfish-3-mbps-hd-hevc.mkv @2.703"
19:45:49.600 1040u D [display] HDMI-1 u195 committed!
The HDMI-1 update took 2ms this time instead of 1ms, but it's still fast. So now we've completed HDMI-1's second update (since the start of this excerpt) but are still waiting for HDMI-2's first update, even though the vsync should have been several milliseconds ago.
19:45:49.600 106u T [display] HDMI-1 status: Update still pending
19:45:49.600 85u T [display] HDMI-1 status: Update still pending
19:45:49.600 135u T [display] HDMI-2 status: Update still pending
19:45:49.600 173u T [display] Loading framebuffer...
19:45:49.600 117u D [display] LOADED fb353 1920x1080 NV12:BCOM:66004 f28=3.0M:1.9K/136K,1.9K/2.9M "jellyfish-3-mbps-hd-hevc.mkv @2.703"
19:45:49.605 4657u T [display] HDMI-1 status: Update still pending
19:45:49.605 123u T [display] HDMI-2 status: Update still pending
19:45:49.605 83u W [player] s41 Slow update: 0.005s overdue
Pivid has started noticing that HDMI-2 (aka s41... different code layer, different identifier) has failed to flip within one frame time.
19:45:49.610 4869u T [display] HDMI-1 status: Update still pending
19:45:49.610 124u T [display] HDMI-2 status: Update still pending
19:45:49.610 82u W [player] s41 Slow update: 0.010s overdue
19:45:49.615 4899u D [display] HDMI-2 u194 flip! 19:45:49.612 (m58237.872)
And there's the HDMI-2 flip. The previous observed vsync was :49.578, this one is :49.612, that's 34ms later-- despite having (relatively speaking) all the time in the world, the Linux kernel layer failed to flip the display at the first vsync after the request, and had to wait until the second vsync. Of course since we're trying to push out 60fps, that means we've fallen a frame behind, and if that happens again (which it does) pivid will drop a frame to keep up.
So the question is why, when two displays are running, the Linux KMS layer takes more than a frame interval to flip a screen. Note that the framebuffer is already fully loaded and everything, it should just be a matter of changing some pointers.
Here's a similar example with kernel debugging on. Focusing on HDMI-2, here's pivid's logs (btw, HDMI-1 has slow updates just as often), showing a very similar frame-miss:
20:03:04.660 3885u D [display] HDMI-2 u127 flip! 20:03:04.658 (m59272.919)
20:03:04.661 28u D [display] UPDATE HDMI-2 1l mbw=8% cbw=18% lbm=25%
20:03:04.664 115u D [display] pl95: fb344 1920x1080 NV12:BCOM:66004 f23=3.0M:1.9K/136K,1.9K/2.9M "jellyfish-3-mbps-hd-hevc.mkv @1.402" =>0,0
20:03:04.664 33u T [display] #95 alpha = 65535
20:03:04.664 15u T [display] #95 CRTC_ID = 105
20:03:04.664 13u T [display] #95 CRTC_X = 0
20:03:04.664 12u T [display] #95 CRTC_Y = 0
20:03:04.664 12u T [display] #95 CRTC_W = 1920
20:03:04.664 12u T [display] #95 CRTC_H = 1080
20:03:04.664 12u T [display] #95 FB_ID = 344
20:03:04.664 14u T [display] #95 SRC_X = 0
20:03:04.664 11u T [display] #95 SRC_Y = 0
20:03:04.664 12u T [display] #95 SRC_W = 125829120
20:03:04.664 11u T [display] #95 SRC_H = 70778880
20:03:04.664 11u T [display] HDMI-2 u130 committing...
20:03:04.666 2134u D [display] HDMI-2 u130 committed!
20:03:04.666 33u T [display] HDMI-2 status: Update still pending
20:03:04.671 49u T [display] HDMI-2 status: Update still pending
20:03:04.676 47u T [display] HDMI-2 status: Update still pending
20:03:04.684 29u T [display] HDMI-2 status: Update still pending
20:03:04.684 57u T [display] HDMI-2 status: Update still pending
20:03:04.684 24u W [player] s41 Slow update: 0.001s overdue
20:03:04.689 103u T [display] HDMI-2 status: Update still pending
20:03:04.689 59u W [player] s41 Slow update: 0.007s overdue
20:03:04.694 3399u D [display] HDMI-2 u130 flip! 20:03:04.691 (m59272.952)
Enabling maximum kernel debugging (via pivid_play --debug_kernel
), here's timestamp-corrected kernel logs (via pivid_inspect_kmsg
) for the same time period:
20:03:04.665 🕸️ [drm:drm_ioctl [drm]] comm="pivid:play:s41" pid=13080, dev=0xe200, auth=1, DRM_IOCTL_MODE_ATOMIC
20:03:04.665 🕸️ [drm:drm_atomic_state_init [drm]] Allocated atomic state 60407942
20:03:04.666 🕸️ [drm:drm_atomic_get_plane_state [drm]] Added [PLANE:95:plane-4] 31bb5a70 state to 60407942
20:03:04.666 🕸️ [drm:drm_atomic_get_crtc_state [drm]] Added [CRTC:105:crtc-4] cb391a47 state to 60407942
20:03:04.666 🕸️ vc4-drm gpu: [drm:drm_atomic_set_fb_for_plane [drm]] Set [FB:344] for [PLANE:95:plane-4] state 31bb5a70
20:03:04.667 🕸️ [drm:drm_atomic_check_only [drm]] checking 60407942
20:03:04.667 🕸️ [drm:drm_atomic_get_private_obj_state [drm]] Added new private object dc46d9ca state 4f787d17 to 60407942
20:03:04.667 🕸️ [drm:drm_atomic_get_private_obj_state [drm]] Added new private object ac0cbdf2 state 992e8714 to 60407942
20:03:04.667 🕸️ [drm:drm_atomic_nonblocking_commit [drm]] committing 60407942 nonblocking
So that's setting up the atomic commit, as expected...
20:03:04.675 🕸️ vc4-drm gpu: [drm:vblank_disable_fn [drm]] disabling vblank on crtc 4
20:03:04.678 🕸️ vc4-drm gpu: [drm:vc4_atomic_commit_tail [vc4]] Running the core clock at 282150000 Hz
20:03:04.678 🕸️ vc4-drm gpu: [drm:drm_calc_timestamping_constants [drm]] crtc 105: hwmode: htotal 2200, vtotal 1125, vdisply 1080
20:03:04.679 🕸️ vc4-drm gpu: [drm:vc4_atomic_commit_tail [vc4]] Core clock actual rate: 500000000 Hz
20:03:04.679 🕸️ vc4-drm gpu: [drm:drm_calc_timestamping_constants [drm]] crtc 105: clock 148500 kHz framedur 16666666 linedr 14814
20:03:04.679 🕸️ vc4-drm gpu: [drm:drm_vblank_enable [drm]] enabling vblank on crtc 4, ret: 0
Note for the above, crtc 4
(by index) is CRTC #105
(by ID) is HDMI-2
(by name). By this point (:04.679) we have already passed the vsync following the commit (would have been at :04.675). What is this extra code (vc4_atomic_commit_tail
), and why do several milliseconds elapse between the actual commit operation and the "commit tail" which seems to adjust some clocks and enable the actual vsync interrupt?
Performing the commit in blocking mode (which waits for the vsync before returning from the ioctl) may make things easier to trace...
According to https://github.com/raspberrypi/linux/issues/5094#issuecomment-1193814948, updating both displays in the same commit may sidestep this problem. (I'd need to test it.) That would require a bit of a complicated rework, though!