HDMI screen turns off when sustained high CPU usage on all cores
Describe the bug
When CPU cores are under heavy usage (4 cores at ~100%, as with building software using make -j4) for a certain time, HDMI is turned off.
The process doing intensive CPU usage continues and completes without problems.
To reproduce
-Update to latest firmware with rpi-update
-Try to build a big C/C++ project via make -j4
Expected behaviour
Simply build the thing.
Actual behaviour
HDMI is turned off. The high CPU process is finished without problems.
System
Raspberry Pi 4b+ v1.2, 2GB RAM,
OS version:
pi@raspberrypi:~/src/Commander-Genius/b4 $ cat /etc/rpi-issue
Raspberry Pi reference 2020-05-27
Generated using pi-gen, https://github.com/RPi-Distro/pi-gen, 30e2dd32ba47cc3bec15ab1413c16a17e5797775, stage4
Firmware version:
pi@raspberrypi:~ $ vcgencmd version
Jul 14 2021 14:20:55
Copyright (c) 2012 Broadcom
version 1ecd7d49359f3b48737f1a9e33c2f1513f90743d (clean) (release) (start)
Kernel version:
pi@raspberrypi:~ $ uname -a
Linux raspberrypi 5.10.49-v8+ #1436 SMP PREEMPT Wed Jul 14 14:20:10 BST 2021 aarch64 GNU/Linux
Additional context
It started to happen after updating from 5.10.17 or so. Didn't happen before.
Also, I use the vc4-hdmi audio device, so in config.txt I have commented out the BCM audio module:
dtoverlay=vc4-kms-v3d
#dtparam=audio=on
Exact rpi-update version when this started would be useful. Are you using fkms or kms driver? Are you saying hdmi output returns when make completes?
@popcornmix Sadly I can't say when this started because I had been months without updating. Any revision you suspect and I can "force"? (Along with how to force it, which I have never done)
I am using KMS.
HDMI does not come back until I reboot.
It's not an issue I've seen myself or seen reported, so can't really guess. Does dmesg have any errors after this?
What resolution/refresh rate is hdmi monitor? Does a lower one still have the issue? Does force_turbo=1 help?
@popcornmix I cleared dmesg, then ran compilation on all four cores, then just after the monitor goes off, I get these on dmesg:
pi@raspberrypi:~ $ dmesg
[ 93.150825] [drm:drm_atomic_helper_wait_for_flip_done [drm_kms_helper]] *ERROR* [CRTC:76:crtc-3] flip_done timed out
[ 103.390806] [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [CRTC:76:crtc-3] flip_done timed out
[ 113.630833] [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [CONNECTOR:32:HDMI-A-1] flip_done timed out
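For anyone collecting similar logs, the timestamps and DRM objects in the lines above can be pulled out with a short script. This is only a sketch, not part of the original report: the function name `parse_flip_done` and the regular expression are illustrative assumptions, written against the three dmesg lines quoted here and not against any formal log format.

```python
import re
from typing import Iterable, List, Tuple

# Assumed pattern, derived only from the dmesg lines quoted in this thread:
# "[ <seconds>] ... *ERROR* [<KIND>:<id>:<name>] flip_done timed out"
FLIP_RE = re.compile(
    r"^\[\s*(\d+\.\d+)\]\s.*\*ERROR\*\s\[(\w+):(\d+):([^\]]+)\]\sflip_done timed out$"
)

def parse_flip_done(lines: Iterable[str]) -> List[Tuple[float, str, int, str]]:
    """Return (timestamp, object kind, object id, object name) per matching line."""
    events = []
    for line in lines:
        match = FLIP_RE.match(line.strip())
        if match:  # lines that are not flip_done errors are skipped
            events.append(
                (float(match.group(1)), match.group(2), int(match.group(3)), match.group(4))
            )
    return events

if __name__ == "__main__":
    log = [
        "[ 93.150825] [drm:drm_atomic_helper_wait_for_flip_done [drm_kms_helper]] *ERROR* [CRTC:76:crtc-3] flip_done timed out",
        "[ 103.390806] [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [CRTC:76:crtc-3] flip_done timed out",
        "[ 113.630833] [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [CONNECTOR:32:HDMI-A-1] flip_done timed out",
    ]
    for event in parse_flip_done(log):
        print(event)
```

Run on the three lines above, this shows the errors arriving roughly ten seconds apart, first on the CRTC and then on the HDMI-A-1 connector.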
About resolution and refresh rate, I am using this mode in config.txt to force the mode of my choice:
hdmi_group=2
hdmi_mode=39
force_turbo=1 has no effect.
config_hdmi_boost=4 has no effect.
Setting a "secure" video mode like this:
hdmi_group=2
hdmi_mode=4
...has no effect.
> About resolution and refresh rate, I am using this mode in config.txt to force the mode of my choice:
config.txt settings only affect the initial simple framebuffer mode before kms takes over. You should be using standard linux methods for configuring the hdmi mode (e.g. arandr if using X, or a "video=" cmdline.txt setting).
What is the resolution/refresh rate you are actually using before you get the "flip_done timeout"? What does "vcgencmd get_throttled" report after the "flip_done timeout"?
@popcornmix
For these experiments, I have added this to cmdline.txt:
video=HDMI-A-1:640x480@60
And I am also setting this in config.txt as I said:
hdmi_group=2
hdmi_mode=4
So I am, in fact, using the basic 640x480 at 60Hz video mode. That doesn't change anything (except the console resolution, of course!)
As for the command you asked, this is what it says:
pi@raspberrypi:~/src/sm64ex-alo $ vcgencmd get_throttled
throttled=0xe0000
> So I am, in fact, using the basic 640x480 at 60Hz video mode. That doesn't change anything (except the console resolution, of course!)
I'm still not clear if you are saying the "flip done timeout" is occurring at 640x480@60Hz or a higher resolution. The hdmi mode can be set in many places:
-initially by firmware (affected by hdmi_group/hdmi_mode in config.txt)
-by kernel when creating the console for kms/fkms (affected by video=<> in cmdline.txt)
-by user code like X (e.g. setting the last mode configured with arandr)
-by other applications that can do modesetting (e.g. kodi)
What is the resolution you are running at when you get "flip done timeout", and what were you running? (e.g. X etc)
@popcornmix I usually run my Pi at 1360x768, but when I am running what you ask me to run, etc... then I move to basic 640x480@60Hz. There's nothing wrong with the video mode in use, and it has no impact on the issue, because I have tried a lot of different video modes, from basic 640x480@60Hz to 1080p, and the issue is the same.
So, simply put, I am using 640x480@60 when I report anything on this thread.
I don't have an X server. TTY console, of course, runs on legacy fbdev, just like in every GNU/Linux system as far as I know. When I do compilation on all four cores, there's NOTHING running except the fbdev TTY console.
So, to be clear: I have no Xorg server. I set the video mode in config.txt or in cmdline.txt. Since you told me to use the "video=..." directive in cmdline.txt, that's what I use. That's all.
Does the "flip done" message always correspond to vcgencmd get_throttled returning a non-zero value (i.e. throttling occurring)?
@popcornmix What happens is this.
On idle system, or when running, let's say, SDLPop, Scummvm, etc... I normally get this:
pi@raspberrypi:~/src/Raze/b4 $ vcgencmd get_throttled
throttled=0x0
But after a couple of seconds of building anything with -j3, -j4, etc.. screen goes off, and I get this:
pi@raspberrypi:~ $ vcgencmd get_throttled
throttled=0x0
pi@raspberrypi:~ $ vcgencmd get_throttled
throttled=0x80008
pi@raspberrypi:~ $ vcgencmd get_throttled
throttled=0xe0008
pi@raspberrypi:~ $ vcgencmd get_throttled
throttled=0xe0006
pi@raspberrypi:~ $ vcgencmd get_throttled
throttled=0xe0008
pi@raspberrypi:~ $ vcgencmd get_throttled
throttled=0xe0008
pi@raspberrypi:~ $ vcgencmd get_throttled
throttled=0xe0006
pi@raspberrypi:~ $ vcgencmd get_throttled
throttled=0xe0008
pi@raspberrypi:~ $ vcgencmd get_throttled
throttled=0xe0008
pi@raspberrypi:~ $ vcgencmd get_throttled
throttled=0xe0006
pi@raspberrypi:~ $ vcgencmd get_throttled
As you can see, initially it returns zero, and then non-zero values as monitor goes off.
It's always the same.
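To make the values above easier to read, here is a minimal sketch of a decoder for the get_throttled bitmask. It is not part of the original report. The bit meanings are taken from the Raspberry Pi documentation for `vcgencmd get_throttled`; the function name `decode_throttled` is a hypothetical helper.

```python
from typing import List

# Bit meanings as listed in the Raspberry Pi documentation for
# `vcgencmd get_throttled`. Bits 0-3 are current state, bits 16-19 are sticky.
THROTTLED_BITS = {
    0: "under-voltage detected",
    1: "ARM frequency capped",
    2: "currently throttled",
    3: "soft temperature limit active",
    16: "under-voltage has occurred",
    17: "ARM frequency capping has occurred",
    18: "throttling has occurred",
    19: "soft temperature limit has occurred",
}

def decode_throttled(line: str) -> List[str]:
    """Decode a line such as 'throttled=0xe0008' into flag descriptions."""
    value = int(line.strip().split("=", 1)[1], 16)
    return [name for bit, name in sorted(THROTTLED_BITS.items()) if value & (1 << bit)]

if __name__ == "__main__":
    for sample in ("throttled=0x0", "throttled=0x80008", "throttled=0xe0006", "throttled=0xe0008"):
        print(sample, "->", decode_throttled(sample) or ["no flags set"])
```

Read this way, 0x80008 means only the soft temperature limit is active, while 0xe0006 adds ARM frequency capping and throttling, both as current state and as "has occurred" flags. No under-voltage bit is set in any of the values posted here.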
@popcornmix I went back from kernel 5.10.50 to stable 5.10.17 (remember I am on a full aarch64 Raspberry Pi OS) via:
sudo apt-get install --reinstall raspberrypi-bootloader raspberrypi-kernel
...and after rebooting, I can do whatever I want on all the CPU cores: no more HDMI display turn-off.
Also, strange lock-ups I had reported on different opensource game engines have simply disappeared after going back to stable! https://github.com/lethal-guitar/RigelEngine/issues/662 https://gitlab.com/Dringgstein/Commander-Genius/-/issues/491
@popcornmix Another thing to note is that I always use the vc4-hdmi device (ARM-side ALSA driver). I have added that information to the first post.
For devs trying to reproduce this issue locally on Pi OS 64bits: simply build a large project in C++ (not plain C) using make -j4
In case you still can't see it happening, do it as root or remove your user's rlimits so you can really cause a 100% CPU usage.
The HDMI display will be turned off, that's for sure. It happens with every monitor I use. Official cable & power supply here, btw.
Now 5.10.52 is the "stable" kernel, so it happens again on my system after doing a simple sudo apt-get update && upgrade.
@vanfanel can you give this test firmware a try? I think the issue is when arm throttles (due to high temperature) it was incorrectly reducing core frequency below that required for the hdmi mode.
@popcornmix Tested, but I am sorry to say that it's still happening. Any other experimental firmwares you want me to try, I'll be glad to do so.
Can you report output of vcgencmd version when using the test firmware?
Can you confirm that display is fine when you have throttled=0x0 or throttled=0x80008, but the issue occurs when any additional bits are set?
@popcornmix:
pi@raspberrypi:~ $ vcgencmd version
Aug 13 2021 13:03:32
Copyright (c) 2012 Broadcom
version 5ffbdf498f77137ac0fbb2f63214eeb3346a3969 (tainted) (release) (start)
Also, when display is fine (on an idle system), I always get:
pi@raspberrypi:~ $ vcgencmd get_throttled
throttled=0x0
Then after 1 minute building a C++ project with ninja -j4 or make -j4 I start seeing:
pi@raspberrypi:~ $ vcgencmd get_throttled
throttled=0x80000
Then later I see:
pi@raspberrypi:~ $ vcgencmd get_throttled
throttled=0x80008
At this point, the display is turned off. And then, just when the display is turned off, I see this:
pi@raspberrypi:~ $ vcgencmd get_throttled
throttled=0xe0006
And then, while the display is still off, I see:
pi@raspberrypi:~ $ vcgencmd get_throttled
throttled=0xe0008
...from this point (remember: display is off and won't come back even if the compilation finishes OK), these two 0xe0006 and 0xe0008 alternate. Display is off until I reboot.
This is very surprising, as I can reproduce your description easily (you can even set temp_limit=65 to make it happen more quickly).
With default firmware if I'm in a 4kp60 mode and I hit the temperature limit (signalled by THROTTLED_HIGH_TEMP=2 and THROTTLED_LIMIT_TURBO=4) then core freq gets lowered to 200MHz which isn't enough to sustain 4kp60 and we lose (permanently) display output.
However with the test firmware we no longer limit the core frequency and this issue doesn't occur.
I've just reproduced this on a different Pi4 and it still fixes it.
Can you post your config.txt and cmdline.txt in case they are having an effect.
Also report vcgencmd measure_clock core before and after the hdmi output is lost.
1 - Before display goes off:
pi@raspberrypi:~ $ vcgencmd measure_clock core
frequency(1)=500000992
pi@raspberrypi:~ $ vcgencmd measure_clock core
frequency(1)=500000992
pi@raspberrypi:~ $ vcgencmd measure_clock core
frequency(1)=500000992
pi@raspberrypi:~ $ vcgencmd measure_clock core
frequency(1)=500000992
pi@raspberrypi:~ $ vcgencmd measure_clock core
frequency(1)=500000992
(Value is stable until display goes off)
2 - After display goes off:
pi@raspberrypi:~ $ vcgencmd measure_clock core
frequency(1)=333333984
pi@raspberrypi:~ $ vcgencmd measure_clock core
frequency(1)=199995120
pi@raspberrypi:~ $ vcgencmd measure_clock core
frequency(1)=333333984
pi@raspberrypi:~ $ vcgencmd measure_clock core
frequency(1)=333333984
pi@raspberrypi:~ $ vcgencmd measure_clock core
frequency(1)=333333984
pi@raspberrypi:~ $ vcgencmd measure_clock core
frequency(1)=333333984
pi@raspberrypi:~ $
pi@raspberrypi:~ $ vcgencmd measure_clock core
frequency(1)=333333984
pi@raspberrypi:~ $ vcgencmd measure_clock core
frequency(1)=333333984
pi@raspberrypi:~ $ vcgencmd measure_clock core
frequency(1)=499987808
pi@raspberrypi:~ $ vcgencmd measure_clock core
frequency(1)=500000992
pi@raspberrypi:~ $ vcgencmd measure_clock core
frequency(1)=500000992
pi@raspberrypi:~ $ vcgencmd measure_clock core
frequency(1)=500000992
pi@raspberrypi:~ $ vcgencmd measure_clock core
frequency(1)=500000992
pi@raspberrypi:~ $ vcgencmd measure_clock core
frequency(1)=500000992
pi@raspberrypi:~ $ vcgencmd measure_clock core
frequency(1)=333333984
pi@raspberrypi:~ $ vcgencmd measure_clock core
frequency(1)=200008304
pi@raspberrypi:~ $ vcgencmd measure_clock core
frequency(1)=333333984
pi@raspberrypi:~ $ vcgencmd measure_clock core
frequency(1)=333333984
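The readings above can be summarised with a small parser. This is only a sketch, not part of the original report: the helper names are hypothetical, and the 500 MHz nominal value is an assumption based on the "before" readings in this comment, not a documented constant.

```python
import re
from typing import Iterable, List

# Matches lines such as "frequency(1)=500000992" as printed in this thread.
CLOCK_RE = re.compile(r"^frequency\((\d+)\)=(\d+)$")

def parse_clock_hz(line: str) -> int:
    """Return the frequency in Hz from one `vcgencmd measure_clock` output line."""
    match = CLOCK_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a measure_clock line: {line!r}")
    return int(match.group(2))

def reduced_readings_mhz(
    lines: Iterable[str],
    nominal_mhz: float = 500.0,   # assumption: taken from the readings before the display went off
    tolerance_mhz: float = 1.0,   # assumption: ignore small measurement jitter
) -> List[float]:
    """Return readings (in MHz, one decimal) that fall below the nominal core clock."""
    readings = [
        parse_clock_hz(line) / 1e6
        for line in lines
        if line.strip().startswith("frequency(")  # skip shell prompt lines
    ]
    return [round(mhz, 1) for mhz in readings if mhz < nominal_mhz - tolerance_mhz]

if __name__ == "__main__":
    transcript = [
        "pi@raspberrypi:~ $ vcgencmd measure_clock core",
        "frequency(1)=500000992",
        "frequency(1)=333333984",
        "frequency(1)=199995120",
        "frequency(1)=499987808",
    ]
    print(reduced_readings_mhz(transcript))
```

On the readings posted here, the core clock moves between roughly 500, 333 and 200 MHz once the display has gone off.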
Now, this is my config.txt
config.txt
and this is my cmdline.txt
cmdline.txt
You will see I am using a slight overclock, but it goes with the corresponding overvoltage. Using official HDMI cable and power supply.
Okay, it's
hdmi_group=2
hdmi_mode=39 # 1360x768 60Hz
that stops it working. If you remove that I suspect the issue will be resolved.
Note that hdmi_mode/hdmi_group (and pretty much all hdmi_ settings) don't work with the kms driver (which is driven by settings on the arm side).
But I'll try to find out why they don't play well with kms and the temperature limit.
@popcornmix I have removed every hdmi_* setting and I am still seeing the issue.
It must be annoying, sorry.
It does seem there are two similar issues here. The first involves core_freq being set too low when throttling. That is easy to reproduce. Use a high clock rate hdmi mode (e.g. 4kp60) and throttle. That is fixed with test firmware (and now rpi-update).
The second seems to be the M2MC clock (aka hsm clock in kernel driver) being set to 0.
It is noticed when clocks change after throttling, but the problem seems to occur earlier, and it seems to occur with certain hdmi resolutions (possibly your edid gives the same hdmi mode without the hdmi_ settings).
@popcornmix You are right, it's only happening with the 1360x768 video mode. I have tried other resolutions via the "video=..." kernel parameter, and the issue doesn't show its ugly face.
This thread is worth a read for the 1366x768 mode. https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=284866
@JamesH65 it's 1360x768, and the display is fine up until the first throttle, so I don't think that thread is relevant.
@vanfanel can you try this test firmware
@popcornmix Tried! The issue at 1360x768 (forced in cmdline.txt via the video=... parameter) is not appearing anymore.
So, if you have no objections, this could be closed.
Thanks a lot for looking into this, for the time you invested into fixing the issue (and discovering it, to begin with... happening with a video mode only was unexpected).
Fix should be in latest rpi-update firmware.
@popcornmix Same HDMI poweroff on throttling is happening again in kernel 5.10.76-v8 with video="HDMI-A-1:1280x720@60"