Uncontrolled underclocking on AMD CPUs / Chromebook.
Good evening,
There's an issue with the AMD A4-9120C CPU in the HP Chromebook 11A G6 EE, which has a default clock speed of 1.2 GHz - 1.8 GHz, with boost up to 2.4 GHz.
Unfortunately, with the latest Full ROM and older ones, the CPU tends to underclock itself at the beginning of Chromebook startup. The underclocked frequency is 798 MHz and it's mostly static.
Driver used by the Linux kernel for CPU Frequency: auto-cpufreq.
Kernel: 6.5.5
I've tried adding intel_pstate=disable amd_freq_sensitivity=off to the kernel boot parameters, but it didn't help.
The underclock is noticeable before the kernel and bootloader start.
Only a hard reset on battery power makes the CPU work normally, although that might be random.
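For anyone wanting to confirm that boot parameters like the ones above actually reached the kernel, here is a minimal sketch that parses /proc/cmdline. The helper names (parse_cmdline, missing_params) are made up for illustration, and the simple whitespace split does not handle quoted values containing spaces.

```python
from __future__ import annotations

from pathlib import Path


def parse_cmdline(cmdline: str) -> dict[str, str | None]:
    """Split a kernel command line into {name: value}; bare flags map to None."""
    params: dict[str, str | None] = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def missing_params(cmdline: str, wanted: dict[str, str]) -> list[str]:
    """Return the wanted name=value pairs that are absent from the command line."""
    active = parse_cmdline(cmdline)
    return [f"{k}={v}" for k, v in wanted.items() if active.get(k) != v]


if __name__ == "__main__":
    path = Path("/proc/cmdline")
    if path.exists():
        # Parameters taken from the post above; adjust as needed.
        wanted = {"intel_pstate": "disable", "amd_freq_sensitivity": "off"}
        print("missing:", missing_params(path.read_text(), wanted))
```

An empty list only shows that the parameters were passed, not that the kernel acted on them.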
what exactly are you using to determine this?
My Stoney Chromebook here running Pop_OS! is mostly idling between 1.8-2.0GHz per /proc/cpuinfo, but bounces between 1.3 and 2.4GHz depending on what is running in the background
This sounds like a kernel issue not a firmware one.
what exactly are you using to determine this?
I've used cpupower / htop / top or any other tool that reads /proc/cpuinfo.
I've tried with Debian / Ubuntu / Artix / Arch kernels and I'm getting the same issue.
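As a cross-check that doesn't depend on /proc/cpuinfo, the kernel's cpufreq sysfs attributes can be read directly. Tools such as auto-cpufreq run in user space on top of a kernel cpufreq driver, so scaling_driver shows which driver is actually in use. This is only a sketch: read_cpufreq is a hypothetical helper, and which attributes exist depends on the driver (values are in kHz).

```python
from __future__ import annotations

from pathlib import Path

# Standard cpufreq attributes; not every driver exposes all of them.
CPUFREQ_FIELDS = (
    "scaling_driver",
    "scaling_governor",
    "scaling_cur_freq",
    "scaling_min_freq",
    "scaling_max_freq",
)


def read_cpufreq(cpu_root: Path = Path("/sys/devices/system/cpu")) -> dict[str, dict[str, str]]:
    """Collect cpufreq attributes per CPU; files that are missing are skipped."""
    result: dict[str, dict[str, str]] = {}
    for policy_dir in sorted(cpu_root.glob("cpu[0-9]*/cpufreq")):
        info: dict[str, str] = {}
        for field in CPUFREQ_FIELDS:
            attr = policy_dir / field
            if attr.is_file():
                info[field] = attr.read_text().strip()
        result[policy_dir.parent.name] = info
    return result


if __name__ == "__main__":
    for cpu, info in read_cpufreq().items():
        print(cpu, info)
```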
Windows is reporting normal CPU speeds as well here. What version of the firmware are you running?
powertop doesn't even report 800MHz in the freq stats for me - only 1200/1400/1600 - so I'm not sure how accurate that is, since it doesn't even list the turbo boost states
FW: MrChromebox-4.21.0-8-geb419016267 (10/15/2023)
cpupower also doesn't report 800 MHz as a supported frequency. For now I've hard reset the Chromebook to escape the underclocking. dmesg doesn't report anything related to the CPU clock either.
from what I can tell this looks to be a thermal issue, what is the CPU temp when this is happening?
38°C - 65°C
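To log temperatures at the moment the clock drops, the thermal zones under /sys/class/thermal can be polled. A minimal sketch, assuming the platform exposes thermal_zone* entries there (values are in millidegrees Celsius); read_thermal_zones is a hypothetical helper, and which zone corresponds to the CPU varies by machine.

```python
from __future__ import annotations

from pathlib import Path


def read_thermal_zones(root: Path = Path("/sys/class/thermal")) -> dict[str, float]:
    """Return {"zone:type": temperature in degrees C} for each readable thermal zone."""
    temps: dict[str, float] = {}
    for zone in sorted(root.glob("thermal_zone*")):
        temp_file = zone / "temp"
        if not temp_file.is_file():
            continue
        try:
            millideg = int(temp_file.read_text().strip())
        except (OSError, ValueError):
            continue  # some zones refuse reads or return garbage
        type_file = zone / "type"
        label = type_file.read_text().strip() if type_file.is_file() else "unknown"
        temps[f"{zone.name}:{label}"] = millideg / 1000.0
    return temps


if __name__ == "__main__":
    for name, celsius in read_thermal_zones().items():
        print(f"{name}: {celsius:.1f} C")
```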
I'm at a loss, I can't reproduce and can't find anyone else with the issue either
This time I've got dmesg errors:
[12833.148139] clocksource: timekeeping watchdog on CPU1: Marking clocksource 'tsc' as unstable because the skew is too large:
[12833.148154] clocksource: 'hpet' wd_nsec: 505890273 wd_now: 89e44a wd_last: 1b5da6 mask: ffffffff
[12833.148160] clocksource: 'tsc' cs_nsec: 506839844 cs_now: 8f002d02 cs_last: 5ec157ad mask: ffffffffffffffff
[12833.148164] clocksource: Clocksource 'tsc' skewed 949571 ns (0 ms) over watchdog 'hpet' interval of 505890273 ns (505 ms)
[12833.148168] clocksource: 'tsc' is current clocksource.
[12833.148185] tsc: Marking TSC unstable due to clocksource watchdog
[12833.148213] TSC found unstable after boot, most likely due to broken BIOS. Use 'tsc=unstable'.
[12833.148214] sched_clock: Marking unstable (12833148972745, -1325832)<-(12833150600200, -2387436)
[12833.147825] clocksource: Checking clocksource tsc synchronization from CPU 0 to CPUs 1.
[12833.147852] clocksource: Switched to clocksource hpet
Not sure at this point if the problem is in the kernel too, although the clock speed goes back to normal (1.8 GHz - 2.4 GHz) from 798 MHz when the CPU utilization is below 20%.
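The skew figure in the dmesg output above can be reproduced from the two nanosecond counters the watchdog prints. A small sketch, with the two relevant log lines pasted in as a string; clock_skew is a hypothetical helper, and the parts-per-million figure is just the skew divided by the watchdog interval.

```python
from __future__ import annotations

import re

# The two counter lines from the dmesg output above (timestamps stripped).
LOG = """\
clocksource: 'hpet' wd_nsec: 505890273 wd_now: 89e44a wd_last: 1b5da6 mask: ffffffff
clocksource: 'tsc' cs_nsec: 506839844 cs_now: 8f002d02 cs_last: 5ec157ad mask: ffffffffffffffff
"""


def clock_skew(log: str) -> tuple[int, float]:
    """Return (skew in ns, skew in ppm of the watchdog interval)."""
    wd = re.search(r"wd_nsec: (\d+)", log)
    cs = re.search(r"cs_nsec: (\d+)", log)
    if wd is None or cs is None:
        raise ValueError("log does not contain wd_nsec and cs_nsec")
    wd_nsec, cs_nsec = int(wd.group(1)), int(cs.group(1))
    skew_ns = cs_nsec - wd_nsec
    return skew_ns, skew_ns / wd_nsec * 1e6


if __name__ == "__main__":
    skew_ns, ppm = clock_skew(LOG)
    print(f"skew: {skew_ns} ns ({ppm:.0f} ppm)")
```

The difference comes out to the same 949571 ns the kernel reports, which is roughly 0.19% of the 505 ms interval.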
Can confirm, I'm having the same issue with more than one distribution (tried openSUSE Tumbleweed and Debian 12), with both the default kernels and the custom kernels provided by the Chrultrabook project.
I also tried to downgrade the firmware to each version down to the July version, but to no avail.
To rule out a problem with frequency reporting, I ran a simple zstd compression benchmark on both Linux and ChromeOS (after reverting to the default firmware/image), and ChromeOS is in fact ~5-6 times faster, all other things being equal.
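A comparable CPU-bound check can be scripted so it runs identically on both systems. The sketch below is not the zstd benchmark described above: it swaps in zlib from the Python standard library as a stand-in, and the function name, data mix, and sizes are arbitrary choices for illustration.

```python
import os
import time
import zlib


def compression_throughput(size_mib: int = 8, level: int = 6, rounds: int = 3) -> float:
    """Return the best observed zlib compression throughput in MiB/s."""
    # Half random, half repetitive data so the compressor does real work.
    half = size_mib * 1024 * 1024 // 2
    pattern = b"chromebook "
    data = os.urandom(half) + pattern * (half // len(pattern))
    best = float("inf")
    for _ in range(rounds):
        start = time.perf_counter()
        zlib.compress(data, level)
        best = min(best, time.perf_counter() - start)
    return len(data) / (1024 * 1024) / best


if __name__ == "__main__":
    print(f"{compression_throughput():.1f} MiB/s")
```

Running it once in the throttled state and once after a full power-off should show whether the slowdown is real rather than a reporting glitch.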
One consistently reported difference is in the lscpu output, where CPU(s) scaling MHz is stuck at 50% on UEFI Linux and is above 100% on ChromeOS.
I'm not sure what else to test, but if there are suggestions I can debug further.
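The 50% figure is consistent with the 798 MHz reading, assuming lscpu derives "CPU(s) scaling MHz" as the current frequency relative to the reported maximum. A short worked example under that assumption, using the 1600 MHz "CPU max MHz" from the lscpu output posted later in this thread; both function names are made up for illustration.

```python
def scaling_percent(current_mhz: float, max_mhz: float) -> int:
    """Current frequency as a rounded percentage of the reported maximum."""
    return round(100 * current_mhz / max_mhz)


def implied_mhz(percent: float, max_mhz: float) -> float:
    """Invert the calculation: the frequency a given percentage would imply."""
    return percent * max_mhz / 100


if __name__ == "__main__":
    # Assumption: "CPU max MHz" is 1600, as in the lscpu output in this thread.
    print(scaling_percent(798, 1600))  # throttled state
    print(implied_mhz(119, 1600))      # frequency implied by a 119% reading
```

Since the reported maximum of 1600 MHz is below the 2.4 GHz boost clock, readings above 100% are expected whenever boost is active.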
bulldozer throttles at 60 C. Repaste the CPU and it'll clock higher
Mine does it at whatever temperature. The thermal paste on the CPU is new.
Is there any way to forcefully disable thermal protection and everything related to that in the firmware to check if that may cause the issue?
~~Adding mitigations=off to the kernel parameters seems to have fixed the issue, although it exposes multiple vulnerabilities along the way.~~ EDIT: It keeps the issue from appearing for longer, but at some point the CPU speed still drops from 2.4 GHz to 798 MHz if the usage stays at 100% for a longer while.
lscpu output after rebooting the system:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 48 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 2
On-line CPU(s) list: 0,1
Vendor ID: AuthenticAMD
Model name: AMD A4-9120C RADEON R4, 5 COMPUTE CORES 2C+3G
CPU family: 21
Model: 112
Thread(s) per core: 1
Core(s) per socket: 2
Socket(s): 1
Stepping: 0
Frequency boost: enabled
CPU(s) scaling MHz: 119%
CPU max MHz: 1600,0000
CPU min MHz: 1200,0000
BogoMIPS: 3195,13
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good acc_power nopl nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 tce nodeid_msr tbm perfctr_core perfctr_nb bpext ptsc mwaitx cpb hw_pstate ssbd vmmcall fsgsbase bmi1 avx2 smep bmi2 xsaveopt arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif overflow_recov
Virtualization features:
Virtualization: AMD-V
Caches (sum of all):
L1d: 64 KiB (2 instances)
L1i: 128 KiB (2 instances)
L2: 2 MiB (2 instances)
NUMA:
NUMA node(s): 1
NUMA node0 CPU(s): 0,1
Vulnerabilities:
Gather data sampling: Not affected
Itlb multihit: Not affected
L1tf: Not affected
Mds: Not affected
Meltdown: Not affected
Mmio stale data: Not affected
Retbleed: Vulnerable
Spec rstack overflow: Not affected
Spec store bypass: Vulnerable
Spectre v1: Vulnerable: __user pointer sanitization and usercopy barriers only; no swapgs barriers
Spectre v2: Vulnerable, STIBP: disabled, PBRSB-eIBRS: Not affected
Srbds: Not affected
Tsx async abort: Not affected
I had this exact same issue. But a different CPU architecture.
My Lenovo C13 Yoga only did 400 MHz on each core. I tried rolling back to stock firmware, but it still did some weird CPU throttling. I was sure it was a BIOS issue, with the machine unable to scale back up from power-saving mode. As a last attempt I had to open the laptop and disconnect it to enable WP. I left the lid open and the throttling was gone, regardless of the firmware used.
I repasted the CPU, as the paste on it was stale. Now it all works well. Guess what I'm trying to say is: thank you!
Does that mean I have to enable Write-Protect mode in order to fix it? (I've put new paste on both sides of the cooler again and the issue still occurs.)
I have this problem too. I can trigger it with a hard reset (refresh+power); after that:
stefan@debian:~$ cat /proc/cpuinfo | grep Hz
cpu MHz : 798.523
cpu MHz : 798.498
stefan@debian:~$ cat /proc/cpuinfo | grep Hz
cpu MHz : 798.507
cpu MHz : 798.514
stefan@debian:~$ cat /proc/cpuinfo | grep Hz
cpu MHz : 798.536
cpu MHz : 798.504
A reboot doesn't help; to fix it you must shut down and wait 10-15 seconds. After that:
stefan@debian:~$ cat /proc/cpuinfo | grep Hz
cpu MHz : 2183.506
cpu MHz : 2178.905
stefan@debian:~$ cat /proc/cpuinfo | grep Hz
cpu MHz : 2395.613
cpu MHz : 2395.610
stefan@debian:~$ cat /proc/cpuinfo | grep Hz
cpu MHz : 2395.633
cpu MHz : 2395.572
stefan@debian:~$ cat /proc/cpuinfo | grep Hz
cpu MHz : 1197.809
cpu MHz : 1197.809
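The manual checks above can be automated. A minimal sketch that pulls the "cpu MHz" values out of /proc/cpuinfo text and flags the stuck state; the 798 MHz target and the 5 MHz tolerance are assumptions based on the readings in this thread, and both function names are hypothetical.

```python
from __future__ import annotations

import re


def cpu_mhz(cpuinfo_text: str) -> list[float]:
    """Extract every 'cpu MHz' value from /proc/cpuinfo-style text."""
    pattern = r"^cpu MHz\s*:\s*([\d.]+)"
    return [float(m) for m in re.findall(pattern, cpuinfo_text, re.MULTILINE)]


def looks_stuck(samples: list[list[float]], stuck_mhz: float = 798.0,
                tolerance: float = 5.0) -> bool:
    """True if every reading in every sample sits within tolerance of stuck_mhz."""
    readings = [mhz for sample in samples for mhz in sample]
    return bool(readings) and all(abs(mhz - stuck_mhz) <= tolerance for mhz in readings)


if __name__ == "__main__":
    import time
    from pathlib import Path

    path = Path("/proc/cpuinfo")
    if path.exists():
        samples = []
        for _ in range(3):
            samples.append(cpu_mhz(path.read_text()))
            time.sleep(1)
        print("stuck at ~798 MHz:", looks_stuck(samples))
```

Taking several samples matters because a healthy CPU can briefly pass through low frequencies while idle.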
My chromebook:
stefan@debian:~$ neofetch
_,met$$$$$gg. stefan@debian
,g$$$$$$$$$$$$$$$P. -------------
,g$$P" """Y$$.". OS: Debian GNU/Linux 12 (bookworm) x86_64
,$$P' `$$$. Host: Barla rev6
',$$P ,ggs. `$$b: Kernel: 6.5.6-stoney
`d$$' ,$P"' . $$$ Uptime: 16 mins
$$P d$' , $$P Packages: 1557 (dpkg)
$$: $$. - ,d$$' Shell: bash 5.2.15
$$; Y$b._ _,d$P' Resolution: 1366x768
Y$$. `.`"Y$$$$P"' DE: Xfce 4.18
`$$b "-.__ WM: Xfwm4
`Y$$ WM Theme: Default
`Y$$. Theme: Xfce [GTK2]
`$$b. Icons: Tango [GTK2]
`Y$$b. Terminal: xfce4-terminal
`"Y$b._ Terminal Font: Monospace 12
`""" CPU: AMD A4-9120C RADEON R4 2C+3G (2) @ 1.600GHz
GPU: AMD ATI Radeon R2/R3/R4/R5 Graphics
Memory: 1207MiB / 3632MiB
The GPU clock is also affected (600 MHz -> 200 MHz). Not sure about the RAM speed.
I've tried switching to the AMD P-State driver, but no luck; throttling still occurs after an hour.
These Chromebooks aren't really designed for heavy duty tasks such as gaming. The cooling solution is minimal and over a short amount of time it may overheat and start throttling.
I have tried many different kernels and distribution flavours, and none of them seem to change the outcome.
Other vendors have released BIOS updates to make the switch back to normal operating mode happen. For now, using the refresh+power button worked for me too. It only works temporarily, until the next time it overheats, but it is easier than removing the battery.
These Chromebooks aren't really designed for heavy duty tasks such as gaming.
I'm using it for web browsing and SSH/terminal stuff.
The cooling solution is minimal and over a short amount of time it may overheat and start throttling.
The problem occurs after leaving it open and idle for a longer while. The throttling can be cleared by putting the Chromebook into a short sleep (1-3 seconds or longer), after which the throttling is gone.
If this were a normal type of throttling, the CPU speed would stay at 800 MHz after suspend.
I've tried running Windows 11 on it: it lowers the CPU frequency by 0.01 GHz whenever the CPU load is above 80% but below 100%, then quickly goes back to 2.4 GHz. After some time it drops to 0.8 GHz and the system misreports the CPU load as a percentage on a scale between 0 GHz and 1.60 GHz. It looks like this while the CPU load should be 100%:
A few minutes earlier the behavior was normal:
Similar behavior occurred on ChromeOS Flex (not stock ChromeOS) during the CPU test or while playing a Full HD video, except the CPU load was shown as 100% while the CPU speed was 1.2 GHz.
After going back to stock ChromeOS with the stock ROM the problem still occurred, but after 2 restarts it was gone.
No idea how I can test it further, since I can't control the CPU frequency. I had another PC with similar behavior, and disabling CPU frequency boost in the BIOS fixed the issue there, but Coreboot on the Chromebook doesn't offer that option.
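On Linux, boost can sometimes be switched off at runtime instead of in the BIOS. With the acpi-cpufreq driver the kernel exposes /sys/devices/system/cpu/cpufreq/boost; the sketch below assumes that file is present on this machine (the "Frequency boost: enabled" line in the lscpu output above suggests it is) and that it is run as root. Whether this changes the 798 MHz behavior is untested; boost_enabled and set_boost are hypothetical helper names.

```python
from __future__ import annotations

from pathlib import Path

# Global boost switch exposed by acpi-cpufreq; other drivers may not provide it.
BOOST_FILE = Path("/sys/devices/system/cpu/cpufreq/boost")


def boost_enabled(path: Path = BOOST_FILE) -> bool | None:
    """Return the boost state, or None if the driver does not expose the file."""
    if not path.is_file():
        return None
    return path.read_text().strip() == "1"


def set_boost(enabled: bool, path: Path = BOOST_FILE) -> None:
    """Write 1 or 0 to the boost file (needs root on a real system)."""
    path.write_text("1" if enabled else "0")


if __name__ == "__main__":
    print("boost enabled:", boost_enabled())
    # To try running without boost (as root): set_boost(False)
```

The setting does not persist across reboots, so it is easy to revert if it makes no difference.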
I also have a barla with the exact same issue. I am running Fw Ver: MrChromebox-4.21.1 on Arch Linux, Kernel: 6.6.16-stoney, CPU: AMD A4-9120C RADEON R4 2C+3G (2)
Doing systemctl poweroff and waiting 15 seconds resolves the issue.
I also have a G6 11A EE. I have been pulling my hair out, building custom kernels, installing debian sid, stable, oldstable, oldoldstable... thinking it was a kernel issue. At one point I also thought it was solved by disabling CPU mitigations, or compiling the kernel without them. I compiled a kernel with all intel eist / p4 clockmod (any intel power drivers) and intel microcode removed. I was almost sure this did the trick and benchmarks confirmed it. I was careful to not change the kernel or any firmware versions. And my system kept freaking slowing down to quite honestly an unusable miserable speed. I use lightweight DEs, XFCE or LXDE. Sometimes I even use Seamonkey instead of Firefox, when I can anyways. And this laptop was still miserable to use.
The posts above are correct though. If you power the machine OFF completely, let it sit a minute (>15 seconds, as stated), and power it on by opening the lid, it will boot up with proper CPU scaling.
I used openarena as a performance test, since it is in all debian versions going back to kernel 4.x even, and when this bug occurs it jitters quite a bit, even in 640x480 windowed all low settings. However, shut the box down, leave off 30 seconds & open lid, and it will run the game @ 1366x768 32 bit all settings high or maxed out, bloom and flare on, even 2x antialiasing and it runs at what looks to be a silky smooth 60 FPS even with the governor set to "powersave"
Since I've had this problem on kernels ranging from 4.x to the current 6.6, wouldn't this be a firmware issue? Thank you for all your work, it is incredibly awesome that you've enabled so many of these machines to avoid their destiny in the landfills. You seriously rock, and I have a ton of respect, Mr Chromebox. Thank you for doing what you do.
so firmware wise, there hasn't been much change to Stoneyridge in quite some time, though there was a change to the SMU (power mgmt) firmware back in Oct 2023. So this change likely would have affected the 4.22 firmware as well as the current 2405 firmware. If someone wants to test a previous release (4.20 has the old SMU firmware) we can rule that in/out pretty easily. Or I can do a test build of 2405 with the old SMU firmware
I'll gladly try the previous firmware, would love to be of any help. Can you point me towards it? I installed what I'm using now /w your script from inside Chrome OS.
drop me an email or ping me on Discord and I'll sort you out to flash an older build / test version
i would like to try the older firmware as well
there's no reason for multiple people to try the same thing, especially when it didn't work for the first person