Write to unrecognized MSR on Linux 5.9

Open cedws opened this issue 3 years ago • 32 comments

I recently updated to kernel 5.9 and started seeing the following in the kernel log, which seems to be caused by throttled:

[  140.603433] msr: Write to unrecognized MSR 0x1a2 by python3
               Please report to x86@kernel.org
[  140.603516] msr: Write to unrecognized MSR 0x64b by python3
               Please report to x86@kernel.org

The article here seems to suggest these logs are just for information purposes and there shouldn't be any impact on userspace programs. Nevertheless, I wanted to open this issue for tracking this and maybe putting something in the README.
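To see which registers and processes are behind these messages, the log lines above can be tallied with a few lines of Python. This is only a sketch: the function name `count_msr_writes` is hypothetical, and the regular expression assumes the message format shown in the excerpt above.

```python
import re
from collections import Counter

# Assumes the message format from the log excerpt above.
MSR_WRITE_RE = re.compile(
    r"msr: Write to unrecognized MSR (0x[0-9a-fA-F]+) by (\S+)"
)


def count_msr_writes(log_text):
    """Count (msr_address, process_name) pairs reported in kernel log text."""
    counts = Counter()
    for match in MSR_WRITE_RE.finditer(log_text):
        address = int(match.group(1), 16)  # e.g. "0x1a2" -> 418
        counts[(address, match.group(2))] += 1
    return counts
```

The text can come from any saved copy of the kernel log; printing the keys with `hex()` gives the addresses back in the form the kernel reports them.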

cedws avatar Sep 22 '20 20:09 cedws

I guess that in the near future the kernel will disable MSR writes from user space for good :(

erpalma avatar Sep 23 '20 06:09 erpalma

That seems to be the end goal yeah. There'll probably be a kernel parameter to disable the lockdown measures... but then there's the question of whether you really want to do that. I'm sure somebody else will raise this with them.

cedws avatar Sep 23 '20 11:09 cedws

You have to add this to grub:

GRUB_CMDLINE_LINUX="msr.allow_writes=on"

and update grub ;)
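After rebooting, it may be worth confirming that the parameter actually reached the kernel. A minimal sketch, assuming the running command line is read from `/proc/cmdline`; the helper names `msr_writes_setting` and `read_cmdline` are hypothetical:

```python
def msr_writes_setting(cmdline):
    """Return the value of msr.allow_writes in a kernel command line, or None."""
    value = None
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if key == "msr.allow_writes" and sep:
            value = val  # this sketch keeps the last occurrence it finds
    return value


def read_cmdline(path="/proc/cmdline"):
    """Read the command line the running kernel was booted with."""
    with open(path) as handle:
        return handle.read()
```

Calling `msr_writes_setting(read_cmdline())` should give `"on"` if the GRUB change took effect and `None` if the parameter is missing.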

eXt73 avatar Oct 12 '20 12:10 eXt73

Adding msr.allow_writes=on to the kernel cmdline definitely works for now (tested on 5.9.0). However, writing to MSRs will probably become impossible in the future.

Backporting the MSR-driver to have the current behaviour on newer kernels OR a reimplementation of throttled as a kernel-module might be the only way in the midterm.

Thoughts?

gladiac avatar Oct 13 '20 13:10 gladiac

In such a situation, I will modify the kernel code so that it is still possible with this 'flag': msr.allow_writes=on. And of course I will make such built and optimized kernels available, as with all my builds at Netext'73: https://www.netext73.pl/

... or I will extend my systemd service and bypass these limitations ;)

Lenovo 720s-14IKB

https://www.dropbox.com/s/6pow72x9xf19fi1/Screenshot_20201013_172706.png?dl=0

https://www.dropbox.com/s/7wd1e0fnq04f0ff/Screenshot_20201013_174038.png?dl=0

eXt73 avatar Oct 13 '20 15:10 eXt73

How about, instead of bypassing something which we added there for a good reason, you guys work with us?

For example, the 0x1a2 MSR is accessible to userspace through the drivers/thermal/intel/int340x_thermal/processor_thermal_device.c driver. On machines which have that hw, there should be a sysfs file called "tcc_offset_degree_celsius" which gives you the TCC activation offset. We're open to suggestions how to extend that interface so that your tool can read it from sysfs instead of poking at MSRs.
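As a rough illustration of reading that value from sysfs instead of the MSR, here is a minimal Python sketch. The exact location of the file is an assumption (it depends on the hardware), so the code searches a directory tree for the file name mentioned above; `find_tcc_offset` is a hypothetical helper, and it also assumes the file holds a plain decimal integer.

```python
import os

TCC_FILE = "tcc_offset_degree_celsius"  # sysfs file name mentioned above


def find_tcc_offset(root="/sys/devices"):
    """Search root for the TCC offset file; return (path, value) or None.

    Assumes the file contains a plain decimal integer.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        if TCC_FILE in filenames:
            path = os.path.join(dirpath, TCC_FILE)
            with open(path) as handle:
                return path, int(handle.read().strip())
    return None
```

On machines without the hardware the function returns `None`; reading the file may also need elevated privileges, in which case `open` raises an `OSError` that the caller would have to handle.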

The other MSR above is MSR_CONFIG_TDP_CONTROL and the kernel uses it in a bunch of places. It looks like throttled wants to set cTDP so exposing that functionality in sysfs shouldn't be a big deal AFAICT.

So let's do this right please and stop poking at the naked MSRs because it is a very bad idea.

Thx.

bp3tk0v avatar Oct 20 '20 10:10 bp3tk0v

How about, instead of bypassing something which we added there for a good reason, you guys work with us?

This tool started just as a simple way "to fix my own pc". I agree with you that the right decision would be to use specific sysfs instead of raw MSRs. I would be very glad to upgrade this tool if you are going to help us by submitting the required patches for the kernel.

erpalma avatar Oct 21 '20 08:10 erpalma

This tool started just as a simple way "to fix my own pc". I agree with you that the right decision would be to use specific sysfs instead of raw MSRs. I would be very glad to upgrade this tool if you are going to help us by submitting the required patches for the kernel.

Cool, I'd be glad to.

So how about you send a mail to x86-at-kernel.org (replace the "-at-" with you know what :)) with what exactly you'd like to read out/program from/to which MSRs and I'll CC the relevant people and we'll start the ball rolling. From initial staring, some of the info you need we export already - it'll just need to be extended/designed properly so yours and other tools can use it too.

Thx.

bp3tk0v avatar Oct 21 '20 09:10 bp3tk0v

The workaround is not working for me.

I added msr.allow_writes=on to my boot string and rebuilt Grub (I did both those things using the Grub Customizer program), then rebooted. My log is still being flooded with the message at issue.

Mint 20 x64 Cinnamon, kernel 5.9.1-050901-generic. My full kernel/boot string: acpi=force cpuidle.governor=teo i915.enable_fbc=1 i915.fastboot=1 pcie_aspm=force mitigations=off psmouse.synaptics_intertouch=1 quiet reboot=w splash msr.allow_writes=on. Computer: X1CG6.

I will try with the 5.9.0 kernel. EDIT: on the 5.9.0 kernel, as against 5.9.1, the workaround does stop the log flood.

LinuxOnTheDesktop avatar Oct 22 '20 03:10 LinuxOnTheDesktop

@bp3tk0v I guess something is already moving at the kernel ML!

erpalma avatar Oct 22 '20 07:10 erpalma

The workaround is not working for me. I will try with the 5.9.0 kernel. EDIT: on the 5.9.0 kernel, as against 5.9.1, the workaround does stop the log flood.

This thing must be messed up in the kernel you are using - see my screen - everything flashes under my builds ... I am even thinking about modifying the kernel code, but for now it is enough to add the parameter to grub ... under my build 5.9.1:

https://www.dropbox.com/s/xzkxf9qtuyfqokc/Screenshot_20201022_093136.png?dl=0

eXt73 avatar Oct 22 '20 07:10 eXt73

@bp3tk0v I guess something is already moving at the kernel ML!

Yeah, that's me poking at people to get this thing moving. Thus it will be important if you give your requirements about what you want to access through the MSRs so that you can use that interface in your tool too.

Thx.

bp3tk0v avatar Oct 22 '20 09:10 bp3tk0v

This issue also affects the performance of USB devices connected downstream on a USB 2.0 and 3.0 hub

grealish avatar Nov 25 '20 11:11 grealish

This issue also affects the performance of USB devices connected downstream on a USB 2.0 and 3.0 hub

How so? I'm very sceptical it does anything but please elaborate.

bp3tk0v avatar Nov 25 '20 11:11 bp3tk0v

Not really related to this repo, but I will leave it here:

[    5.832950] msr: Write to unrecognized MSR 0x17f by mcelog
               Please report to x86@kernel.org
  • Fedora: 5.9.11-200.fc33.x86_64
  • Wayland
  • amdgpu

dzintars avatar Dec 04 '20 23:12 dzintars

[ 5.832950] msr: Write to unrecognized MSR 0x17f by mcelog

We have a fix queued:

https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?h=ras/core&id=68299a42f84288537ee3420c431ac0115ccb90b1

and mcelog will stop poking at that MSR after that.

HTH.

bp3tk0v avatar Dec 04 '20 23:12 bp3tk0v

The workaround is not working for me.

I added msr.allow_writes=on to my boot string and rebuilt Grub (I did both those things using the Grub Customizer program), then rebooted. My log is still being flooded with the message at issue.

Mint 20 x64 Cinnamon, kernel 5.9.1-050901-generic. My full kernel/boot string: acpi=force cpuidle.governor=teo i915.enable_fbc=1 i915.fastboot=1 pcie_aspm=force mitigations=off psmouse.synaptics_intertouch=1 quiet reboot=w splash msr.allow_writes=on. Computer: X1CG6.

I will try with the 5.9.0 kernel. EDIT: on the 5.9.0 kernel, as against 5.9.1, the workaround does stop the log flood.

This works for me on Arch, X1CG7, systemd-boot

Also (sorry for going off-topic here) TIL about cpuidle.governor=teo which solves a high CPU load issue that I've had for a long time with UAC-2 devices on my X1CG7 :bow: https://bbs.archlinux.org/viewtopic.php?pid=1924581

gmasgras avatar Dec 18 '20 05:12 gmasgras

@eXt73: thank you for your post above. Using the boot switch workaround, and on kernel 5.9.16, I can confirm that the throttle software works - well, unless the following new error (new to that kernel) is relevant.

alsactl[887]: alsa-lib main.c:1021:(snd_use_case_mgr_open) error: failed to import hw:0 (empty configuration)

Linux Mint 20 x64 Cinnamon

LinuxOnTheDesktop avatar Dec 29 '20 06:12 LinuxOnTheDesktop

@LinuxOnTheDesktop that's related to ALSA, which is a sound card framework.

erpalma avatar Dec 29 '20 10:12 erpalma

Hello (and sorry to moan)

On kernel 5.10.25-051025-generic I still see many instances of msr: Write to unrecognized MSR 0x1a2 by python3 despite having booted with lsm=capability,yama (my /sys/kernel/security/lsm comprising: 'lockdown,capability,yama').

My OS: Mint Cinnamon 20.1. My version of throttled: the latest, got from git just now.

LinuxOnTheDesktop avatar Mar 26 '21 03:03 LinuxOnTheDesktop

@bp3tk0v Do you know if there is any progress on exposing these knobs through sysfs? The LKML thread ended in October. Seems like energy_perf_bias now exists and the in-tree utilities have migrated to that, but that's all I see now.

angelsl avatar Apr 06 '21 15:04 angelsl

Well, there were some good ideas at the end of that thread:

https://lore.kernel.org/lkml/[email protected]/T/#mc47d8b97df049bc62001ceeeb315c1bdb6f35ff6

but someone needs to actually try them. :-\ For example, I'd take an undervolting driver into the kernel any day of the week if it is done somewhat sanely. And it doesn't have to be perfect - we can always improve it incrementally like we always do.

bp3tk0v avatar Apr 08 '21 10:04 bp3tk0v

On kernel 5.10.25-051025-generic I still see many instances of msr: Write to unrecognized MSR 0x1a2 by python3

Does this explain it: https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/about ?

bp3tk0v avatar Apr 08 '21 10:04 bp3tk0v

@bp3tk0v: that page does explain things. For, that page tells me - as someone might have told me already - that our erpalma is, er, working to fix this problem (and it is a problem, for logs need to be useful and disks should not be written to if one can avoid it).

LinuxOnTheDesktop avatar Apr 08 '21 12:04 LinuxOnTheDesktop

Oh they're useful.

bp3tk0v avatar Apr 08 '21 13:04 bp3tk0v

I was just wondering if there has been any movement on this?

neil1969 avatar Oct 01 '21 15:10 neil1969

I cannot cope with this log spamming; I have disabled the throttled service (sudo systemctl disable --now lenovo_fix.service).

EDIT: there is hope - see my post below.

LinuxOnTheDesktop avatar Oct 04 '21 00:10 LinuxOnTheDesktop

I find that, on Mint 20, kernel 5.11, and the latest git version of the throttle-fix, the log spamming is stopped by this boot switch: msr.allow_writes=on.

LinuxOnTheDesktop avatar Oct 28 '21 05:10 LinuxOnTheDesktop

That's quite strange since I already write that param at runtime...
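For context, setting a module parameter at runtime usually means writing to its file under `/sys/module`. The sketch below shows that idea only; it is not taken from throttled's source, the path `/sys/module/msr/parameters/allow_writes` is an assumption based on the usual sysfs layout for module parameters, and `set_msr_allow_writes` is a hypothetical name.

```python
# Assumed location, following the usual /sys/module/<module>/parameters layout.
DEFAULT_PARAM_PATH = "/sys/module/msr/parameters/allow_writes"


def set_msr_allow_writes(value="on", param_path=DEFAULT_PARAM_PATH):
    """Try to set the msr allow_writes parameter; return True on success."""
    try:
        with open(param_path, "w") as handle:
            handle.write(value)
        return True
    except OSError:
        # Missing file (module not loaded) or insufficient privileges.
        return False
```

A `False` result would mean the runtime route is unavailable, which leaves the boot parameter as the fallback.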

erpalma avatar Oct 30 '21 08:10 erpalma

@erpalma

Thanks. Do you mean that Throttled adjusts one's kernel boot switches? Or do you mean something else?

LinuxOnTheDesktop avatar Oct 30 '21 11:10 LinuxOnTheDesktop