I/O Error: fatal('Unable to read to MSR {:x}. Unknown error.'.format(msr))

Open DrumSlayers opened this issue 3 years ago • 16 comments

Hi, the issue you fixed in https://github.com/erpalma/throttled/issues/236 is still happening on my side: kernel 5.10, Arch Linux, HP Pavilion Power CB035NF, LSM lockdown removed. I commented out the HWP_Mode line in the config, but it changes nothing.

rdmsr -a 0x774
rdmsr: CPU 0 cannot read MSR 0x00000774

Same on systemctl status

[drumslayer@DrumSlayer-LaptopHP-Manjaro throttled]$ sudo systemctl status lenovo_fix
● lenovo_fix.service - Stop Intel throttling
     Loaded: loaded (/etc/systemd/system/lenovo_fix.service; enabled; vendor preset: disabled)
     Active: failed (Result: exit-code) since Sat 2021-05-08 00:43:06 CEST; 3s ago
    Process: 28986 ExecStart=/opt/lenovo_fix/venv/bin/python3 /opt/lenovo_fix/lenovo_fix.py (code=exited, status=1/FAILURE)
   Main PID: 28986 (code=exited, status=1/FAILURE)

mai 08 00:43:06 DrumSlayer-LaptopHP-Manjaro python3[28986]:     main()
mai 08 00:43:06 DrumSlayer-LaptopHP-Manjaro python3[28986]:   File "/opt/lenovo_fix/lenovo_fix.py", line 933, in main
mai 08 00:43:06 DrumSlayer-LaptopHP-Manjaro python3[28986]:     set_hwp(config.getboolean('AC', 'HWP_Mode', fallback=False))
mai 08 00:43:06 DrumSlayer-LaptopHP-Manjaro python3[28986]:   File "/opt/lenovo_fix/lenovo_fix.py", line 630, in set_hwp
mai 08 00:43:06 DrumSlayer-LaptopHP-Manjaro python3[28986]:     cur_val = readmsr('IA32_HWP_REQUEST', cpu=0)
mai 08 00:43:06 DrumSlayer-LaptopHP-Manjaro python3[28986]:   File "/opt/lenovo_fix/lenovo_fix.py", line 263, in readmsr
mai 08 00:43:06 DrumSlayer-LaptopHP-Manjaro python3[28986]:     fatal('Unable to read to MSR {:x}. Unknown error.'.format(msr))
mai 08 00:43:06 DrumSlayer-LaptopHP-Manjaro python3[28986]: ValueError: Unknown format code 'x' for object of type 'str'
mai 08 00:43:06 DrumSlayer-LaptopHP-Manjaro systemd[1]: lenovo_fix.service: Main process exited, code=exited, status=1/FAILURE
mai 08 00:43:06 DrumSlayer-LaptopHP-Manjaro systemd[1]: lenovo_fix.service: Failed with result 'exit-code'.

I have the latest version of lenovo_fix, and the HWP_Mode line is commented out in the config (/etc/lenovo_fix.cfg).

What can I do? Thanks

DrumSlayers avatar May 08 '21 20:05 DrumSlayers

I forgot to update the calls to fatal in writemsr and readmsr functions. Now it should be fixed.
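For readers following along, here is a minimal sketch of why the original call raised `ValueError` and one way the message can be built when the MSR is passed by name. The `MSR_ADDRESSES` table and the `describe_msr` helper are hypothetical names used only for illustration; they are not taken from lenovo_fix.py.

```python
# Hypothetical stand-in for the tool's MSR name-to-address table.
MSR_ADDRESSES = {'IA32_HWP_REQUEST': 0x774}


def describe_msr(msr):
    """Return a printable label for an MSR given by name (str) or address (int)."""
    if isinstance(msr, str):
        # Names cannot be formatted with 'x', so look up the address first.
        return '{} ({:x})'.format(msr, MSR_ADDRESSES[msr])
    return '{:x}'.format(msr)


# The 'x' format code only works for integers, so a register name triggers
# the ValueError seen in the journal output.
try:
    '{:x}'.format('IA32_HWP_REQUEST')
except ValueError as exc:
    print(exc)

print('Unable to read to MSR {}. Unknown error.'.format(describe_msr('IA32_HWP_REQUEST')))
```

The point is only that the error path has to accept both names and numeric addresses, otherwise the formatting failure hides the real I/O error.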

erpalma avatar May 10 '21 18:05 erpalma

> I forgot to update the calls to fatal in writemsr and readmsr functions. Now it should be fixed.

Hi, thanks for the fix. It did fix the issue mentioned, but it's still not working :/

[drumslayer@DrumSlayer-LaptopHP-Manjaro throttled]$ sudo systemctl status lenovo_fix.service
● lenovo_fix.service - Stop Intel throttling
     Loaded: loaded (/etc/systemd/system/lenovo_fix.service; enabled; vendor preset: disabled)
     Active: failed (Result: exit-code) since Wed 2021-05-12 18:46:27 CEST; 3s ago
    Process: 42985 ExecStart=/opt/lenovo_fix/venv/bin/python3 /opt/lenovo_fix/lenovo_fix.py (code=exited, status=1/FAILURE)
   Main PID: 42985 (code=exited, status=1/FAILURE)

mai 12 18:46:27 DrumSlayer-LaptopHP-Manjaro systemd[1]: Started Stop Intel throttling.
mai 12 18:46:27 DrumSlayer-LaptopHP-Manjaro python3[42985]: [I] Detected CPU architecture: Intel KabylakeG
mai 12 18:46:27 DrumSlayer-LaptopHP-Manjaro python3[42985]: [I] Trying to unlock MSR allow_writes.
mai 12 18:46:27 DrumSlayer-LaptopHP-Manjaro python3[42985]: [I] Loading config file.
mai 12 18:46:27 DrumSlayer-LaptopHP-Manjaro python3[42985]: [W] No valid Sysfs_Power_Path found! Trying upower method #1
mai 12 18:46:27 DrumSlayer-LaptopHP-Manjaro python3[42985]: [W] Trying upower method #2
mai 12 18:46:27 DrumSlayer-LaptopHP-Manjaro python3[42985]: [E] Unable to read to MSR IA32_HWP_REQUEST (774). Unknown error.
mai 12 18:46:27 DrumSlayer-LaptopHP-Manjaro systemd[1]: lenovo_fix.service: Main process exited, code=exited, status=1/FAILURE
mai 12 18:46:27 DrumSlayer-LaptopHP-Manjaro systemd[1]: lenovo_fix.service: Failed with result 'exit-code'.

./lenovo_fix --debug:

[drumslayer@DrumSlayer-LaptopHP-Manjaro throttled]$ sudo ./lenovo_fix.py --debug
[I] Detected CPU architecture: Intel KabylakeG
[I] Trying to unlock MSR allow_writes.
[I] Loading config file.
[W] No valid Sysfs_Power_Path found! Trying upower method #1
[W] Trying upower method #2
[D] cpu platform info: maximum non turbo ratio = 25
[D] cpu platform info: maximum efficiency ratio = 8
[D] cpu platform info: minimum operating ratio = 8
[D] cpu platform info: feature ppin cap = 0
[D] cpu platform info: feature programmable turbo ratio = 1
[D] cpu platform info: feature programmable tdp limit = 1
[D] cpu platform info: number of additional tdp profiles = 1
[D] cpu platform info: feature programmable temperature target = 1
[D] cpu platform info: feature low power mode = 1
[D] Undervolt plane CORE - write 0 mV (0x0) - read 0 mV (0x0) - match OK
[D] Undervolt plane GPU - write 0 mV (0x0) - read 0 mV (0x0) - match OK
[D] Undervolt plane CACHE - write 0 mV (0x0) - read 0 mV (0x0) - match OK
[D] Undervolt plane UNCORE - write 0 mV (0x0) - read 0 mV (0x0) - match OK
[D] Undervolt plane ANALOGIO - write 0 mV (0x0) - read 0 mV (0x0) - match OK
[E] Unable to read to MSR IA32_HWP_REQUEST (774). Unknown error.

I'm actually lost...

DrumSlayers avatar May 12 '21 16:05 DrumSlayers

That MSR is just not readable, it could be caused either by the kernel or BIOS.

erpalma avatar May 13 '21 18:05 erpalma

> That MSR is just not readable, it could be caused either by the kernel or BIOS.

Yeah, I trust you, but lenovo_fix was actually working before I updated it.

DrumSlayers avatar May 13 '21 23:05 DrumSlayers

Please try running rdmsr -a 0x774

erpalma avatar May 14 '21 12:05 erpalma

Lol, I'm seeing this same thing. Guess I gotta downgrade again 😞

I honestly don't understand how this can break so frequently. I'm a few hair pulls from just checking out git.

goodboy avatar May 25 '21 02:05 goodboy

I had to downgrade to version 0.7-1 (not that it's the latest working one; it just happens to be the one with a cached pkg on my machine) to get it working again.

For your debugging purposes:

 >>> sudo rdmsr -a 0x774
rdmsr: CPU 0 cannot read MSR 0x00000774

running the latest version on arch gives:

>>> sudo ./lenovo_fix.py  --debug
[I] Detected CPU architecture: Intel Broadwell-U
[I] Trying to unlock MSR allow_writes.
[I] Loading config file.
[D] cpu platform info: maximum non turbo ratio = 24
[D] cpu platform info: maximum efficiency ratio = 5
[D] cpu platform info: minimum operating ratio = 5
[D] cpu platform info: feature ppin cap = 0
[D] cpu platform info: feature programmable turbo ratio = 1
[D] cpu platform info: feature programmable tdp limit = 1
[D] cpu platform info: number of additional tdp profiles = 1
[D] cpu platform info: feature programmable temperature target = 1
[D] cpu platform info: feature low power mode = 1
[D] Undervolt plane CORE - write 0 mV (0x0) - read 0 mV (0x0) - match OK
[D] Undervolt plane GPU - write 0 mV (0x0) - read 0 mV (0x0) - match OK
[D] Undervolt plane CACHE - write 0 mV (0x0) - read 0 mV (0x0) - match OK
[D] Undervolt plane UNCORE - write 0 mV (0x0) - read 0 mV (0x0) - match OK
[D] Undervolt plane ANALOGIO - write 0 mV (0x0) - read 0 mV (0x0) - match OK
Traceback (most recent call last):
  File "/usr/lib/throttled/./lenovo_fix.py", line 251, in readmsr
    val = struct.unpack('Q', os.read(f, 8))[0]
OSError: [Errno 5] Input/output error

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/throttled/./lenovo_fix.py", line 991, in <module>
    main()
  File "/usr/lib/throttled/./lenovo_fix.py", line 933, in main
    set_hwp(config.getboolean('AC', 'HWP_Mode', fallback=True))
  File "/usr/lib/throttled/./lenovo_fix.py", line 635, in set_hwp
    read_value = readmsr('IA32_HWP_REQUEST', from_bit=24, to_bit=31)[0]
  File "/usr/lib/throttled/./lenovo_fix.py", line 263, in readmsr
    fatal('Unable to read to MSR {:x}. Unknown error.'.format(msr))
ValueError: Unknown format code 'x' for object of type 'str'

I don't get how in different versions you can add reads that fail which didn't before. Should there not be some kind of automatic fallback list?

goodboy avatar May 25 '21 03:05 goodboy

The exception you are reporting has been fixed in 1f4f552.

Fallback? The MSR at address 0x774 is required for (some part of) this tool to work. If you can't read that MSR then you have some kind of limitation on your system (new kernel, new bios, new EC firmware, etc.). There is not much I can do sorry!

erpalma avatar May 25 '21 13:05 erpalma

> Fallback? The MSR at address 0x774 is required for (some part of) this tool to work. If you can't read that MSR then you have some kind of limitation on your system (new kernel, new bios, new EC firmware, etc.). There is not much I can do sorry!

@erpalma this clearly is not true, otherwise why would I report that 0.7-1 works?

I do run a cutting-edge (ish) kernel on Arch Linux. I don't think it's that wild to require a version that can cope with that, considering most users even bothering to figure out how to use this service are probably using distros near that bleeding edge.

goodboy avatar May 25 '21 14:05 goodboy

> is required for (some part of)

I think we need a patch to make this work in the cases where that reg can't be read because as of right now the de-throttling case I'm after is working great as long as I don't use the newer code.

I'm happy to do testing / make a PR to get this going.

goodboy avatar May 25 '21 14:05 goodboy

OK, I see what you mean. You are not using HWP_Mode, right? Then you should comment out the line in the config file. In recent versions we added code for restoring the HWP value to default when it is set to False in the config.

I've updated the default config file in master now.

erpalma avatar May 25 '21 15:05 erpalma

I guess I could add a routine to automatically test the ability to read/write each MSR.
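A rough sketch of what the read half of such a probe could look like, assuming the MSR is exposed through the Linux `/dev/cpu/N/msr` device. The function name `can_read_msr` and the `path_template` parameter are made up for illustration and are not part of the tool.

```python
import os
import struct


def can_read_msr(address, cpu=0, path_template='/dev/cpu/{}/msr'):
    """Return True if 8 bytes can be read at `address` from the MSR device."""
    try:
        fd = os.open(path_template.format(cpu), os.O_RDONLY)
    except OSError:
        # Device node missing (msr module not loaded) or insufficient permissions.
        return False
    try:
        data = os.pread(fd, 8, address)
        struct.unpack('Q', data)  # raises struct.error on a short read
        return True
    except (OSError, struct.error):
        # For example Errno 5 (Input/output error), as in the traceback above.
        return False
    finally:
        os.close(fd)


# 0x774 is the IA32_HWP_REQUEST address discussed in this thread.
print(can_read_msr(0x774))
```

A probe like this would let the tool skip a feature whose register is unreadable instead of exiting, but whether a failed read should be fatal is still a design decision for each feature.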

erpalma avatar May 25 '21 15:05 erpalma

Something must have changed after this issue was seemingly fixed, since I encountered the very same error message on the most recent version of throttled currently available in Arch Linux repositories. Adding a print("HWP_Mode = " + str(performance_mode)) line revealed that it gets changed to False, even when the value is not defined in the config file (in which case it's supposed to fall back to None).
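One possible explanation, sketched below with the standard `configparser` module: if the call still passes `fallback=False` (as in the traceback earlier in this thread), a commented-out option yields False rather than None, so an `is None` check never triggers. The config text here is a made-up fragment, not the shipped default file.

```python
import configparser

# Made-up config fragment with the HWP_Mode line commented out.
CONFIG_TEXT = """
[AC]
# HWP_Mode: False
"""

config = configparser.ConfigParser()
config.read_string(CONFIG_TEXT)

# The option is absent, so getboolean() simply returns whatever fallback it was given.
with_false = config.getboolean('AC', 'HWP_Mode', fallback=False)
with_none = config.getboolean('AC', 'HWP_Mode', fallback=None)

print(with_false, with_none)  # prints: False None
```

Under that assumption, only `fallback=None` lets the caller distinguish "not configured" from "explicitly disabled".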

Moarc avatar Aug 03 '21 18:08 Moarc

Yep, this broke a long time ago. I see that there's been some "fixes", but none of them work. Here's what I recommend to get it working.

Open lenovo_fix.py (at /usr/lib/throttled/lenovo_fix.py). Replace:

def set_hwp(performance_mode):
    if performance_mode is None:
        return

With:

def set_hwp(performance_mode):
    return

It's a pretty dumb fix; performance_mode appears to be set to something even if HWP_Mode is absent from /etc/lenovo_fix.conf. But in my opinion, that whole thing should just be wrapped in a try/except that prints a message saying HWP configuration is unsupported, instead of blowing up.
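A minimal sketch of that try/except idea, not the project's actual code: `set_hwp_safely`, the injected `read_msr` callable and the `warn` parameter are hypothetical names, and the part that would write the new HWP value is left out.

```python
def set_hwp_safely(performance_mode, read_msr, warn=print):
    """Try to configure HWP; return False instead of exiting when the MSR is unreadable."""
    if performance_mode is None:
        return False  # HWP_Mode not configured, nothing to do
    try:
        read_msr('IA32_HWP_REQUEST')
    except OSError as exc:
        warn('[W] HWP configuration is unsupported on this system ({}); skipping.'.format(exc))
        return False
    # A real implementation would go on to write the requested HWP value here.
    return True


def failing_reader(name):
    """Simulate the unreadable register reported in this thread."""
    raise OSError(5, 'Input/output error')


print(set_hwp_safely(True, failing_reader))
```

With this shape the rest of the service (power limits, undervolting) could keep running on machines where only the HWP register is blocked.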

damentz avatar Oct 14 '21 20:10 damentz

Can you please report on the latest commits? I've added a few experimental lines of code for the new testing framework. Right now it is limited to HWP and UNDERVOLTING.

erpalma avatar Oct 30 '21 10:10 erpalma

> Yep, this broke a long time ago. I see that there's been some "fixes", but none of them work. Here's what I recommend to get it working.
>
> Open lenovo_fix.py, (in /usr/lib/throttled/lenovo_fix.py). Replace:
>
> def set_hwp(performance_mode):
>     if performance_mode is None:
>         return
>
> With:
>
> def set_hwp(performance_mode):
>     return
>
> It's a pretty dumb fix, performance_mode appears to be set to something even if HWP_Mode is absent from /etc/lenovo_fix.conf. But in my opinion, that whole thing just should be try/catched with a message saying HWP configuration is unsupported instead of blowing up.

Can confirm this works on Arch Linux (latest Zen kernel) with an i3 5010U.

nikp123 avatar Oct 30 '21 21:10 nikp123