Add function to identify CPU temperature for unknown sensors

Open bastian-src opened this issue 3 years ago • 17 comments

With my setup, the core temperature could not be read correctly, which leads to NA being shown as the CPU temperature. I found this discussion of yours and tried to print my temperature via psutil. The output is the following:

>>> ps.sensors_temperatures()
{'k10temp': [shwtemp(label='', current=22.25, high=70.0, critical=90.0)], 
 'f71889a': [shwtemp(label='', current=35.0, high=85.0, critical=100.0),
             shwtemp(label='', current=41.0, high=85.0, critical=107.0), 
             shwtemp(label='', current=32.0, high=70.0, critical=85.0)]}

As you can find here, k10temp does not return the real temperature for me, but rather a relative value used to control fan speed. Accordingly, I figured out that 'f71889a'[0] is my "real" CPU temperature, which comes from the mainboard. Since 'f71889a' is a completely shitty identifier to search for (and I'm quite certain that it differs when you have a different CPU/mainboard), I added a function which searches for the same identifier in ps.sensors_fans(), because usually your CPU also has a fan with the same identifier as the temperature sensor.

I know this is kind of a hacky way which may not return the correct core temperature in every case, so I can understand if you don't want to merge this PR. Nevertheless, I think it doesn't really hurt to use this method as a last resort if the previous identifiers 'coretemp' and 'k10temp' don't match.
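The fan-matching idea can be sketched roughly as follows. This is not the code from the PR (cpuTempByFanMatching and helperMatchFirstKey are not shown in this thread); the function name, the stand-in tuples, and the fan reading are assumptions for illustration. Only the temperature values are taken from the output above.

```python
from collections import namedtuple

# Stand-ins for psutil's shwtemp and sfan tuples, so the sketch runs without psutil.
Temp = namedtuple("Temp", "label current high critical")
Fan = namedtuple("Fan", "label current")


def cpu_temp_by_fan_matching(temps, fans):
    """Return the first reading of the first chip that also reports a fan.

    Returns None when either mapping is missing or empty, or when no chip
    name appears in both mappings.
    """
    if not temps or not fans:
        return None
    for chip, readings in temps.items():
        if chip in fans and readings:
            return readings[0].current
    return None


# Temperature data as printed above; the fan entry is a hypothetical example.
temps = {
    "k10temp": [Temp("", 22.25, 70.0, 90.0)],
    "f71889a": [
        Temp("", 35.0, 85.0, 100.0),
        Temp("", 41.0, 85.0, 107.0),
        Temp("", 32.0, 70.0, 85.0),
    ],
}
fans = {"f71889a": [Fan("", 1200)]}

print(cpu_temp_by_fan_matching(temps, fans))
```

On a real system the two arguments would come from psutil.sensors_temperatures() and psutil.sensors_fans(). The guard at the top keeps the function from raising when no fan data is reported.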

So, here is my setup:

                                     ......            user@rechenknecht 
     .,cdxxxoc,.               .:kKMMMNWMMMNk:.        ------------------ 
    cKMMN0OOOKWMMXo. ;        ;0MWk:.      .:OMMk.     OS: openSUSE Tumbleweed x86_64 
  ;WMK;.       .lKMMNM,     :NMK,             .OMW;    Host: MSI 990FXA-GD65 (MS-7640 3.0) 
 cMW;            'WMMMN   ,XMK,                 oMM'   Kernel: 5.12.0-1-default 
.MMc               ..;l. xMN:                    KM0   Uptime: 1 hour, 36 mins 
'MM.                   'NMO                      oMM   Packages: 3201 (rpm), 9 (flatpak) 
.MM,                 .kMMl                       xMN   Shell: bash 5.1.4 
 KM0               .kMM0. .dl:,..               .WMd   Resolution: 1920x1080, 1920x1080 
 .XM0.           ,OMMK,    OMMMK.              .XMK    DE: GNOME 40.0 
   oWMO:.    .;xNMMk,       NNNMKl.          .xWMx     WM: Mutter 
     :ONMMNXMMMKx;          .  ,xNMWKkxllox0NMWk,      WM Theme: Adwaita 
         .....                    .:dOOXXKOxl,         Theme: Adwaita [GTK2/3] 
                                                       Icons: Adwaita [GTK2/3] 
                                                       Terminal: terminator 
                                                       CPU: AMD FX-8350 (8) @ 4.000GHz 
                                                       GPU: NVIDIA GeForce GTX 1050 Ti 
                                                       Memory: 3225MiB / 15878MiB

And the changes of this PR:

  • Add function cpuTempByFanMatching to identify cpu temperature by matching cpu name and fan name
  • Add function helperMatchFirstKey to compare cpu sensor names with fan sensor names
  • Call cpuTempByFanMatching and add result into temperature label

bastian-src avatar May 02 '21 08:05 bastian-src

Hi, I understand what you have done, but there is a catch with this approach that undermines the whole idea. The API psutil.sensors_fans() doesn't seem to work for everyone; even on my setup (Pop!_OS 20.10, MSI GL63 8RC) psutil.sensors_fans() returns None, so the matching will fail there. We need to look at some other approach in the future. Nevertheless, thank you; I will see if I can do anything about it.

KrispyCamel4u avatar May 02 '21 08:05 KrispyCamel4u

Oh okay, that sucks. But in case psutil.sensors_fans() returns None, my implementation should just fall into the except branch and not crash the application. So maybe it's still a good approach to identify the CPU temp on setups like mine.

Otherwise a different library could be a good way or using the lm_sensors system command (although I think it's also not so nice to update the CPU temp every time via a system call).

bastian-src avatar May 02 '21 08:05 bastian-src

What about checking which of ps.sensors_temperatures() returns multiple shwtemp objects? CPUs usually have multiple temperature sensors. Ofc, it would be more of a hack than a confident way to determine the CPU temperature.
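A minimal sketch of that heuristic, assuming the mapping has the same shape as the ps.sensors_temperatures() output shown earlier. The function name is hypothetical, and plain tuples stand in for shwtemp objects.

```python
def chips_with_multiple_readings(temps):
    """Return the names of chips that report more than one temperature reading."""
    return [chip for chip, readings in temps.items() if len(readings) > 1]


# Same shape as the output above: (label, current, high, critical).
temps = {
    "k10temp": [("", 22.25, 70.0, 90.0)],
    "f71889a": [
        ("", 35.0, 85.0, 100.0),
        ("", 41.0, 85.0, 107.0),
        ("", 32.0, 70.0, 85.0),
    ],
}

print(chips_with_multiple_readings(temps))
```

The function returns a list because several chips can expose more than one reading, so it narrows the candidates rather than picking a single CPU sensor.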

bastian-src avatar May 02 '21 09:05 bastian-src

As you can see from #22, many identifiers result in multiple shwtemp objects. I was thinking of working the way lm_sensors works, i.e., finding out how lm_sensors gets all of its data by checking its source code. If you could check its source code, that would be a lot of help.

KrispyCamel4u avatar May 02 '21 09:05 KrispyCamel4u

What do you think about the following?

A. Read /sys/..

  • Make system call
  • Read /sys/class/thermal/thermal_zone0/temp

B. Use PySensors (Not actively maintained)

  • Use this easy-to-implement Python library which is available via pip
  • Uses lm_sensors data
  • Code link is not available anymore
  • Does not seem to be maintained anymore

C. Use PySensors (Actively maintained)

  • Use this not-so-easy-to-implement Python module which is not available via pip and has the same naming as the previous
  • Uses lm_sensors data as well
  • Code is still available and seems to be maintained actively on GitHub

I think option B is the easiest way to get reliable results. We just have to test whether it works fine. Option C would be cool since it is still maintained, but it's far more difficult to implement since we have to build it on our own.
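For comparison, option A needs no extra library. Below is a minimal sketch, assuming the zone directory contains the usual temp file (millidegrees Celsius) and type file; the function name is hypothetical.

```python
from pathlib import Path


def read_thermal_zone(zone_dir):
    """Return (zone type, temperature in degrees Celsius), or None if unreadable.

    The kernel reports the value in the temp file in millidegrees Celsius.
    """
    zone = Path(zone_dir)
    try:
        millidegrees = int((zone / "temp").read_text().strip())
        zone_type = (zone / "type").read_text().strip()
    except (OSError, ValueError):
        # Directory or file missing, unreadable, or not a number.
        return None
    return zone_type, millidegrees / 1000.0


if __name__ == "__main__":
    print(read_thermal_zone("/sys/class/thermal/thermal_zone0"))
```

Reading the type file alongside the value matters because thermal_zone0 is not guaranteed to be the CPU.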

bastian-src avatar May 03 '21 21:05 bastian-src

Can you share what you have in the directory /sys/class/thermal/, and also the contents of thermal_zone_x/temp and thermal_zone_x/type? Since Option B is not maintained and even its source code is not accessible, I don't think it is a good choice. Option C is a little difficult, but we have access to the source code and there are only 4-5 files to search to get the whole idea.

I would like to go with option A for the short term if we can find some similarities between systems and if it is easy to get the temperature. As for C, that might be a long-term solution. Nevertheless, I have the following in /sys/class/thermal/: image

Also, can you run the sample code here and share the output?

KrispyCamel4u avatar May 04 '21 12:05 KrispyCamel4u

Unfortunately, I tested the /sys/class/thermal part on my notebook and not on my PC in the first place. So, when I have a look at the directory on my PC, it just gives me cooling_deviceX instead of thermal_zone: image

None of them give me any temperature information:

image

According to this conversation, another approach is /sys/class/hwmon/hwmonX. This one gives me three devices:

image

On both of my systems (notebook and PC), the platform-symlink holds the CPU temperature. For my notebook, it is directly: /sys/class/hwmon/hwmon4/temp1_input. On my PC, I have to use the following: /sys/class/hwmon/hwmon2/device/temp1_input. According to this, one could search for the symlink in /sys/class/hwmon/ which goes to platform and check whether it provides directly temp1_input or via device/temp1_input.
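The search described here could look roughly like the sketch below. It assumes temp1_input holds millidegrees Celsius (the usual hwmon convention) and that a "platform" component in the resolved symlink path marks the CPU-related device, as observed on the two systems above. The function name is hypothetical.

```python
import os
from pathlib import Path


def find_platform_cpu_temp(hwmon_root="/sys/class/hwmon"):
    """Return the temperature in degrees Celsius of the first hwmon entry
    whose resolved path goes through 'platform', or None if none is found."""
    root = Path(hwmon_root)
    if not root.is_dir():
        return None
    for entry in sorted(root.iterdir()):
        # Entries in /sys/class/hwmon are symlinks; resolve to the real device path.
        target = Path(os.path.realpath(entry))
        if "platform" not in target.parts:
            continue
        # Try the direct file first, then the device/ subdirectory.
        for candidate in (entry / "temp1_input", entry / "device" / "temp1_input"):
            try:
                return int(candidate.read_text().strip()) / 1000.0
            except (OSError, ValueError):
                continue
    return None


if __name__ == "__main__":
    print(find_platform_cpu_temp())
```

Taking the root directory as a parameter keeps the function testable against a fake directory tree.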

Still, idk how reliable this is. Nevertheless, the PySensors example code leads to the following output on my system:

image

bastian-src avatar May 05 '21 14:05 bastian-src

@KrispyCamel4u what do you think about the hwmon approach?

bastian-src avatar May 11 '21 11:05 bastian-src

This one does provide us with temps. But just to be sure, how do we determine which temperature label in the hwmon directory belongs to the CPU?

KrispyCamel4u avatar May 14 '21 05:05 KrispyCamel4u

By checking the symlink name. If it has platform in its path, it is the CPU. Take a look at this discussion.

bastian-src avatar May 14 '21 06:05 bastian-src

@KrispyCamel4u shall I provide an implementation?

bastian-src avatar May 18 '21 13:05 bastian-src

Hi, first of all, sorry for the late reply; I was busy with endterms. I checked what you told me about the platform symlink and it works. Yes, sure, you can implement that; it will be a great help. It could serve as a second preference if the already implemented code can't find the temps.

KrispyCamel4u avatar May 18 '21 14:05 KrispyCamel4u

I can totally relate, it will take some time for me to get started with the implementation since I have some upcoming deadlines as well. I will ping you when I'm done with the implementation - good luck with your endterms :v:

bastian-src avatar May 20 '21 21:05 bastian-src

Hi, are you still interested in providing the CPU temperature patch?

KrispyCamel4u avatar Sep 15 '21 05:09 KrispyCamel4u

Sure! But maybe we should revisit the approach. I found pyspectator which could be an alternative to reading from hwmon directly. What do you think about it? Does it work for your hardware?

This example worked for me:

>>> from pyspectator.processor import Cpu
>>> from time import sleep
>>> cpu = Cpu(monitoring_latency=1)
>>> with cpu:
...     for _ in range(8):
...         cpu.load, cpu.temperature
...         sleep(1.1)
...

bastian-src avatar Sep 15 '21 11:09 bastian-src

pyspectator uses these files to get temps:

files = [
    '/sys/devices/LNXSYSTM:00/LNXTHERM:00/LNXTHERM:01/thermal_zone/temp',
    '/sys/bus/acpi/devices/LNXTHERM:00/thermal_zone/temp',
    '/proc/acpi/thermal_zone/THM0/temperature',
    '/proc/acpi/thermal_zone/THRM/temperature',
    '/proc/acpi/thermal_zone/THR1/temperature'
]

I will try later this evening to see if it works for my system.

KrispyCamel4u avatar Sep 15 '21 11:09 KrispyCamel4u

Hi, I have gone through the pyspectator approach and tried it on my system. It works, but at the same time it doesn't. On my system, of the files that pyspectator looks into, only /sys/bus/acpi/devices/LNXTHERM:00/thermal_zone/temp is found. This file contains the temperature for acpitz, which is basically the label of a sensor at the CPU socket (or somewhere near it on the motherboard), not the actual CPU temperature. The temperatures shown by pyspectator and by lm-sensors/psutil differ under light load. When the CPU load rises and it gets hotter, the difference shrinks and the reading becomes almost equal to the actual CPU package temperature: the area around the socket also heats up, because the cooling system takes longer to extract a large amount of heat, so the CPU and socket temperatures converge.

If, in the worst case, we want to show this value, psutil is a better option for collecting the data.

What directories are available on your system, and can you check what all these temps correspond to?
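One way to answer that question is to dump what each hwmon device reports. This is a minimal sketch, assuming the standard hwmon layout (a name file per device, tempN_input in millidegrees Celsius, and an optional tempN_label). The function name is hypothetical, and files under a device/ subdirectory are not covered.

```python
from pathlib import Path


def describe_hwmon(hwmon_root="/sys/class/hwmon"):
    """Return a list of (hwmon directory, chip name, {label: degrees Celsius})."""
    report = []
    root = Path(hwmon_root)
    if not root.is_dir():
        return report
    for entry in sorted(root.iterdir()):
        try:
            chip = (entry / "name").read_text().strip()
        except OSError:
            chip = "unknown"
        temps = {}
        for temp_file in sorted(entry.glob("temp*_input")):
            prefix = temp_file.name[: -len("_input")]
            try:
                # Prefer the human-readable label when the driver provides one.
                label = (entry / (prefix + "_label")).read_text().strip()
            except OSError:
                label = prefix
            try:
                temps[label] = int(temp_file.read_text().strip()) / 1000.0
            except (OSError, ValueError):
                continue
        report.append((entry.name, chip, temps))
    return report


if __name__ == "__main__":
    for hwmon_dir, chip, temps in describe_hwmon():
        print(hwmon_dir, chip, temps)
```

Comparing the chip names and labels against the psutil output should show which reading is acpitz and which is the CPU package.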

KrispyCamel4u avatar Sep 16 '21 06:09 KrispyCamel4u