htop
linux: assign CPU temperatures by package/core or CCD
Closes https://github.com/htop-dev/htop/issues/806. Closes https://github.com/htop-dev/htop/issues/1048. Closes https://github.com/htop-dev/htop/pull/1176. Closes https://github.com/htop-dev/htop/issues/1335. Addresses https://github.com/htop-dev/htop/issues/879 (on Linux).
Fixed build without libsensors.
Fixed assignment on RPi.
Note that the CPUs are only parsed once; this takes <0.1 ms on my machine.
I think this is ready to go, should I squash it together sensibly?
As you like. Just took a quick peek at the list of commits and some suggest that they mostly rectify things changed in previous commits of the PR: Those are the prime subjects to squash in the PR right now. So there's not really much to squash together right now AFAICS.
@leahneukirchen Anything left to do from your side? If not please mark the PR ready for review. TIA.
@leahneukirchen Do you want to implement the three review comments from @cgzones?
I somehow dislike that the sensors are iterated twice: once in LibSensors_countCCDs() and once in LibSensors_getCPUTemperatures().
Do we need to compute the number of CCDs beforehand? What about adjusting the CPU data after iterating all sensors (and counting all CCD ones) in LibSensors_getCPUTemperatures()?
Seconded, libsensors is slow
Tested on my notebook with Alder Lake CPU... It shows temperatures for all cores at least.
Tested on an AMD EPYC 7F52 (16 core, 32 threads, 16x1 CCX), seems to work. At least plausible looking temperatures are now shown for all cores.
AFAICS this is waiting for some refactoring work, but that work is not yet fully specified/explained. @cgzones can you please elaborate on the open discussion threads?
Note that sensors are iterated twice on first start only, later only to read the temperatures. (Patches welcome, I find it hard to rewrite this to use one pass.)
If this is a one-time event it sounds like a reasonable tradeoff, compared to (probably) more complex code. I have not looked at it myself, though.
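For reference, a minimal sketch of what a one-pass variant could look like. This is not the PR's code: the `Reading` type, `MAX_CCDS`, and `assignTemperatures()` are hypothetical names, the readings are assumed to be already classified by label, and cores are assumed to be numbered contiguously within each CCD. CCD values are buffered and counted during the pass, and the CPU data is only adjusted afterwards.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical, simplified types; htop's real structures differ. */
typedef enum { READING_PACKAGE, READING_CORE, READING_CCD } ReadingKind;

typedef struct {
   ReadingKind kind;
   unsigned int index; /* core number for READING_CORE, ignored otherwise */
   double value;       /* degrees Celsius */
} Reading;

#define MAX_CCDS 64

/* One pass over the readings: per-core values are stored directly, CCD values
 * are buffered and counted, and the CCD-to-core mapping is resolved at the end.
 * Assumption: cores are numbered contiguously within each CCD.
 * Returns the number of CCD readings seen. */
static size_t assignTemperatures(const Reading* readings, size_t n, double* coreTemps, size_t coreCount) {
   double ccdTemps[MAX_CCDS];
   size_t ccdCount = 0;
   size_t coresSeen = 0;
   double package = NAN;

   assert(readings && coreTemps);
   for (size_t i = 0; i < coreCount; i++)
      coreTemps[i] = NAN;

   for (size_t i = 0; i < n; i++) {
      const Reading* r = &readings[i];
      switch (r->kind) {
      case READING_PACKAGE:
         package = r->value;
         break;
      case READING_CORE:
         if (r->index < coreCount) {
            coreTemps[r->index] = r->value;
            coresSeen++;
         }
         break;
      case READING_CCD:
         if (ccdCount < MAX_CCDS)
            ccdTemps[ccdCount++] = r->value;
         break;
      }
   }

   /* Post-pass fix-up, only needed when no per-core sensor was found:
    * spread the CCD values evenly, else fall back to the package value. */
   if (coresSeen == 0) {
      for (size_t i = 0; i < coreCount; i++)
         coreTemps[i] = (ccdCount > 0) ? ccdTemps[i * ccdCount / coreCount] : package;
   }
   return ccdCount;
}
```

With four CCD readings and eight cores, each CCD value ends up on two consecutive cores. Whether that order matches the real core numbering is exactly the open question, so treat the mapping as an assumption.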
Oh, this crashes with a segmentation fault if libsensors.so is not available...
Sigh. With Kernel 6.9 there seems to be a bug in the logic now, my i7-1355U isn't mapped correctly anymore...
With kernel 6.9.5 I get:
% pcat /sys/class/hwmon/hwmon2/*label
/sys/class/hwmon/hwmon2/temp1_label Package id 0
/sys/class/hwmon/hwmon2/temp2_label Core 0
/sys/class/hwmon/hwmon2/temp6_label Core 4
/sys/class/hwmon/hwmon2/temp10_label Core 8
/sys/class/hwmon/hwmon2/temp11_label Core 9
/sys/class/hwmon/hwmon2/temp12_label Core 10
/sys/class/hwmon/hwmon2/temp13_label Core 11
/sys/class/hwmon/hwmon2/temp14_label Core 12
/sys/class/hwmon/hwmon2/temp15_label Core 13
/sys/class/hwmon/hwmon2/temp16_label Core 14
/sys/class/hwmon/hwmon2/temp17_label Core 15
Note that the temp IDs are not sequential.
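To illustrate why the gaps matter, here is a small sketch that keys a reading on the core number in the label rather than on the tempN index. `storeCoreTemp()` is a hypothetical helper for illustration, not the function used in the fix.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper, not htop's actual code: store a reading under the core
 * number parsed from the hwmon label ("Core 8"), not under the tempN index.
 * Returns the core number, or -1 if the label is not a usable "Core N". */
static int storeCoreTemp(const char* label, double value, double* coreTemps, size_t coreCount) {
   unsigned int coreId;
   int consumed = 0;

   assert(label && coreTemps);

   /* %n records how many characters were matched, so trailing garbage
    * after the number (e.g. "Core 8x") is rejected. */
   if (sscanf(label, "Core %u%n", &coreId, &consumed) != 1 || label[consumed] != '\0')
      return -1;

   if (coreId >= coreCount)
      return -1;

   coreTemps[coreId] = value;
   return (int)coreId;
}
```

With the labels above, the value behind temp10_label ("Core 8") is stored at index 8, and "Package id 0" is rejected so it can be handled separately.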
Thanks a lot for the quick fix!
On my laptop with a Ryzen 5500U, there's only a single Tctl measurement for the CPU:
k10temp-pci-00c3
Adapter: PCI adapter
Tctl: +47.4°C
According to the kernel doc's quote of the AMD manual:
It does not represent an actual physical temperature like die or case temperature.
But it appears somewhat representative, at least here, so it might make sense to include it.
I just checked on a server with AMD EPYC 7502, which supposedly (no idea whether it's correct!) has 2 sockets x 4 CCD x 2 CCX x 4 cores = 64 cores. The sockets are exposed as separate k10temp instances:
k10temp-pci-00c3
Adapter: PCI adapter
Tctl: +38.4°C
Tccd1: +35.0°C
Tccd3: +36.8°C
Tccd5: +35.8°C
Tccd7: +35.8°C
k10temp-pci-00cb
Adapter: PCI adapter
Tctl: +39.8°C
Tccd1: +38.0°C
Tccd3: +39.0°C
Tccd5: +37.5°C
Tccd7: +40.0°C
Just having one sensor works already, no?
Depends on what you mean by "works". It shows a temperature value for each core, but I have no idea whether it matches. FWICT both sensors and their CCDs are read in whatever order sensors provides, which might or might not be related to the CPU core numbers...
I tested on an AMD EPYC 7443: when I lock processes to some cores, only those cores heat up.