cpu_temp using Chipset temperature

Open Kagukara opened this issue 1 year ago • 4 comments

Describe the bug The cpu_temp metric in the mangohud overlay is using my Chipset temperature instead of the CPU Tctl temperature.

List relevant hardware/software information

  • Linux Distribution: Archlinux
  • MangoHud version: v0.7.2-rc3-11-g31f2ca5
  • GPU: AMD Radeon RX 7900 XTX

To Reproduce Steps to reproduce the behavior:

  1. In the terminal run mangohud glxgears
  2. Open a new terminal and run sensors
  3. Check if the CPU temperature is correct

Expected behavior The cpu_temp metric in the MangoHud overlay should show the CPU temperature (Tctl), not the chipset temperature.

Screenshots capture_2024-05-10_23-09-51 capture_2024-05-10_23-16-09 capture_2024-05-10_23-16-17

Kagukara avatar May 10 '24 22:05 Kagukara

Can you get a tree of the hwmon folder for asusec?

flightlessmango avatar May 13 '24 23:05 flightlessmango

$ tree /sys/class/hwmon/hwmon4/
/sys/class/hwmon/hwmon4/
├── curr1_input
├── curr1_label
├── device -> ../../../asus-ec-sensors
├── fan1_input
├── fan1_label
├── fan2_input
├── fan2_label
├── in0_input
├── in0_label
├── name
├── power
│   ├── autosuspend_delay_ms
│   ├── control
│   ├── runtime_active_time
│   ├── runtime_status
│   └── runtime_suspended_time
├── subsystem -> ../../../../../class/hwmon
├── temp1_input
├── temp1_label
├── temp2_input
├── temp2_label
├── temp3_input
├── temp3_label
└── uevent

Let me know if this is correct.

Kagukara avatar May 13 '24 23:05 Kagukara

Yes, that's correct, thank you. Can you also get the output of awk '{print FILENAME ": " $0}' *_label in the same folder?

flightlessmango avatar May 13 '24 23:05 flightlessmango

Here you go:

$ awk '{print FILENAME ": " $0}' *_label
curr1_label: CPU
fan1_label: VRM HS
fan2_label: Chipset
in0_label: CPU Core
temp1_label: Chipset
temp2_label: T_Sensor
temp3_label: VRM

Kagukara avatar May 13 '24 23:05 Kagukara
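
For anyone reproducing this check programmatically: in the hwmon sysfs layout, each `tempN_label` file names the channel whose reading sits in the matching `tempN_input` file, reported in millidegrees Celsius. The following is a minimal C++17 sketch of that lookup. `read_labeled_temp` is a hypothetical helper written for illustration and is not MangoHud code.

```cpp
#include <filesystem>
#include <fstream>
#include <optional>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper (not part of MangoHud): returns the temperature in
// degrees Celsius for the channel whose tempN_label file contains `wanted`,
// or std::nullopt if no temperature label in the directory matches.
std::optional<double> read_labeled_temp(const fs::path& hwmon_dir,
                                        const std::string& wanted) {
    const std::string suffix = "_label";
    std::error_code ec;  // a missing directory yields an empty iteration
    for (const auto& entry : fs::directory_iterator(hwmon_dir, ec)) {
        const std::string file = entry.path().filename().string();
        // Only consider files named temp*_label.
        if (file.rfind("temp", 0) != 0 || file.size() <= suffix.size() ||
            file.compare(file.size() - suffix.size(), suffix.size(), suffix) != 0)
            continue;

        std::ifstream label_stream(entry.path());
        std::string label;
        std::getline(label_stream, label);
        if (label != wanted)
            continue;

        // temp2_label pairs with temp2_input.
        const std::string input_name =
            file.substr(0, file.size() - suffix.size()) + "_input";
        std::ifstream input_stream(hwmon_dir / input_name);
        long millidegrees = 0;
        if (input_stream >> millidegrees)
            return millidegrees / 1000.0;  // hwmon reports millidegrees Celsius
    }
    return std::nullopt;
}
```

Called on the asusec directory listed above with the label `CPU`, this would return no value, because `CPU` appears only in `curr1_label` and in none of the `temp*_label` files.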

  • Mangohud Version: v0.7.2-13-g41b8761

The cpu_temp metric is still using the chipset temperature instead of the CPU temperature.

My CPU temperature can be found here:

$ tree /sys/class/hwmon/hwmon3/
/sys/class/hwmon/hwmon3/
├── curr1_input
├── curr1_label
├── curr2_input
├── curr2_label
├── debug_data
├── device -> ../../../0000:00:18.3
├── in1_input
├── in1_label
├── in2_input
├── in2_label
├── name
├── power
│   ├── autosuspend_delay_ms
│   ├── control
│   ├── runtime_active_time
│   ├── runtime_status
│   └── runtime_suspended_time
├── power1_input
├── power1_label
├── power2_input
├── power2_label
├── subsystem -> ../../../../../class/hwmon
├── temp1_input
├── temp1_label
├── temp1_max
├── temp2_input
├── temp2_label
├── temp3_input
├── temp3_label
├── temp4_input
├── temp4_label
└── uevent
$ awk '{print FILENAME ": " $0}' *_label
curr1_label: SVI2_C_Core
curr2_label: SVI2_C_SoC
in1_label: SVI2_Core
in2_label: SVI2_SoC
power1_label: SVI2_P_Core
power2_label: SVI2_P_SoC
temp1_label: Tdie
temp2_label: Tctl
temp3_label: Tccd1
temp4_label: Tccd2

This may be because I use zenpower3-dkms for my CPU sensors. Is that something you might need to account for?

Kagukara avatar Jun 11 '24 14:06 Kagukara

I've uninstalled zenpower3-dkms and it still shows the chipset temperature for CPU temperature.


$ tree /sys/class/hwmon/hwmon3/:

/sys/class/hwmon/hwmon3/
├── device -> ../../../0000:00:18.3
├── name
├── power
│   ├── autosuspend_delay_ms
│   ├── control
│   ├── runtime_active_time
│   ├── runtime_status
│   └── runtime_suspended_time
├── subsystem -> ../../../../../class/hwmon
├── temp1_input
├── temp1_label
├── temp3_input
├── temp3_label
├── temp4_input
├── temp4_label
└── uevent

4 directories, 13 files

$ awk '{print FILENAME ": " $0}' *_label:

temp1_label: Tctl
temp3_label: Tccd1
temp4_label: Tccd2

$ tree /sys/class/hwmon/hwmon4/:

/sys/class/hwmon/hwmon4/
├── curr1_input
├── curr1_label
├── device -> ../../../asus-ec-sensors
├── fan1_input
├── fan1_label
├── fan2_input
├── fan2_label
├── in0_input
├── in0_label
├── name
├── power
│   ├── autosuspend_delay_ms
│   ├── control
│   ├── runtime_active_time
│   ├── runtime_status
│   └── runtime_suspended_time
├── subsystem -> ../../../../../class/hwmon
├── temp1_input
├── temp1_label
├── temp2_input
├── temp2_label
├── temp3_input
├── temp3_label
└── uevent

$ awk '{print FILENAME ": " $0}' *_label:

curr1_label: CPU
fan1_label: VRM HS
fan2_label: Chipset
in0_label: CPU Core
temp1_label: Chipset
temp2_label: T_Sensor
temp3_label: VRM

Not sure why it's not working: it should see the name inside hwmon3 as k10temp, and since there is no Tdie it should use Tctl.

Screenshot showing mangohud and sensors, with mangohud matching the chipset sensor for asusec-isa-0000 and not Tctl for k10temp-pci-00c3:

capture_2024-06-23_16-59-52

Kagukara avatar Jun 23 '24 15:06 Kagukara

Can you get the mangohud logs with MANGOHUD_LOG_LEVEL=debug mangohud vkcube?

flightlessmango avatar Jun 24 '24 11:06 flightlessmango

I've copied the output to a txt file, as it's too long (obnoxious) to paste into GitHub.

Here is the cpu.cpp section though for quick viewing:

[2024-06-24 23:59:14.511] [MANGOHUD] [debug] [cpu.cpp:507] hwmon: sensor name: iwlwifi_1
[2024-06-24 23:59:14.511] [MANGOHUD] [debug] [cpu.cpp:507] hwmon: sensor name: asusec
[2024-06-24 23:59:14.511] [MANGOHUD] [debug] [cpu.cpp:490] fallback cpu temp input: /sys/class/hwmon/hwmon4/temp1_input
[2024-06-24 23:59:14.511] [MANGOHUD] [debug] [cpu.cpp:539] hwmon: using input: /sys/class/hwmon/hwmon4/temp1_input
[2024-06-24 23:59:14.511] [MANGOHUD] [debug] [cpu.cpp:632] hwmon: sensor name: iwlwifi_1
[2024-06-24 23:59:14.511] [MANGOHUD] [debug] [cpu.cpp:632] hwmon: sensor name: asusec
[2024-06-24 23:59:14.511] [MANGOHUD] [debug] [cpu.cpp:632] hwmon: sensor name: nvme
[2024-06-24 23:59:14.511] [MANGOHUD] [debug] [cpu.cpp:632] hwmon: sensor name: amdgpu
[2024-06-24 23:59:14.511] [MANGOHUD] [debug] [cpu.cpp:632] hwmon: sensor name: nct6798
[2024-06-24 23:59:14.511] [MANGOHUD] [debug] [cpu.cpp:632] hwmon: sensor name: asus
[2024-06-24 23:59:14.511] [MANGOHUD] [debug] [cpu.cpp:632] hwmon: sensor name: zenpower
[2024-06-24 23:59:14.511] [MANGOHUD] [debug] [cpu.cpp:586] hwmon: using input: /sys/class/hwmon/hwmon3/power1_input
[2024-06-24 23:59:14.511] [MANGOHUD] [debug] [cpu.cpp:587] hwmon: using input: /sys/class/hwmon/hwmon3/power2_input

mangohud_log_level_debug.txt

Kagukara avatar Jun 24 '24 23:06 Kagukara

@flightlessmango I removed lines 528 to 531 (the "asusec" branch) in cpu.cpp, then built and installed MangoHud using the build.sh script. The CPU temperature now matches Tdie/Tctl in sensors.

https://github.com/flightlessmango/MangoHud/blob/2d0c0a1b3cd0a9949ac821204da61475a10218cc/src/cpu.cpp#L528-L531

Screenshot showing mangohud and sensors, with mangohud matching the Tdie/Tctl in sensors:

capture_2024-06-27_07-57-20

Here is the cpu.cpp section for MANGOHUD_LOG_LEVEL=debug mangohud vkcube:

[2024-06-27 07:49:50.913] [MANGOHUD] [debug] [cpu.cpp:507] hwmon: sensor name: iwlwifi_1
[2024-06-27 07:49:50.913] [MANGOHUD] [debug] [cpu.cpp:507] hwmon: sensor name: asusec
[2024-06-27 07:49:50.913] [MANGOHUD] [debug] [cpu.cpp:507] hwmon: sensor name: nvme
[2024-06-27 07:49:50.913] [MANGOHUD] [debug] [cpu.cpp:507] hwmon: sensor name: amdgpu
[2024-06-27 07:49:50.913] [MANGOHUD] [debug] [cpu.cpp:507] hwmon: sensor name: nct6798
[2024-06-27 07:49:50.976] [MANGOHUD] [debug] [cpu.cpp:539] hwmon: using input: /sys/class/hwmon/hwmon7/temp13_input
[2024-06-27 07:49:50.976] [MANGOHUD] [debug] [cpu.cpp:632] hwmon: sensor name: iwlwifi_1
[2024-06-27 07:49:50.976] [MANGOHUD] [debug] [cpu.cpp:632] hwmon: sensor name: asusec
[2024-06-27 07:49:50.976] [MANGOHUD] [debug] [cpu.cpp:632] hwmon: sensor name: nvme
[2024-06-27 07:49:50.976] [MANGOHUD] [debug] [cpu.cpp:632] hwmon: sensor name: amdgpu
[2024-06-27 07:49:50.976] [MANGOHUD] [debug] [cpu.cpp:632] hwmon: sensor name: nct6798
[2024-06-27 07:49:50.976] [MANGOHUD] [debug] [cpu.cpp:632] hwmon: sensor name: asus
[2024-06-27 07:49:50.976] [MANGOHUD] [debug] [cpu.cpp:632] hwmon: sensor name: zenpower
[2024-06-27 07:49:50.976] [MANGOHUD] [debug] [cpu.cpp:586] hwmon: using input: /sys/class/hwmon/hwmon3/power1_input
[2024-06-27 07:49:50.976] [MANGOHUD] [debug] [cpu.cpp:587] hwmon: using input: /sys/class/hwmon/hwmon3/power2_input

Not sure why this temporarily fixes the problem though.

Kagukara avatar Jun 27 '24 07:06 Kagukara

index 51c2570..5a360c0 100644
--- a/src/cpu.cpp
+++ b/src/cpu.cpp
@@ -526,8 +526,8 @@ bool CPUStats::GetCpuFile() {
                 break;

         } else if (name == "asusec") {
-            find_input(path, "temp", input, "CPU");
-            break;
+            if (find_input(path, "temp", input, "CPU"))
+                break;
         } else {
             path.clear();
         }

Can you try this patch?

flightlessmango avatar Jun 27 '24 10:06 flightlessmango

That worked, thank you.

Kagukara avatar Jun 27 '24 10:06 Kagukara

Fixed in 8a31b967669576268d09e8efc604108c28ab3d87

flightlessmango avatar Jun 27 '24 10:06 flightlessmango
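
A rough model of why the one-line change matters, pieced together from the debug log and the patch in this thread: with the unconditional `break`, the search stopped at asusec even though none of its temperature labels is `CPU`, and the log then shows a fallback to that directory's `temp1_input` (the chipset sensor). The sketch below is a simplified, self-contained C++ illustration. The `Hwmon` struct, `pick_cpu_temp`, `example_sensors`, the in-memory sensor list, and the label order tried for the CPU driver are assumptions for the example, not MangoHud's actual implementation.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical in-memory stand-in for one hwmon directory: the driver name
// plus {label, input file} pairs for its temperature channels.
struct Hwmon {
    std::string name;
    std::vector<std::pair<std::string, std::string>> temps;
};

// Sets `input` and returns true if a temperature channel has this label.
bool find_input(const Hwmon& hw, const std::string& label, std::string& input) {
    for (const auto& t : hw.temps) {
        if (t.first == label) {
            input = hw.name + "/" + t.second;
            return true;
        }
    }
    return false;
}

// Simplified selection loop. `patched` toggles between the old behaviour
// (always stop at asusec) and the new one (stop only if a label matched).
std::string pick_cpu_temp(const std::vector<Hwmon>& sensors, bool patched) {
    std::string input;
    const Hwmon* chosen = nullptr;
    for (const auto& hw : sensors) {
        chosen = &hw;
        if (hw.name == "k10temp" || hw.name == "zenpower") {
            if (find_input(hw, "Tdie", input) || find_input(hw, "Tctl", input))
                break;
        } else if (hw.name == "asusec") {
            const bool found = find_input(hw, "CPU", input);
            if (found || !patched)
                break;  // the old code broke out even when nothing matched
        }
        chosen = nullptr;  // no usable match here, keep searching
    }
    // Fallback: no labelled match, so take the chosen directory's first input.
    if (input.empty() && chosen != nullptr && !chosen->temps.empty())
        input = chosen->name + "/" + chosen->temps.front().second;
    return input;
}

// Sensor list loosely modelled on the labels posted in this thread.
std::vector<Hwmon> example_sensors() {
    return {
        {"iwlwifi_1", {}},
        {"asusec", {{"Chipset", "temp1_input"}, {"T_Sensor", "temp2_input"}, {"VRM", "temp3_input"}}},
        {"k10temp", {{"Tctl", "temp1_input"}, {"Tccd1", "temp3_input"}, {"Tccd2", "temp4_input"}}},
    };
}
```

In this model, `pick_cpu_temp(example_sensors(), false)` returns `asusec/temp1_input` (the chipset channel), while `pick_cpu_temp(example_sensors(), true)` continues past asusec and returns `k10temp/temp1_input` (Tctl).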