GPU power draw shows wrong numbers with 32bit applications
Describe the bug: 32-bit games/applications show GPU power draw values of 100k or more.
List relevant hardware/software information
- Arch Linux
- MangoHud version: v0.8.1-4-gcd7c6cb
- GPU: nvidia RTX 3080, 570.124.04 drivers
To Reproduce Steps to reproduce the behavior:
- Run MangoHud with any 32-bit application, for example glxgears32 for a quick test
Screenshots: 32-bit glxgears and 64-bit glxgears (images not included)
Edit: it actually shows the correct power draw for 1-2 seconds and then goes wrong.
This is Alice: Madness Returns, a game from 2011, played on Steam with Proton.
I'm using Arch Linux, with Wayland + Sway, and with the following versions of Mangohud downloaded from the official Arch repositories: extra/mangohud 0.8.1-1 [installed] multilib/lib32-mangohud 0.8.1-1 [installed]
My PC's specs are: Nvidia RTX 3050 6 GB, Intel i7-7700, 16 GB DDR4 2400 MHz
Games like Minecraft (native), Hollow Knight (native), Hades (Proton), Resident Evil 3 Remake (Proton), and Severed Steel (GOG-Lutris-Proton) all played fine, so it could be an issue related to the 32-bit build.
Have I missed a step to fix the random wattage values from the GPU?
Same issue on Fedora Wayland Nvidia RTX 3080 12GB (Driver: 570.133.07) Mangohud version: 0.8.1-2
It might be an issue with 3000 series GPUs only. I've seen that people on the 4000 series don't have this issue. So it could be some strange NVIDIA bug, but then again nvidia-smi reports everything correctly.
> It might be an issue with 3000 series GPUs only.
I have a 4070 and it is happening for me as well
diff --git a/src/nvidia.cpp b/src/nvidia.cpp
index b983656..e620387 100644
--- a/src/nvidia.cpp
+++ b/src/nvidia.cpp
@@ -118,9 +118,16 @@ void NVIDIA::get_instant_metrics_nvml(struct gpu_metrics *metrics) {
     if (params->enabled[OVERLAY_PARAM_ENABLED_gpu_power] || (logger && logger->is_active())) {
         unsigned int power, limit;
-        nvml.nvmlDeviceGetPowerUsage(device, &power);
+        nvmlReturn_t ret_power = nvml.nvmlDeviceGetPowerUsage(device, &power);
+        if (ret_power != NVML_SUCCESS) {
+            spdlog::debug("nvmlDeviceGetPowerUsage failed: {}", nvml.nvmlErrorString(ret_power));
+            power = 0;
+        } else {
+            spdlog::debug("Raw power usage (mW): {}", power);
+            metrics->powerUsage = power / 1000;
+        }
+
         nvml.nvmlDeviceGetPowerManagementLimit(device, &limit);
-        metrics->powerUsage = power / 1000;
         metrics->powerLimit = limit / 1000;
     }
Can someone get logs with this patch please?
I'm not sure what I'm doing wrong or how to get logs from this. I built lib32-mangohud with the patch and triple-checked that it's patched, but I see no change and nothing is printed to the terminal either. Should I enable some sort of hidden debug option?
edit: all i get is [2025-04-11 13:02:39.875] [MANGOHUD] [info] [gpu.cpp:98] Set renderD128 as active GPU (id=10de:2216 pci_dev=0000:0a:00.0)
i'm maybe stupid ofc and missing something super obvious
Oh yeah sorry you need to use MANGOHUD_LOG_LEVEL=debug
Ahaa.. is it because the 570.133 drivers don't seem to have a compatible 32-bit libXNVCtrl?
[2025-04-12 08:31:59.404] [MANGOHUD] [debug] [loader_nvctrl.cpp:39] Failed to open 32bit libXNVCtrl.so.0: libXNVCtrl.so.0: cannot open shared object file: No such file or directory
full log with glxgears32: https://pastebin.com/sHJVZbEQ
This looks like a driver regression to me. The API clearly states that values should be reported in mW, and these numbers don't even resemble any power value; they look random.
Try an older driver version, preferably on a spare OS, so as not to break your current one.
MangoHud can't be at fault here, because it doesn't manipulate the power values (except dividing by 1000 for unit conversion); it takes them straight from NVML and immediately stores them.
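To illustrate the unit conversion described above, here is a minimal, self-contained sketch. It is not MangoHud code: `milliwatts_to_watts`, `checked_power_watts`, and the `kMaxPlausibleWatts` ceiling are hypothetical names and values chosen for illustration. It only mirrors the `power / 1000` step from the patch and adds an assumed plausibility check that would flag readings like the ones reported in this issue.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical plausibility ceiling in watts. This is an assumption made for
// this sketch, not a MangoHud or NVML constant.
constexpr std::uint32_t kMaxPlausibleWatts = 2000;

// Same conversion as in the patch: the raw reading is in milliwatts and the
// overlay value is in watts. Integer division truncates toward zero.
constexpr std::uint32_t milliwatts_to_watts(std::uint32_t raw_mw) {
    return raw_mw / 1000;
}

// Returns the reading in watts, or std::nullopt when the converted value is
// above the assumed ceiling and therefore unlikely to be a real power draw.
constexpr std::optional<std::uint32_t> checked_power_watts(std::uint32_t raw_mw) {
    const std::uint32_t watts = milliwatts_to_watts(raw_mw);
    if (watts > kMaxPlausibleWatts) {
        return std::nullopt;
    }
    return watts;
}
```

With this sketch, a raw reading of 320000 mW converts to 320 W, while a raw value that would display as 100k W or more is rejected. Note that a ceiling check like this cannot catch readings that are wrong but small: a raw value below 1000 mW simply truncates to 0 W.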
Sadly, the 565 drivers won't compile for kernel 6.14. It's too much trouble to patch them or downgrade the kernel just to try, but yeah, I think it's a regression in the driver, as it used to work fine before. My guess is that the regression happened with the 570 drivers.
> Ahaa.. is it because 570.133 drivers don't seem to have compatible 32bit libxnvctrl
> [2025-04-12 08:31:59.404] [MANGOHUD] [debug] [loader_nvctrl.cpp:39] Failed to open 32bit libXNVCtrl.so.0: libXNVCtrl.so.0: cannot open shared object file: No such file or directory
> full log with glxgears32: https://pastebin.com/sHJVZbEQ
For good measure, try MangoHud 0.7.2 on 32-bit.
> it might be 3000 series GPUs only issue. I seen people on 4000 series don't have this issue. So it could be some strange nvidia bug, but then again nvidia-smi reports all correctly
nvidia-smi is a 64-bit application, that's probably why
> For the good measure try 0.7.2 mangohud on 32-bit
same issue with 0.7.2 so yeah probably nvidia regression or something
OK, with the 575.51.02 drivers it's still broken, but now it reports just 8 mW:
[2025-04-16 22:15:23.005] [MANGOHUD] [debug] Raw power usage (mW): 8
[2025-04-16 22:15:23.031] [MANGOHUD] [debug] Raw power usage (mW): 8
[2025-04-16 22:15:23.058] [MANGOHUD] [debug] Raw power usage (mW): 8
[2025-04-16 22:15:23.085] [MANGOHUD] [debug] Raw power usage (mW): 8
[2025-04-16 22:15:23.111] [MANGOHUD] [debug] Raw power usage (mW): 8
I guess this can be closed, maybe? Or leave it open until NVIDIA resolves it? I also reported it here: https://forums.developer.nvidia.com/t/570-release-feedback-discussion/321956/497?u=xpander
Let's leave it open until it's resolved by nvidia