MangoHud reports two GPUs on one GPU system
Do not report issue for old MangoHud versions
Describe the bug
MangoHud reports two GPUs when there is really no second GPU; only one GPU, the iGPU, is actually present.
List relevant hardware/software information
- Linux Distribution: Alpine Linux edge
- MangoHud version: v0.8.1
- GPU: Intel UHD Graphics 600
- CPU: Intel Celeron N4000 (two cores)
- Kernel: Linux 6.15-rc6
To Reproduce
Steps to reproduce the behavior:
- Compile MangoHud from the master branch at commit "amdgpu: always clear the TEMP_HOTSPOT throttling flag bit" (d297a57fc2c)
- Run MangoHud on glxgears
- Witness MangoHud hallucinating another, nonexistent GPU
Expected behavior
Only one GPU reported
Screenshots
Should be resolved here d1a7096
@flightlessmango No, it doesn't work. Still two GPUs reported.
On the other hand, should I create a new issue or keep it here? My iGPU frequency is also reported less accurately than in the previous version, and iGPU power is not reported at all (it stays at 0.0 W regardless of load). The statistics of the "second ghost GPU" are all 0.
- Regarding the two GPUs, run
  ls /sys/class/drm
- Power usage is not available because of a lack of permissions on Intel integrated graphics (see https://github.com/flightlessmango/MangoHud#metrics-support-by-gpu-vendordriver)

It works in 0.7.1 because that version uses intel_gpu_top, which is launched with the necessary permissions. Since 0.8.0 MangoHud no longer uses intel_gpu_top and instead reads all the required information itself. The problem is that some metrics are not available without root rights or certain capabilities [1].

MangoHud can gain neither root nor any capabilities because it lives entirely inside the game, so giving root rights or capabilities to MangoHud would mean giving them to the game, which is not possible.

So to get power usage back for the iGPU, MangoHud would need to either start using intel_gpu_top again or work as a separate process, which is not currently possible (it wasn't written that way).

[1] Capabilities are a fine-grained permission system that lets a program run as a normal user with some additional root rights, without gaining full root access (https://wiki.archlinux.org/title/Capabilities)
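
For anyone who wants to see which capabilities a process actually holds, here is a minimal Python sketch. It assumes a Linux /proc filesystem and that CAP_PERFMON is capability number 38 (as defined in linux/capability.h on kernels 5.8 and later). The function names are made up for illustration and are not part of MangoHud.

```python
from pathlib import Path

# Assumption: CAP_PERFMON is capability bit 38 (linux/capability.h, kernel >= 5.8).
CAP_PERFMON = 38


def parse_cap_eff(status_text: str) -> int:
    """Return the effective capability mask from /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            # The mask is printed as a hexadecimal number without a 0x prefix.
            return int(line.split()[1], 16)
    raise ValueError("no CapEff line found")


def has_capability(mask: int, cap: int) -> bool:
    """True if capability number `cap` is set in `mask`."""
    return bool((mask >> cap) & 1)


def current_process_has(cap: int) -> bool:
    """Check the calling process by reading /proc/self/status (Linux only)."""
    return has_capability(parse_cap_eff(Path("/proc/self/status").read_text()), cap)
```

Calling current_process_has(CAP_PERFMON) from a normal user shell would usually return False, which matches the point above: an unprivileged game process, and anything loaded into it, does not hold the capability.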
- The result is
~ $ ls /sys/class/drm/
card0 card0-HDMI-A-1 card1 renderD129
card0-DP-1 card0-eDP-1 renderD128 version
- About power usage, what permission does it need? My iGPU did report power usage on the older MangoHud, which used intel_gpu_top.
And what about the clock? It is now rounded to 100 MHz steps, while the older version got it down to 1 MHz steps.
- There is your problem: you have two render devices, renderD128 and renderD129. Run
  ls -l /sys/class/drm/renderD*/device/driver
- intel_gpu_top needs either root or CAP_PERFMON to work. MangoHud doesn't use intel_gpu_top anymore.

The reason the GPU frequency is rounded to the nearest 100 MHz is that Intel reports it that way. To get it down to 1 MHz you have to use Intel's debugfs interface, which requires root.
- The output is
~ $ ls /sys/class/drm/
card0 card0-HDMI-A-1 card1 renderD129
card0-DP-1 card0-eDP-1 renderD128 version
~ $ ls -l /sys/class/drm/renderD*/device/driver
lrwxrwxrwx 1 root root 0 May 30 17:40 /sys/class/drm/renderD128/device/driver -> ../../../bus/pci/drivers/i915
~ $
- So the 100 MHz interval is intended by Intel? Also, the part about 1 MHz needing root wasn't entirely true, at least on my system, because intel_gpu_top works fine for me without root.
- Run
  ls -l /sys/class/drm/renderD*/
- About 100 MHz, I guess it is intended so as not to give out too much info, because Intel's Linux devs consider a lot of things an "information leak", hence requiring root everywhere (why they apply this approach only to Linux and not Windows is a mystery to me).

Is your intel_gpu_top setcapped? Run
getcap $(which intel_gpu_top)
Here are the results for the first and second commands:
~/MyPullRequests/MangoHud $ ls -l /sys/class/drm/renderD*/
/sys/class/drm/renderD128/:
total 0
-r--r--r-- 1 root root 4096 May 31 09:46 dev
lrwxrwxrwx 1 root root 0 May 31 09:43 device -> ../../../0000:00:02.0
drwxr-xr-x 2 root root 0 May 31 09:46 power
lrwxrwxrwx 1 root root 0 May 31 09:43 subsystem -> ../../../../../class/drm
-rw-r--r-- 1 root root 4096 May 31 09:43 uevent
/sys/class/drm/renderD129/:
total 0
-r--r--r-- 1 root root 4096 May 31 09:46 dev
lrwxrwxrwx 1 root root 0 May 31 09:43 device -> ../../../vgem
drwxr-xr-x 2 root root 0 May 31 09:46 power
lrwxrwxrwx 1 root root 0 May 31 09:43 subsystem -> ../../../../../class/drm
-rw-r--r-- 1 root root 4096 May 31 09:43 uevent
~/MyPullRequests/MangoHud $ getcap "$(which intel_gpu_top)"
~/MyPullRequests/MangoHud $
VGEM is the Virtual GEM provider and has been around for a while as a minimal non-hardware backed Graphics Execution Manager (GEM) memory management service. It's used by LLVMpipe and other non-native 3D driver scenarios for buffer sharing. VGEM is good for improved software rasterizer performance and has been part of the mainline kernel for the better part of a decade.
This is your second "gpu"
Run cat /proc/sys/kernel/perf_event_paranoid, as I found out that this setting also allows intel_gpu_top to run without root.
Okay, so to run intel_gpu_top on Linux Mint without root, I need:
- sudo setcap cap_perfmon=+ep $(which intel_gpu_top)
- echo 3 | sudo tee /proc/sys/kernel/perf_event_paranoid
okay... I have
$ cat /proc/sys/kernel/perf_event_paranoid
0
$
Oh, that makes sense... so MangoHud now showing two GPUs instead of one is related to the intel_gpu_top changes (because it accesses sysfs directly)?
In your case you have access to almost all perf events, which may be why intel_gpu_top works without root for you.
But that is not the default value on almost any distro.
Regarding the second "gpu", I'll add a check to skip any renderD* device that MangoHud doesn't support.
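
To illustrate what such a check could look like, here is a rough Python sketch. It is not MangoHud's actual implementation. It assumes that a render node counts as supported only if its device/driver symlink resolves to a known driver name, and the SUPPORTED_DRIVERS set is a hypothetical allow-list. In the output above only renderD128 has a device/driver link (pointing to i915), so a vgem node like renderD129 would be skipped.

```python
from pathlib import Path

# Hypothetical allow-list; the set of drivers MangoHud really supports may differ.
SUPPORTED_DRIVERS = {"amdgpu", "i915", "xe"}


def supported_render_nodes(drm_root="/sys/class/drm", supported=SUPPORTED_DRIVERS):
    """Return (node name, driver name) pairs for render nodes with a supported driver."""
    found = []
    for node in sorted(Path(drm_root).glob("renderD*")):
        driver_link = node / "device" / "driver"
        # Virtual devices such as vgem may expose no driver link at all.
        if not driver_link.exists():
            continue
        # The link target's last path component is the kernel driver name.
        driver = driver_link.resolve().name
        if driver in supported:
            found.append((node.name, driver))
    return found
```

On the system in this issue, supported_render_nodes() would be expected to list only renderD128, but that depends on the sysfs layout matching the assumption above.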
That might explain it. I set it to 0 so I can use 'perf record' to profile my projects unprivileged, and didn't expect it to affect i915 GPU stuff.
Actually there are two ways to launch intel_gpu_top without root:
- sudo setcap cap_perfmon=+ep $(which intel_gpu_top)
  echo 3 | sudo tee /proc/sys/kernel/perf_event_paranoid
- echo 0 | sudo tee /proc/sys/kernel/perf_event_paranoid
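
The two ways above can be summed up in a small Python sketch. It rests on an assumption that is not confirmed in this thread: that intel_gpu_top's counters are system-wide perf events, which the kernel allows either with CAP_PERFMON or when perf_event_paranoid is 0 or lower. The function names are hypothetical.

```python
from pathlib import Path

PARANOID_PATH = Path("/proc/sys/kernel/perf_event_paranoid")


def read_paranoid_level(path=PARANOID_PATH) -> int:
    """Read the current perf_event_paranoid value (Linux only)."""
    return int(Path(path).read_text().strip())


def system_wide_perf_allowed(paranoid: int, has_cap_perfmon: bool) -> bool:
    """Assumed rule: system-wide perf events need CAP_PERFMON or paranoid <= 0."""
    return has_cap_perfmon or paranoid <= 0
```

Under that assumption, a setcapped intel_gpu_top works even with the value set to 3, while an unprivileged one needs the value lowered to 0, which is what the two list items describe.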
okie!