Missing support for reporting Intel GPU memory, power, fan and temperature

Open ich777 opened this issue 2 years ago • 14 comments

Hi, I'm trying to build NVTOP for Slackware, but after installing the compiled version it shows this message:

This version of Nvtop is missing support for reporting Intel GPU memory, power,
fan and temperature

                            <Don't Show Again> <Ok>
         Press Enter to select, arrows ">" and "<" to switch options

This is the output from cmake:

cmake .. -DNVIDIA_SUPPORT=ON -DAMDGPU_SUPPORT=ON -DINTEL_SUPPORT=ON -DCMAKE_INSTALL_PREFIX=/usr
-- The C compiler identification is GNU 11.2.0
-- The CXX compiler identification is GNU 11.2.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Setting build type to 'Release' as none was specified.
-- Looking for cbreak in /usr/lib64/libncursesw.so
-- Looking for cbreak in /usr/lib64/libncursesw.so - found
-- Found Curses: /usr/lib64/libncursesw.so  
-- Performing Test HAS_REALLOCARRAY
-- Performing Test HAS_REALLOCARRAY - Success
-- Found UDev: /usr/lib64/libudev.so (found version "243") 
-- Libudev stable: TRUE
-- Could NOT find Systemd (missing: SYSTEMD_LIBRARY SYSTEMD_INCLUDE_DIR) (found version "")
-- Found PkgConfig: /usr/bin/pkg-config (found version "0.29.2") 
-- Found Libdrm: /usr/lib64/libdrm.so (found version "2.4.109") 
-- Found libdrm; Enabling AMDGPU support
-- Performing Test compiler_has-Wall
-- Performing Test compiler_has-Wall - Success
-- Performing Test compiler_has-Wextra
-- Performing Test compiler_has-Wextra - Success
-- Performing Test compiler_has-Waddress
-- Performing Test compiler_has-Waddress - Success
-- Performing Test compiler_has-Waggressive-loop-optimizations
-- Performing Test compiler_has-Waggressive-loop-optimizations - Success
-- Performing Test compiler_has-Wbad-function-cast
-- Performing Test compiler_has-Wbad-function-cast - Success
-- Performing Test compiler_has-Wmissing-declarations
-- Performing Test compiler_has-Wmissing-declarations - Success
-- Performing Test compiler_has-Wmissing-parameter-type
-- Performing Test compiler_has-Wmissing-parameter-type - Success
-- Performing Test compiler_has-Wmissing-prototypes
-- Performing Test compiler_has-Wmissing-prototypes - Success
-- Performing Test compiler_has-Wnested-externs
-- Performing Test compiler_has-Wnested-externs - Success
-- Performing Test compiler_has-Wold-style-declaration
-- Performing Test compiler_has-Wold-style-declaration - Success
-- Performing Test compiler_has-Wold-style-definition
-- Performing Test compiler_has-Wold-style-definition - Success
-- Performing Test compiler_has-Wstrict-prototypes
-- Performing Test compiler_has-Wstrict-prototypes - Success
-- Performing Test compiler_has-Wpointer-sign
-- Performing Test compiler_has-Wpointer-sign - Success
-- Performing Test compiler_has-Wdouble-promotion
-- Performing Test compiler_has-Wdouble-promotion - Success
-- Performing Test compiler_has-Wuninitialized
-- Performing Test compiler_has-Wuninitialized - Success
-- Performing Test compiler_has-Winit-self
-- Performing Test compiler_has-Winit-self - Success
-- Performing Test compiler_has-Wstrict-aliasing
-- Performing Test compiler_has-Wstrict-aliasing - Success
-- Performing Test compiler_has-Wsuggest-attribute-const
-- Performing Test compiler_has-Wsuggest-attribute-const - Success
-- Performing Test compiler_has-Wtrampolines
-- Performing Test compiler_has-Wtrampolines - Success
-- Performing Test compiler_has-Wfloat-equal
-- Performing Test compiler_has-Wfloat-equal - Success
-- Performing Test compiler_has-Wshadow
-- Performing Test compiler_has-Wshadow - Success
-- Performing Test compiler_has-Wunsafe-loop-optimizations
-- Performing Test compiler_has-Wunsafe-loop-optimizations - Success
-- Performing Test compiler_has-Wfloat-conversion
-- Performing Test compiler_has-Wfloat-conversion - Success
-- Performing Test compiler_has-Wlogical-op
-- Performing Test compiler_has-Wlogical-op - Success
-- Performing Test compiler_has-Wnormalized
-- Performing Test compiler_has-Wnormalized - Success
-- Performing Test compiler_has-Wdisabled-optimization
-- Performing Test compiler_has-Wdisabled-optimization - Success
-- Performing Test compiler_has-Whsa
-- Performing Test compiler_has-Whsa - Success
-- Performing Test compiler_has-Wunused-result
-- Performing Test compiler_has-Wunused-result - Success
-- Performing Test compiler_has-Werror-implicit-function-declaration
-- Performing Test compiler_has-Werror-implicit-function-declaration - Success
-- Performing Test compiler_has-Wformat
-- Performing Test compiler_has-Wformat - Success
-- Performing Test compiler_has-Wformat-security
-- Performing Test compiler_has-Wformat-security - Success
-- Performing Test linker_has-Wl_-z_relro
-- Performing Test linker_has-Wl_-z_relro - Success
-- Could NOT find GTest (missing: GTEST_LIBRARY GTEST_INCLUDE_DIR GTEST_MAIN_LIBRARY) 
-- Configuring done
-- Generating done

I'm on kernel version 6.1.12. Am I missing something obvious?

Cheers

ich777 avatar Mar 06 '23 08:03 ich777

Same error here, built from source. GPU: Intel TigerLake-H GT1 [UHD Graphics] with the modesetting driver.

Fijxu avatar Mar 12 '23 01:03 Fijxu

Try running it with sudo; that seems to work for me. When launched without sudo, it says it doesn't have support for it.

pbanj avatar Apr 03 '23 01:04 pbanj

I'm already root so sudo wouldn't do much.

Are you sure you haven't already set it to not display the message?

I already get output, but the message is what made me ask. [screenshot]

ich777 avatar Apr 03 '23 06:04 ich777

I'm seeing this on the Arch package as well as on the AppImage so I doubt it's a build issue. Are Intel cards just not supported in general, or just particular models? I have an Intel Arc A770 for the record.

pepijndevos avatar Apr 09 '23 10:04 pepijndevos

Hello. Yes, this information was not exposed by the driver when I implemented the Intel support. I'll look at the state of the current Linux driver to see whether the patches got mainlined, and add support for it.

Syllo avatar Apr 12 '23 10:04 Syllo

@Syllo A little OT but I completely forgot to mention that I've created a plugin for Unraid for nvtop over here.

I hope that's okay for you. :) It was downloaded about 7300 times so far, so this means that about 7300 people are using it on Unraid.

ich777 avatar Apr 13 '23 13:04 ich777

I can confirm this on Arch with the 6.1.24-1-lts kernel and the modesetting driver on a TigerLake-H GT1 iGPU. The only stat NVTOP can display is the clock rate, while intel_gpu_top displays other stats as well (see the attached screenshot). Running NVTOP with sudo yields the same results.

K4ktus123 avatar Apr 15 '23 14:04 K4ktus123

All right. While browsing the kernel code I uncovered two pieces of information:

  1. The newest Intel GPUs have a "Graphics micro (μ) Controller (GuC)". If this GuC is active, you might not see some media-related workload usage in nvtop, since the reporting is not implemented/activated in the kernel even as of 6.2.
  2. Hardware monitoring info is only available for dedicated graphics cards and only exposes power-related values (voltage, power, energy, current).

If someone with a discrete Intel GPU could dump what is in the hwmon folder under /sys/bus/pci/devices/<pci addr>/drm/card1/hwmon/, where "pci addr" can be retrieved with lspci | grep VGA, I can at least implement power draw for these.

Syllo avatar Apr 16 '23 11:04 Syllo

$ find /sys/bus/pci/devices/0000\:03\:00.0/drm/card0/device/hwmon/ -type f
/sys/bus/pci/devices/0000:03:00.0/drm/card0/device/hwmon/hwmon2/uevent
/sys/bus/pci/devices/0000:03:00.0/drm/card0/device/hwmon/hwmon2/power1_max_interval
/sys/bus/pci/devices/0000:03:00.0/drm/card0/device/hwmon/hwmon2/power1_max
/sys/bus/pci/devices/0000:03:00.0/drm/card0/device/hwmon/hwmon2/energy1_input
/sys/bus/pci/devices/0000:03:00.0/drm/card0/device/hwmon/hwmon2/in0_input
/sys/bus/pci/devices/0000:03:00.0/drm/card0/device/hwmon/hwmon2/power/runtime_active_time
/sys/bus/pci/devices/0000:03:00.0/drm/card0/device/hwmon/hwmon2/power/runtime_status
/sys/bus/pci/devices/0000:03:00.0/drm/card0/device/hwmon/hwmon2/power/autosuspend_delay_ms
/sys/bus/pci/devices/0000:03:00.0/drm/card0/device/hwmon/hwmon2/power/runtime_suspended_time
/sys/bus/pci/devices/0000:03:00.0/drm/card0/device/hwmon/hwmon2/power/control
/sys/bus/pci/devices/0000:03:00.0/drm/card0/device/hwmon/hwmon2/power1_rated_max
/sys/bus/pci/devices/0000:03:00.0/drm/card0/device/hwmon/hwmon2/name
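
The listing includes energy1_input, which under the kernel's hwmon sysfs conventions is a cumulative energy counter in microjoules. As a minimal sketch (not nvtop's implementation), average power draw could be estimated by sampling that counter twice. The function names below are hypothetical, and counter wraparound is ignored.

```python
import time
from pathlib import Path


def read_int(path: Path) -> int:
    """Read one integer value from a sysfs-style attribute file."""
    return int(path.read_text().strip())


def power_from_samples(e0_uj: int, e1_uj: int, dt_s: float) -> float:
    """Average power in watts between two energy readings in microjoules."""
    if dt_s <= 0:
        raise ValueError("dt_s must be positive")
    # 1 W = 1 J/s, and the counter is in microjoules (assumption: no wraparound).
    return (e1_uj - e0_uj) / 1_000_000 / dt_s


def sample_power_watts(hwmon_dir: Path, interval_s: float = 1.0) -> float:
    """Sample energy1_input twice and return the average power in watts."""
    energy_file = hwmon_dir / "energy1_input"
    e0, t0 = read_int(energy_file), time.monotonic()
    time.sleep(interval_s)
    e1, t1 = read_int(energy_file), time.monotonic()
    return power_from_samples(e0, e1, t1 - t0)
```

For the card above, calling sample_power_watts(Path("/sys/bus/pci/devices/0000:03:00.0/drm/card0/device/hwmon/hwmon2")) would block for about one second and return the average draw over that interval.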

pepijndevos avatar Apr 16 '23 14:04 pepijndevos

I don't know if this helps, but I used lsof -p <intel_gpu_top pid> to see which files intel_gpu_top accesses, and got this: [image] That directory contains files like these: [image]

Edit: Content of some of the files: [image]
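
The same kind of inspection can be done without lsof by reading the symlinks under /proc/<pid>/fd. This is only a rough sketch of that idea: open_files and its proc_root parameter are hypothetical names, and reading another user's process normally requires root.

```python
import os
from pathlib import Path


def open_files(pid: int, proc_root: str = "/proc") -> dict:
    """Map each open file descriptor of `pid` to the path it points at."""
    fd_dir = Path(proc_root) / str(pid) / "fd"
    targets = {}
    for entry in fd_dir.iterdir():
        try:
            targets[int(entry.name)] = os.readlink(entry)
        except OSError:
            # The descriptor may have been closed between listing and reading.
            continue
    return targets
```

Filtering the returned paths for entries under /dev/dri or /sys would narrow the result to the GPU-related files.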

K4zoku avatar Apr 16 '23 16:04 K4zoku

Hmmm, for me intel_gpu_top also doesn't really show much of interest for my Arc A770 GPU, though:

intel-gpu-top: Intel Dg2 (Gen12) @ /dev/dri/card0
      0/   0 MHz; 100% RC6;        0 irqs/s

         ENGINES     BUSY                     MI_SEMA MI_WAIT
       Render/3D    0.00% |                 |      0%      0%
         Blitter    0.00% |                 |      0%      0%
           Video    0.00% |                 |      0%      0%
    VideoEnhance    0.00% |                 |      0%      0%
       [unknown]    0.00% |                 |      0%      0%

pepijndevos avatar Apr 16 '23 17:04 pepijndevos

> Hmmm, for me intel_gpu_top also doesn't really show much of interest for my Arc A770 GPU, though

Because it seems nothing is using your GPU.

ich777 avatar Apr 17 '23 09:04 ich777

Of note, starting with the 6.8 kernel, Intel GPUs now expose memory usage via fdinfo. I've validated this with my embedded Intel GPU. With this in place, nvtop should be able to expose per-process GPU memory usage now.

chealy avatar Apr 06 '24 18:04 chealy

> Of note, starting with the 6.8 kernel, Intel GPUs now expose memory usage via fdinfo. I've validated this with my embedded Intel GPU. With this in place, nvtop should be able to expose per-process GPU memory usage now.

May I ask how to view the memory information?

cyear avatar Jul 24 '24 16:07 cyear
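
On viewing the memory information: the fdinfo data mentioned above lives in /proc/<pid>/fdinfo/<fd> for each file descriptor a process holds on a /dev/dri device. The sketch below parses the memory lines out of one such file. It assumes the key layout from the kernel's DRM client usage stats documentation (for example drm-total-<region> and drm-resident-<region>, with an optional KiB or MiB suffix); the exact keys and region names a given driver and kernel version emit may differ, and parse_drm_memory is a hypothetical helper, not part of nvtop.

```python
import re
from collections import defaultdict

# Assumed key layout: "drm-<kind>-<region>: <value> [KiB|MiB]".
_MEM_LINE = re.compile(
    r"^drm-(total|resident|shared|active|purgeable|memory)-([^:\s]+):"
    r"\s*(\d+)(?:\s*(KiB|MiB))?\s*$"
)
_UNIT = {None: 1, "KiB": 1024, "MiB": 1024 * 1024}


def parse_drm_memory(fdinfo_text: str) -> dict:
    """Return {region: {kind: bytes}} for the DRM memory keys in one fdinfo file."""
    stats = defaultdict(dict)
    for line in fdinfo_text.splitlines():
        match = _MEM_LINE.match(line.strip())
        if match:
            kind, region, value, unit = match.groups()
            stats[region][kind] = int(value) * _UNIT[unit]
    return dict(stats)
```

Passing it the text of a file such as /proc/<pid>/fdinfo/<fd> returns the byte counts per memory region; lines without a recognised memory key (engine times, driver name, and so on) are skipped. Reading fdinfo for processes owned by other users generally needs root.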