AppImage version of nvtop does not recognize GPU

Open Markus92 opened this issue 6 months ago • 4 comments

Hiya,

I downloaded the latest AppImage version of nvtop and ran it as shown below. Unfortunately, no GPUs are detected, although nvidia-smi detects them just fine.

Environment is an HPC environment running RHEL 7.9 (we have an extended support contract).

Any ideas how to further debug this issue or resolve it?

Output:

[s240394@NucleusC060 ~]$ unset CUDA_VISIBLE_DEVICES
[s240394@NucleusC060 ~]$ ./nvtop-3.2.0-x86_64.AppImage
No GPU to monitor.
[s240394@NucleusC060 ~]$ nvidia-smi
Tue Jun 24 08:28:47 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.14              Driver Version: 550.54.14      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  Tesla V100-PCIE-32GB           Off |   00000000:3B:00.0 Off |                    0 |
| N/A   27C    P0             26W /  250W |     166MiB /  32768MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A    163924      G   /usr/bin/X                                     66MiB |
|    0   N/A  N/A    164013      G   /usr/bin/gnome-shell                           98MiB |
+-----------------------------------------------------------------------------------------+

Markus92 avatar Jun 24 '25 13:06 Markus92

For NVIDIA GPUs, nvtop relies on libnvidia-ml.so being discoverable by the dynamic linker.

If it is installed in a non-standard location you can try setting LD_LIBRARY_PATH to the folder where it is installed.

For example: LD_LIBRARY_PATH=/usr/local/cuda-XX/lib64 nvtop
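To check up front whether the library is visible at all, something along these lines may help. It is only a rough sketch: `find_lib` is a hypothetical helper name, and it approximates the lookup by scanning the directories in LD_LIBRARY_PATH and then the `ldconfig -p` cache. The real dynamic linker also consults other sources (for example rpath entries), and finding the file does not prove that it loads successfully.

```shell
#!/bin/sh
# find_lib NAME: print the first path where NAME is found, searching the
# LD_LIBRARY_PATH directories first and the ldconfig cache second.
# Returns non-zero if nothing is found. (Hypothetical helper, not part of nvtop.)
find_lib() {
    lib="$1"
    old_ifs="$IFS"
    IFS=':'
    for dir in $LD_LIBRARY_PATH; do
        if [ -n "$dir" ] && [ -e "$dir/$lib" ]; then
            IFS="$old_ifs"
            echo "$dir/$lib"
            return 0
        fi
    done
    IFS="$old_ifs"
    # Cache lines look like: "libfoo.so.1 (libc6,x86-64) => /lib64/libfoo.so.1"
    ldconfig -p 2>/dev/null |
        awk -v lib="$lib" '$1 == lib { print $NF; found = 1; exit } END { exit !found }'
}

# Usage (not run here):
#   find_lib libnvidia-ml.so || echo "libnvidia-ml.so not found"
```

If the helper prints nothing, adding the directory that contains the library to LD_LIBRARY_PATH as described above is the first thing to try.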

Syllo avatar Jun 28 '25 13:06 Syllo

LD_LIBRARY_PATH is non-standard on our system, but even with the correct folder (/lib64) explicitly included, the problem persists:

[s240394@Nucleus042 ~]$ ./nvtop-3.2.0-x86_64.AppImage
No GPU to monitor.
[s240394@Nucleus042 ~]$ echo $LD_LIBRARY_PATH
/lib64:/usr/lib64:/cm/shared/apps/slurm/16.05.8/lib64/slurm:/cm/shared/apps/slurm/16.05.8/lib64
[s240394@Nucleus042 ~]$  ldconfig -p | grep nvidia-ml
        libnvidia-ml.so.1 (libc6,x86-64) => /lib64/libnvidia-ml.so.1
        libnvidia-ml.so (libc6,x86-64) => /lib64/libnvidia-ml.so
[s240394@Nucleus042 ~]$ file /lib64/libnvidia-ml.so
/lib64/libnvidia-ml.so: symbolic link to `libnvidia-ml.so.470.182.03'
[s240394@Nucleus042 ~]$ file /lib64/libnvidia-ml.so.470.182.03
/lib64/libnvidia-ml.so.470.182.03: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, stripped

This GPU node had an old K40m GPU, but even on a newer V100 node (with newer libnvidia-ml-550) I still have the same issue.

Markus92 avatar Jun 30 '25 15:06 Markus92

Same here, 3.1 works fine, but 3.2 does not find the GPU.

ntadej avatar Aug 11 '25 05:08 ntadej

3.1 gave a slightly more meaningful error:

./nvtop-x86_64.AppImage
./nvtop-x86_64.AppImage: /lib64/libc.so.6: version `GLIBC_2.26' not found (required by ./nvtop-x86_64.AppImage)

Which is reasonable given we only have glibc 2.18 on RHEL 7.
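For anyone who wants to check this on their own nodes, here is a small sketch that compares the system glibc against a required version. The helper names `glibc_version` and `version_ge` are made up for illustration, the required value 2.26 is simply taken from the error message above, and the script assumes `getconf GNU_LIBC_VERSION` is available (it is on glibc-based systems). Whether 3.2 needs the same or a newer glibc is not something this checks.

```shell
#!/bin/sh
# Print the glibc version of the running system (second field of
# "glibc X.Y" as reported by getconf); prints nothing if unavailable.
glibc_version() {
    getconf GNU_LIBC_VERSION 2>/dev/null | awk '{ print $2 }'
}

# version_ge A B: succeed if dotted numeric version A >= B.
# Components are compared numerically, so 2.9 < 2.18.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        na = split(a, x, "."); nb = split(b, y, ".")
        n = (na > nb) ? na : nb
        for (i = 1; i <= n; i++) {
            if (x[i] + 0 > y[i] + 0) exit 0
            if (x[i] + 0 < y[i] + 0) exit 1
        }
        exit 0
    }'
}

required="2.26"   # taken from the GLIBC_2.26 error message above
have="$(glibc_version)"
if version_ge "$have" "$required"; then
    echo "glibc ${have} satisfies the ${required} requirement"
else
    echo "glibc ${have:-unknown} is older than ${required}"
fi
```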

If this is expected in 3.2 as well, we can close this issue until we upgrade our cluster to RHEL 9 (scheduled for November).

Markus92 avatar Aug 25 '25 14:08 Markus92