
nvidia-smi information is available, but Triton cannot detect an available GPU device.

Open chiehpower opened this issue 4 months ago • 1 comments

Description I have a fairly new laptop with an RTX 4070, 32 GB of RAM, and 24 CPU cores. When I start the Triton server, it cannot detect the GPU device. However, when I open a shell inside that container, I can still see the GPU driver information with the nvidia-smi command.

2024-02-22_13-43

Here is the CPU information: 2024-02-22_13-43_1

The GPU driver version is 535.154.05.

Triton Information Using v23.02

Are you using the Triton container or did you build it yourself? Container.

To Reproduce Cannot reproduce.

Expected behavior Triton can detect the GPU device.

This problem seems to occur frequently on brand-new laptops. I'm wondering what other settings I should check and how I can resolve it. Your suggestions would be greatly appreciated. Thank you.

chiehpower avatar Feb 22 '24 06:02 chiehpower

Please refer to the support matrix. According to the support matrix, the 23.02 release does not support the 535 driver branch. I believe this driver is supported in 23.08 and later.
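For anyone who wants to script this check, here is a minimal sketch that reads the driver version from nvidia-smi and compares its major branch against a list you copy by hand from the support matrix. The helper names (`parse_driver_version`, `driver_branch`, `is_branch_listed`, `query_driver_version`) and the `SUPPORTED_BRANCHES` set are hypothetical and not part of Triton. The set is deliberately left empty because the right values depend on the release you run.

```python
import shutil
import subprocess

# Assumption: fill this in by hand from the support matrix entry for your
# Triton release. It is left empty here on purpose.
SUPPORTED_BRANCHES = set()


def parse_driver_version(text):
    """Turn a driver string such as '535.154.05' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def driver_branch(text):
    """Return the major branch number, e.g. 535 for '535.154.05'."""
    return parse_driver_version(text)[0]


def is_branch_listed(text, supported_branches):
    """True if the driver's major branch is in the given collection."""
    return driver_branch(text) in supported_branches


def query_driver_version():
    """Return the driver version reported by nvidia-smi, or None if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0 or not result.stdout.strip():
        return None
    # nvidia-smi prints one line per GPU; take the first one.
    return result.stdout.strip().splitlines()[0].strip()


if __name__ == "__main__":
    version = query_driver_version()
    if version is None:
        print("nvidia-smi is not available or returned no driver version")
    elif not SUPPORTED_BRANCHES:
        print(f"Driver {version}: fill in SUPPORTED_BRANCHES from the support matrix")
    elif is_branch_listed(version, SUPPORTED_BRANCHES):
        print(f"Driver {version}: branch is listed for this release")
    else:
        print(f"Driver {version}: branch is NOT listed for this release")
```

Running it both on the host and inside the container should print the same driver version, since the container uses the host driver.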

oandreeva-nv avatar Feb 24 '24 01:02 oandreeva-nv

Thank you for the information! Here is the log from the Triton v23.08 container. It seems to work fine. 2024-02-27_14-12

chiehpower avatar Feb 27 '24 08:02 chiehpower

Hi @oandreeva-nv,

After I deployed a model, I encountered this error, and then the container died. 2024-02-29_08-54

Do you have any idea about it? Thank you!

chiehpower avatar Feb 29 '24 01:02 chiehpower