
Add support for nvidia driver version 535.161.08

Open • dawndrain opened this issue 1 year ago • 2 comments

Description

I want to use gVisor with my A100 GPUs. When I follow the instructions at https://gvisor.dev/docs/user_guide/gpu/ and run:

runsc nvproxy list-supported-drivers

I see:

535.161.07
550.54.14
550.54.15
535.104.12
535.129.03
535.154.05

Unfortunately, 535.161.07 is one off from 535.161.08, the version that my GPUs are currently running. If I run:

sudo runsc --nvproxy --debug --debug-log=/tmp/runsc.log run -bundle / my_container

and cat the error log I see:

I0701 16:17:05.097581  2939814 nvproxy.go:35] NVIDIA driver version: 535.161.08
W0701 16:17:05.097621  2939814 util.go:64] FATAL ERROR: creating loader: registering filesystems: registering nvproxy driver: unsupported Nvidia driver version: 535.161.08
creating loader: registering filesystems: registering nvproxy driver: unsupported Nvidia driver version: 535.161.08
unable to read from the sync descriptor: 0, error EOF

i.e. the .08 vs. .07 difference actually matters.

Presumably the solution is similar to https://github.com/google/gvisor/pull/10181/files

Is this feature related to a specific bug?

No response

Do you have a specific solution in mind?

Presumably the solution is similar to https://github.com/google/gvisor/pull/10181/files

dawndrain • Jul 02 '24 22:07

Note that if the two driver versions are ABI-equivalent, you can set the --nvproxy-driver-version flag to the NVIDIA driver version that gVisor does support and it will override this version-detection code.

EtiennePerot • Jul 02 '24 22:07

As per https://gvisor.dev/docs/user_guide/gpu/#driver-versions, our policy is to add support only for driver versions used by COS, which is used in GKE.

We do have support for 535.161.07. Assuming no breaking changes have occurred between 535.161.07 and 535.161.08, you could try setting the runsc flag --nvproxy-driver-version=535.161.07, as Etienne suggested.

ayushr2 • Jul 03 '24 14:07
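
As a rough sketch of the workaround suggested in the comments above, the helper below filters a list of supported driver versions down to those that share the host driver's major.minor prefix (535.161 in this issue). The function name same_branch_candidates is hypothetical and not part of runsc. A matching prefix does not prove the two versions are ABI-equivalent; it only narrows down which value might be worth trying with --nvproxy-driver-version.

```shell
# Hypothetical helper, not part of runsc.
# Reads supported driver versions from stdin (one per line) and prints those
# that share the major.minor prefix of the host version given as $1.
same_branch_candidates() {
  local host="$1"
  local prefix="${host%.*}"   # e.g. 535.161.08 -> 535.161
  local v
  while IFS= read -r v; do
    if [ "${v%.*}" = "$prefix" ]; then
      printf '%s\n' "$v"
    fi
  done
  return 0
}

# Usage sketch (needs runsc on the host; commands are taken from this thread):
#   runsc nvproxy list-supported-drivers | same_branch_candidates 535.161.08
# With the list shown in this issue, the only candidate is 535.161.07, which
# could then be tried as:
#   sudo runsc --nvproxy --nvproxy-driver-version=535.161.07 \
#     --debug --debug-log=/tmp/runsc.log run -bundle / my_container
```

If the helper prints nothing, no supported version shares the host's major.minor prefix, and overriding the version is more likely to hit real ABI differences.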