ubuntu-drivers-common

A10 Suggests Invalid nvidia drivers

Open Monochromics opened this issue 1 year ago • 0 comments

Hello,

I'm still unsure whether this is an NVIDIA problem or an ubuntu-drivers-common problem, but it is creating a confusing experience for end users.

Summary

When launching A10s on Azure (NVadsA10v5-series), you receive a VM with either a partitioned or a whole A10 GPU. The Azure docs seem to imply that you should be able to use CUDA or GRID (with the included license). However, installing the 535 series (or any other series) causes the instance to fail to boot (due to nvidia-persistenced). That needs to be addressed elsewhere, but I don't believe ubuntu-drivers-common should suggest these drivers in the first place.

I tested with a 1/8th partition as well as a whole GPU. Unfortunately, I don't have bare-metal access to test there.

NVIDIA does seem to imply, though, that the following modalias is supported by the 535 CUDA driver: pci:v000010DEd00002236sv000010DEsd00001482bc03sc*i*

This is the MODALIAS of my test instance: MODALIAS=pci:v000010DEd00002236sv000010DEsd000014BBbc03sc02i00
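To make the difference between those two strings easier to see, here is a minimal Python sketch that splits each PCI modalias into its fields and reports which ones do not match. It assumes glob-style matching per field (via `fnmatchcase`); whether ubuntu-drivers-common or the driver packages match in exactly this way is an assumption on my part, and the helper names (`parse_pci_modalias`, `differing_fields`) are made up for illustration.

```python
import re
from fnmatch import fnmatchcase

# Field layout of a PCI modalias string. The class fields may be "*" in a
# pattern, so "*" is allowed there. (Illustrative regex, not taken from
# ubuntu-drivers-common.)
PCI_MODALIAS = re.compile(
    r"pci:v(?P<vendor>[0-9A-F]{8})d(?P<device>[0-9A-F]{8})"
    r"sv(?P<subvendor>[0-9A-F]{8})sd(?P<subdevice>[0-9A-F]{8})"
    r"bc(?P<base_class>[0-9A-F*]+)sc(?P<sub_class>[0-9A-F*]+)"
    r"i(?P<interface>[0-9A-F*]+)"
)


def parse_pci_modalias(alias):
    """Split a PCI modalias (or modalias pattern) into named fields."""
    match = PCI_MODALIAS.fullmatch(alias)
    if match is None:
        raise ValueError(f"not a PCI modalias: {alias!r}")
    return match.groupdict()


def differing_fields(pattern, alias):
    """Return {field: (pattern_value, alias_value)} for fields that do not match."""
    wanted = parse_pci_modalias(pattern)
    actual = parse_pci_modalias(alias)
    return {
        name: (wanted[name], actual[name])
        for name in wanted
        if not fnmatchcase(actual[name], wanted[name])
    }


# The two strings quoted in this report.
supported_pattern = "pci:v000010DEd00002236sv000010DEsd00001482bc03sc*i*"
instance_modalias = "pci:v000010DEd00002236sv000010DEsd000014BBbc03sc02i00"

print(differing_fields(supported_pattern, instance_modalias))
```

With the two strings above, the vendor, device, and subsystem vendor fields agree, and only the subsystem device differs (1482 in the pattern versus 14BB on my instance).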

Reproduction

  • Launch any NVadsA10v5-series
  • Install ubuntu-drivers-common
  • ubuntu-drivers list
  • observe incompatible suggestions
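For anyone reproducing this, the modalias of the GPU can be collected with a short Python sketch like the one below. It assumes the usual Linux sysfs layout, where each PCI device directory under `/sys/bus/pci/devices` contains a `modalias` file, and it keeps only display-class devices (base class `03`). The function name `list_display_modaliases` is hypothetical.

```python
from pathlib import Path


def list_display_modaliases(sysfs_root="/sys/bus/pci/devices"):
    """Return the modalias strings of display-class (bc03) PCI devices."""
    aliases = []
    root = Path(sysfs_root)
    if not root.is_dir():
        # Not a Linux host, or sysfs is not mounted: nothing to report.
        return aliases
    for device_dir in sorted(root.iterdir()):
        modalias_file = device_dir / "modalias"
        if not modalias_file.is_file():
            continue
        alias = modalias_file.read_text().strip()
        # Field markers are lowercase and hex digits uppercase, so "bc03"
        # can only match the base-class field.
        if "bc03" in alias:
            aliases.append(alias)
    return aliases
```

Calling `list_display_modaliases()` on the affected VM and printing the result should give the same string as the MODALIAS value quoted above, which can then be compared with what `ubuntu-drivers list` suggests.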

If this is purely an issue on NVIDIA's side, feel free to close this. We're separately raising a case with them as well. If any additional information is required, please let me know. I also have sos reports available.

Thanks!

Monochromics, Jan 26 '24 17:01