ubuntu-drivers-common

Insert a hook to check the allowing list to force install nvidia

Open os369510 opened this issue 2 years ago • 13 comments

To fix https://github.com/tseliot/ubuntu-drivers-common/issues/56

This is a first version, awaiting verification. It works well in ubuntu-drivers debug, but I still need to build and test it on different systems.

I tried three scenarios on a focal I+N laptop.

  1. The force-install list contains an older version
    {
      "devid": "0x24BA",
      "name": "TEST 24BA",
      "minimumbranch": "470",
      "features": [
        "runtimepm"
      ]
    }

The output:

$ ubuntu-drivers debug
...
Found a specific nv driver version 470 for TEST 24BA(0x24BA)
...
=== matching driver packages ===
nvidia-driver-470: installed: <none>   available: 470.103.01-0ubuntu2  [distro]  non-free  modalias: pci:v000010DEd000024BAsv0000103Csd000089C6bc03sc00i00  path: /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0  vendor: NVIDIA Corporation  support level: LTSB 
nvidia-driver-510: installed: 510.54-0ubuntu0.20.04.1   available: 510.60.02-0ubuntu1 (auto-install)  [distro]  non-free  modalias: pci:v000010DEd000024BAsv0000103Csd000089C6bc03sc00i00  path: /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0  vendor: NVIDIA Corporation  support level: PB
  2. The force-install list contains the same version as the Ubuntu archive
    {
      "devid": "0x24BA",
      "name": "TEST 24BA",
      "minimumbranch": "510",
      "features": [
        "runtimepm"
      ]
    }
$ ubuntu-drivers debug
...
Found a specific nv driver version 510 for TEST 24BA(0x24BA)
...
=== matching driver packages ===
nvidia-driver-510: installed: 510.54-0ubuntu0.20.04.1   available: 510.60.02-0ubuntu1 (auto-install)  [distro]  non-free  modalias: pci:v000010DEd000024BAsv0000103Csd000089C6bc03sc00i00  path: /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0  vendor: NVIDIA Corporation  support level: PB 
  3. The force-install list contains a non-existent version
    {
      "devid": "0x24BA",
      "name": "TEST 24BA",
      "minimumbranch": "580.1234",
      "features": [
        "runtimepm"
      ]
    }
$ ubuntu-drivers debug
...
Found a specific nv driver version 580.1234 for TEST 24BA(0x24BA)
nvidia-driver-580.1234 is not in the pool
...
=== matching driver packages ===
nvidia-driver-510: installed: 510.54-0ubuntu0.20.04.1   available: 510.60.02-0ubuntu1 (auto-install)  [distro]  non-free  modalias: pci:v000010DEd000024BAsv0000103Csd000089C6bc03sc00i00  path: /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0  vendor: NVIDIA Corporation  support level: PB

I will build a debug version in a PPA, ask some people to help test it, and post updates here.

os369510 avatar Apr 11 '22 18:04 os369510

After commit 025cce3

The test results are as follows:

    {
      "devid": "0x24BA",
      "name": "TEST 24BA",
      "branch": "580.1234",
      "features": [
        "runtimepm"
      ]
    }
…
Found a specific nv driver version 580 for TEST 24BA(0x24BA)
nvidia-driver-580 is not in the pool
Candidate version does not match 580 != 510
…
=== matching driver packages ===
nvidia-driver-510: installed: 510.54-0ubuntu0.20.04.1   available: 510.60.02-0ubuntu1 (auto-install)  [distro]  non-free  modalias: pci:v000010DEd000024BAsv0000103Csd000089C6bc03sc00i00  path: /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0  vendor: NVIDIA Corporation  support level: PB 

    {
      "devid": "0x24BA",
      "name": "TEST 24BA",
      "branch": "510.1234",
      "features": [
        "runtimepm"
      ]
    }
…
Found a specific nv driver version 510 for TEST 24BA(0x24BA)
Found runtimepm supports on 0x24BA.
…
=== matching driver packages ===
nvidia-driver-510: installed: 510.54-0ubuntu0.20.04.1   available: 510.60.02-0ubuntu1 (auto-install)  [distro]  non-free  modalias: pci:v000010DEd000024BAsv0000103Csd000089C6bc03sc00i00  path: /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0  vendor: NVIDIA Corporation  support level: PB

    {
      "devid": "0x24BA",
      "name": "TEST 24BA",
      "branch": "460",
      "features": [
        "runtimepm"
      ]
    }
…
Found a specific nv driver version 460 for TEST 24BA(0x24BA)
Candidate version does not match 460 != 510
…
=== matching driver packages ===
nvidia-driver-510: installed: 510.54-0ubuntu0.20.04.1   available: 510.60.02-0ubuntu1 (auto-install)  [distro]  non-free  modalias: pci:v000010DEd000024BAsv0000103Csd000089C6bc03sc00i00  path: /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0  vendor: NVIDIA Corporation  support level: PB 
nvidia-driver-460: installed: <none>   available: 470.103.01-0ubuntu2  [distro]  non-free  modalias: pci:v000010DEd000024BAsv0000103Csd000089C6bc03sc00i00  path: /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0  vendor: NVIDIA Corporation
…

    {
      "devid": "0x24BA",
      "name": "TEST 24BA",
      "branch": "510",
      "features": [
        "noruntimepm"
      ]
    }
…
Found a specific nv driver version 510 for TEST 24BA(0x24BA)
…
=== matching driver packages ===
nvidia-driver-510: installed: 510.54-0ubuntu0.20.04.1   available: 510.60.02-0ubuntu1 (auto-install)  [distro]  non-free  modalias: pci:v000010DEd000024BAsv0000103Csd000089C6bc03sc00i00  path: /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0  vendor: NVIDIA Corporation  support level: PB

In the normal case (file not present):

…
DEBUG:root:package_get_nv_allowing_driver(): Cannot read /etc/customized_supported_gpus.json, [Errno 2] No such file or directory: '/etc/customized_supported_gpus.json'
…
=== matching driver packages ===
nvidia-driver-510: installed: 510.54-0ubuntu0.20.04.1   available: 510.60.02-0ubuntu1 (auto-install)  [distro]  non-free  modalias: pci:v000010DEd000024BAsv0000103Csd000089C6bc03sc00i00  path: /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0  vendor: NVIDIA Corporation  support level: PB
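
Read together, the outputs above suggest a rule: the "Found runtimepm supports" message only appears when the listed branch matches the candidate branch and the entry carries the "runtimepm" feature. A minimal sketch of that rule, inferred from the logs only; `branch_major` and `runtimepm_allowed` are hypothetical names, and comparing only the part before the first dot is an assumption based on "510.1234" being reported as 510:

```python
def branch_major(branch):
    """Return the part of a branch string before the first dot."""
    return branch.split(".", 1)[0]


def runtimepm_allowed(entry, candidate_major):
    """Guess at the rule implied by the debug output (an assumption).

    entry is one item of the "chips" list; candidate_major is the branch
    of the auto-install candidate, e.g. "510".
    """
    if branch_major(entry["branch"]) != candidate_major:
        # Corresponds to "Candidate version does not match X != Y".
        return False
    return "runtimepm" in entry.get("features", [])


# The four entries tried above, checked against candidate branch 510.
results = [
    runtimepm_allowed({"branch": "580.1234", "features": ["runtimepm"]}, "510"),
    runtimepm_allowed({"branch": "510.1234", "features": ["runtimepm"]}, "510"),
    runtimepm_allowed({"branch": "460", "features": ["runtimepm"]}, "510"),
    runtimepm_allowed({"branch": "510", "features": ["noruntimepm"]}, "510"),
]
```

Only the second entry passes both checks, which is consistent with it being the only run that logged runtimepm support.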

os369510 avatar Apr 12 '22 14:04 os369510

The test PPA is here: https://launchpad.net/~os369510/+archive/ubuntu/u-d-c-pr71

os369510 avatar Apr 12 '22 14:04 os369510

The autopkgtest failed: https://launchpadlibrarian.net/596369052/buildlog_ubuntu-jammy-amd64.ubuntu-drivers-common_1%3A0.9.6.1pr71ubuntu1_BUILDING.txt.gz

os369510 avatar May 05 '22 13:05 os369510

The original code causes the autopkgtest failure. Would it be better to move the package check to the hook part?

--- a/UbuntuDrivers/detect.py
+++ b/UbuntuDrivers/detect.py
@@ -239,17 +239,15 @@ def packages_for_modalias(apt_cache, modalias):
         nvamd = package_get_nv_allowing_driver("0x" + did)
         nvamdn = "nvidia-driver-%s" % nvamd
         nvamda = "pci:v000010DEd0000%s*" % did
-        bus_map[nvamda] = set([nvamdn])
+        try:
+            apt_cache[nvamdn]
+            bus_map[nvamda] = set([nvamdn])
+        except:
+            logging.debug("%s is unavailable." % nvamdn)
 
     for alias in bus_map:
         if fnmatch.fnmatchcase(modalias.lower(), alias.lower()):
             for p in bus_map[alias]:
-                try:
-                    if fnmatch.fnmatchcase(nvamda.lower(), alias.lower()):
-                        apt_cache[p]
-                except:
-                    logging.debug("%s is unavailable." % p)
-                    continue
                 pkgs.add(p)
 
     return [apt_cache[p] for p in pkgs]
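
The idea in the diff above is to check the package once, when the forced alias is registered, instead of inside the matching loop. It can be sketched in isolation as follows. This is not the project's code: `register_forced_alias` is a hypothetical name, plain dicts stand in for `bus_map` and the apt cache, and the sketch assumes that looking up a missing package raises `KeyError`.

```python
import logging


def register_forced_alias(bus_map, apt_cache, device_id, branch):
    """Add a forced NVIDIA alias to bus_map only if its package exists.

    bus_map and apt_cache are plain dicts here (hypothetical stand-ins);
    a missing package is assumed to raise KeyError on lookup.
    """
    package = "nvidia-driver-%s" % branch
    alias = "pci:v000010DEd0000%s*" % device_id
    try:
        apt_cache[package]  # existence check only, the value is unused
    except KeyError:
        logging.debug("%s is unavailable.", package)
        return False
    bus_map[alias] = {package}
    return True


fake_cache = {"nvidia-driver-510": object()}
bus_map = {}
added = register_forced_alias(bus_map, fake_cache, "24BA", "510")
skipped = register_forced_alias(bus_map, fake_cache, "25BA", "580")
```

Because an unavailable package never enters `bus_map`, the later loop over aliases needs no per-package lookup, which is what the second half of the diff removes.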

xanthein avatar May 17 '22 03:05 xanthein

======================================================================
FAIL: test_max_open_file_descriptors (test_ubuntu_drivers.DetectTest)
max_open_file_descriptors
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/ubuntu/work-dir/ubuntu-qemu/deb-packages/ubuntu-drivers-common/tests/test_ubuntu_drivers.py", line 205, in test_max_open_file_descriptors
    self.assertEqual(soft, 8180)
AssertionError: 1048576 != 8180

======================================================================
FAIL: test_system_driver_packages_performance (test_ubuntu_drivers.DetectTest)
system_driver_packages() performance for a lot of modaliases
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/ubuntu/work-dir/ubuntu-qemu/deb-packages/ubuntu-drivers-common/tests/test_ubuntu_drivers.py", line 199, in test_system_driver_packages_performance
    self.assertLess(sec, target)
AssertionError: 497.440165 not less than 30.0

----------------------------------------------------------------------

Let me see how to improve it

os369510 avatar May 26 '22 08:05 os369510

Please ignore the failures above; they were caused by running the tests in a container on my laptop. The build passes on https://launchpad.net/~os369510/+archive/ubuntu/u-d-c-pr71/+packages
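
The first failure compares a soft limit of 1048576 against an expected 8180, which suggests the test reads the process's open-file limit; a container can easily provide a different value than a clean build environment. A quick way to see what a given environment provides, assuming the test looks at `RLIMIT_NOFILE` (an assumption, since the test source is not shown in this thread):

```python
import resource

# Soft and hard limits on open file descriptors for the current process.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("open-file limit: soft=%d hard=%d" % (soft, hard))
```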

os369510 avatar May 27 '22 01:05 os369510

Test cases:

  1. Normal case (stock Ubuntu), without /etc/customized_supported_gpus.json on the system.
$ ubuntu-drivers debug
...
DEBUG:root:package_get_nv_allowing_driver(): unable to read /etc/customized_supported_gpus.json
DEBUG:root:_is_nv_allowing_runtimepm_supported(): unable to read /etc/customized_supported_gpus.json
...
=== matching driver packages ===
nvidia-driver-510-server: installed: <none>   available: 510.73.05-0ubuntu0.22.04.1  [distro]  non-free  modalias: pci:v000010DEd000025BAsv0000103Csd000089C6bc03sc02i00  path: /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0  vendor: NVIDIA Corporation  support level: PB 
nvidia-driver-470-server: installed: <none>   available: 470.129.06-0ubuntu0.22.04.1  [distro]  non-free  modalias: pci:v000010DEd000025BAsv0000103Csd000089C6bc03sc02i00  path: /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0  vendor: NVIDIA Corporation  support level: LTSB 
nvidia-driver-510: installed: <none>   available: 510.73.05-0ubuntu0.22.04.1 (auto-install)  [distro]  non-free  modalias: pci:v000010DEd000025BAsv0000103Csd000089C6bc03sc02i00  path: /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0  vendor: NVIDIA Corporation  support level: PB 
nvidia-driver-470: installed: <none>   available: 470.129.06-0ubuntu0.22.04.1  [distro]  non-free  modalias: pci:v000010DEd000025BAsv0000103Csd000089C6bc03sc02i00  path: /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0  vendor: NVIDIA Corporation  support level: LTSB
  2. /etc/customized_supported_gpus.json contains an incorrect JSON field or format.
{ # tried without this line
  "chips": [ # tried without '['
    {
      "devida": "0x24BA", # tried 'devida' an extra 'a'
      "name": "TEST 24BA",
      "branch": "580.1234",
      "features": [
        "runtimepm"
      ]
    }, # tried without ','
    {
      "devid": "0x25BA",
      "name": "TEST 25BA",
      "branch": "580.1234",
      "features": [
        "runtimepm"
      ]
    }
  ]
}
$ ubuntu-drivers debug
...
DEBUG:root:package_get_nv_allowing_driver(): unexpected json detected.
DEBUG:root:_is_nv_allowing_runtimepm_supported(): unexpected json detected
...
=== matching driver packages ===
nvidia-driver-470-server: installed: <none>   available: 470.129.06-0ubuntu0.22.04.1  [distro]  non-free  modalias: pci:v000010DEd000025BAsv0000103Csd000089C6bc03sc02i00  path: /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0  vendor: NVIDIA Corporation  support level: LTSB 
nvidia-driver-470: installed: <none>   available: 470.129.06-0ubuntu0.22.04.1  [distro]  non-free  modalias: pci:v000010DEd000025BAsv0000103Csd000089C6bc03sc02i00  path: /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0  vendor: NVIDIA Corporation  support level: LTSB 
nvidia-driver-510-server: installed: <none>   available: 510.73.05-0ubuntu0.22.04.1  [distro]  non-free  modalias: pci:v000010DEd000025BAsv0000103Csd000089C6bc03sc02i00  path: /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0  vendor: NVIDIA Corporation  support level: PB 
nvidia-driver-510: installed: <none>   available: 510.73.05-0ubuntu0.22.04.1 (auto-install)  [distro]  non-free  modalias: pci:v000010DEd000025BAsv0000103Csd000089C6bc03sc02i00  path: /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0  vendor: NVIDIA Corporation  support level: PB
  3. /etc/customized_supported_gpus.json points to an older version than the candidate.
$ ubuntu-drivers debug
...
INFO:root:Found a specific nv driver version 450 for TEST 25BA(0x25BA)
...
DEBUG:root:Candidate version does not match 450 != 510
...
=== matching driver packages ===
nvidia-driver-450: installed: <none>   available: 460.91.03-0ubuntu1  [distro]  non-free  modalias: pci:v000010DEd000025BAsv0000103Csd000089C6bc03sc02i00  path: /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0  vendor: NVIDIA Corporation 
nvidia-driver-470-server: installed: <none>   available: 470.129.06-0ubuntu0.22.04.1  [distro]  non-free  modalias: pci:v000010DEd000025BAsv0000103Csd000089C6bc03sc02i00  path: /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0  vendor: NVIDIA Corporation  support level: LTSB 
nvidia-driver-510-server: installed: <none>   available: 510.73.05-0ubuntu0.22.04.1  [distro]  non-free  modalias: pci:v000010DEd000025BAsv0000103Csd000089C6bc03sc02i00  path: /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0  vendor: NVIDIA Corporation  support level: PB 
nvidia-driver-470: installed: <none>   available: 470.129.06-0ubuntu0.22.04.1  [distro]  non-free  modalias: pci:v000010DEd000025BAsv0000103Csd000089C6bc03sc02i00  path: /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0  vendor: NVIDIA Corporation  support level: LTSB 
nvidia-driver-510: installed: <none>   available: 510.73.05-0ubuntu0.22.04.1 (auto-install)  [distro]  non-free  modalias: pci:v000010DEd000025BAsv0000103Csd000089C6bc03sc02i00  path: /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0  vendor: NVIDIA Corporation  support level: PB
  4. /etc/customized_supported_gpus.json points to the same version as the candidate.
$ ubuntu-drivers debug
...
INFO:root:Found a specific nv driver version 510 for TEST 25BA(0x25BA)
INFO:root:Found runtimepm supports on 0x25BA.
...
nvidia-driver-510-server: installed: <none>   available: 510.73.05-0ubuntu0.22.04.1  [distro]  non-free  modalias: pci:v000010DEd000025BAsv0000103Csd000089C6bc03sc02i00  path: /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0  vendor: NVIDIA Corporation  support level: PB 
nvidia-driver-470-server: installed: <none>   available: 470.129.06-0ubuntu0.22.04.1  [distro]  non-free  modalias: pci:v000010DEd000025BAsv0000103Csd000089C6bc03sc02i00  path: /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0  vendor: NVIDIA Corporation  support level: LTSB 
nvidia-driver-470: installed: <none>   available: 470.129.06-0ubuntu0.22.04.1  [distro]  non-free  modalias: pci:v000010DEd000025BAsv0000103Csd000089C6bc03sc02i00  path: /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0  vendor: NVIDIA Corporation  support level: LTSB 
nvidia-driver-510: installed: <none>   available: 510.73.05-0ubuntu0.22.04.1 (auto-install)  [distro]  non-free  modalias: pci:v000010DEd000025BAsv0000103Csd000089C6bc03sc02i00  path: /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0  vendor: NVIDIA Corporation  support level: PB
  5. /etc/customized_supported_gpus.json points to a version that does not exist in the Ubuntu archive (sources list).
$ ubuntu-drivers debug
...
INFO:root:Found a specific nv driver version 378 for TEST 25BA(0x25BA)
DEBUG:root:nvidia-driver-378 is not in the package pool.
...
DEBUG:root:Candidate version does not match 378 != 510
...
nvidia-driver-470-server: installed: <none>   available: 470.129.06-0ubuntu0.22.04.1  [distro]  non-free  modalias: pci:v000010DEd000025BAsv0000103Csd000089C6bc03sc02i00  path: /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0  vendor: NVIDIA Corporation  support level: LTSB 
nvidia-driver-470: installed: <none>   available: 470.129.06-0ubuntu0.22.04.1  [distro]  non-free  modalias: pci:v000010DEd000025BAsv0000103Csd000089C6bc03sc02i00  path: /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0  vendor: NVIDIA Corporation  support level: LTSB 
nvidia-driver-510: installed: <none>   available: 510.73.05-0ubuntu0.22.04.1 (auto-install)  [distro]  non-free  modalias: pci:v000010DEd000025BAsv0000103Csd000089C6bc03sc02i00  path: /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0  vendor: NVIDIA Corporation  support level: PB 
nvidia-driver-510-server: installed: <none>   available: 510.73.05-0ubuntu0.22.04.1  [distro]  non-free  modalias: pci:v000010DEd000025BAsv0000103Csd000089C6bc03sc02i00  path: /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0  vendor: NVIDIA Corporation  support level: PB
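
The first two cases above (missing file, malformed JSON) both end with the stock driver list, so the loader evidently fails soft. A minimal sketch of such a loader follows. `load_allow_list` is a hypothetical name, the file layout is taken from the examples in this thread, and skipping individual entries that lack `devid` or `branch` is a choice made for this sketch, not necessarily what the branch does.

```python
import json
import logging
import os
import tempfile


def load_allow_list(path):
    """Return the usable chip entries from path, or [] if it is unusable.

    Hypothetical helper; only the JSON layout comes from the thread.
    """
    try:
        with open(path) as handle:
            data = json.load(handle)
    except OSError as err:
        logging.debug("unable to read %s: %s", path, err)
        return []
    except ValueError:
        # json.JSONDecodeError is a subclass of ValueError.
        logging.debug("unexpected json detected in %s", path)
        return []

    chips = data.get("chips") if isinstance(data, dict) else None
    if not isinstance(chips, list):
        logging.debug("unexpected json detected in %s", path)
        return []
    # Keep only entries that carry the keys a lookup would rely on.
    return [c for c in chips
            if isinstance(c, dict) and "devid" in c and "branch" in c]


with tempfile.TemporaryDirectory() as tmp:
    broken = os.path.join(tmp, "broken.json")
    with open(broken, "w") as handle:
        handle.write('{"chips": [ {"devid": "0x24BA"} ')  # truncated on purpose
    missing_result = load_allow_list(os.path.join(tmp, "missing.json"))
    broken_result = load_allow_list(broken)
```

Both calls return an empty list, so a caller can fall through to the normal detection path without special-casing either failure.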

os369510 avatar May 27 '22 14:05 os369510

Here is a valid customized force-install JSON file:

$ cat /etc/customized_supported_gpus.json.bak 
{
  "chips": [
    {
      "devid": "0x24BA",
      "name": "TEST 24BA",
      "branch": "580.1234",
      "features": [
        "runtimepm"
      ]
    },
    {
      "devid": "0x25BA",
      "name": "TEST 25BA",
      "branch": "510",
      "features": [
        "runtimepm"
      ]
    }
  ]
}
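
For illustration, a lookup against a file of this shape could look like the sketch below. `forced_branch` is a hypothetical name, the case-insensitive `devid` comparison is an assumption, and truncating the branch at the first dot mirrors the earlier debug output, where "580.1234" was reported as version 580.

```python
import json

# Same content as the file shown above.
ALLOW_LIST = json.loads("""
{
  "chips": [
    {"devid": "0x24BA", "name": "TEST 24BA", "branch": "580.1234",
     "features": ["runtimepm"]},
    {"devid": "0x25BA", "name": "TEST 25BA", "branch": "510",
     "features": ["runtimepm"]}
  ]
}
""")


def forced_branch(allow_list, device_id):
    """Return the major branch listed for device_id, or None if absent."""
    wanted = device_id.lower()
    for chip in allow_list.get("chips", []):
        if chip.get("devid", "").lower() == wanted:
            return chip["branch"].split(".", 1)[0]
    return None


branch = forced_branch(ALLOW_LIST, "0x25BA")
package = "nvidia-driver-%s" % branch
```

Whether the resulting package name is actually installable still has to be checked against the package pool, as test case 5 shows.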

os369510 avatar May 27 '22 14:05 os369510

I installed 1:0.9.6.1pr71ubuntu4 from my PPA and the results look as expected. This branch is ready for review.

As an improvement, we could add test cases for this feature, but it's not urgent. @tseliot, would you please review this? Thanks.

os369510 avatar May 30 '22 03:05 os369510

@os369510 I am a little hesitant to approve this without having at least one test for it in the test suite. This will allow me to continue working on things without accidentally breaking the feature in the future.

tseliot avatar Jun 15 '22 13:06 tseliot

@tseliot do you mean adding some autopkgtest cases for this feature?

os369510 avatar Jun 15 '22 16:06 os369510

@tseliot I added a new test case for this feature; it covers the scenarios I mentioned in https://github.com/tseliot/ubuntu-drivers-common/pull/71#issuecomment-1139699272

I also added a new autopkgtest section to the README, because some test cases failed locally for reasons unrelated to my changes. The failures went away after I switched to a clean environment.

os369510 avatar Aug 31 '22 04:08 os369510

@os369510 hey, I have created pull request #76 with some changes, which I explained there. Please have a look.

tseliot avatar Sep 16 '22 14:09 tseliot

This change has been replaced by #76.

os369510 avatar Jun 07 '23 12:06 os369510