
sdubby & grubby-dummy: conflicting dependencies when installing akmod-nvidia

Open RandyTheOtter opened this issue 1 year ago • 8 comments

Qubes OS release

4.2.3

Brief summary

Certain NVIDIA dGPU drivers cannot be installed in Fedora qubes. At least akmod-nvidia 3:560.35.03-1.fc40.x86_64 (and its fc39 counterpart, though Fedora 39 is close to EOL) and xorg-x11-drv-nvidia-3:560.35.03-5.fc40.x86_64 are affected.

Steps to reproduce

  1. Get an NVIDIA card that needs the aforementioned drivers
  2. Install the fedora-40-xfce template and update it
  3. Prepare the standalone (set your own pcidevs):
{% if grains['id'] == 'dom0' %}

nvidia-driver--create-qube:
  qvm.vm:
    - name: f40-standalone-nvdrv
    - present:
      - template: fedora-40-xfce
      - label: yellow
      - mem: 4000
      - maxmem: 0
      - vcpus: 4
      - class: StandaloneVM
    - prefs:
      - label: yellow
      - virt_mode: hvm
      - kernel:
      - mem: 4000
      - maxmem: 0
      - vcpus: 4
      - class: StandaloneVM
      - pcidevs: ['01:00.0','01:00.1']
    - features:
      - set:
        - menu-items: xfce4-terminal.desktop

{% elif grains['id'] == 'f40-standalone-nvdrv' %}

nvidia-driver--enable-repo:
  cmd.run:
    - name: dnf config-manager --enable rpmfusion-{free,nonfree}{,-updates}

{% endif %}
  4. Start it up (you may have to use the console; on my machine the GUI agent doesn't work at this stage)
  5. Try to install akmod-nvidia

Expected behavior

The software installs as usual, and you can proceed to waiting for akmod to build the module.

Actual behavior

Last metadata expiration check: 0:46:44 ago on Sat Nov  2 20:19:16 2024.
Dependencies resolved.
=============================================================================================================
 Package                           Arch    Version                           Repository                  Size
=============================================================================================================
Installing:
 akmod-nvidia                      x86_64  3:550.67-1.fc40                   rpmfusion-nonfree           40 k
Installing dependencies:
 akmods                            noarch  0.5.8-8.fc40                      fedora                      32 k
 bison                             x86_64  3.8.2-7.fc40                      fedora                     1.0 M
 cmake-filesystem                  x86_64  3.28.2-1.fc40                     fedora                      18 k
 egl-gbm                           x86_64  2:1.1.2^20240919gitb24587d-3.fc40 updates                     21 k
 egl-wayland                       x86_64  1.1.17^20241016git0cd471d-3.fc40  updates                     44 k
 elfutils-libelf-devel             x86_64  0.192-4.fc40                      updates                     47 k
 flex                              x86_64  2.6.4-16.fc40                     fedora                     299 k
 kernel-devel                      x86_64  6.11.5-200.fc40                   updates                     21 M
 kernel-devel-matched              x86_64  6.11.5-200.fc40                   updates                    183 k
 kmodtool                          noarch  1.1-10.fc40                       fedora                      16 k
 libgit2                           x86_64  1.7.2-4.fc40                      updates                    543 k
 libssh2                           x86_64  1.11.0-4.fc40                     fedora                     130 k
 libzstd-devel                     x86_64  1.5.6-1.fc40                      updates                     52 k
 llhttp                            x86_64  9.2.1-1.fc40                      updates                     33 k
 m4                                x86_64  1.4.19-9.fc40                     fedora                     305 k
 nvidia-modprobe                   x86_64  3:550.67-1.fc40                   rpmfusion-nonfree           32 k
 nvidia-settings                   x86_64  3:550.67-1.fc40                   rpmfusion-nonfree          1.6 M
 openssl                           x86_64  1:3.2.2-3.fc40                    updates                    1.1 M
 openssl-devel                     x86_64  1:3.2.2-3.fc40                    updates                    2.8 M
 python3-argcomplete               noarch  3.5.1-1.fc40                      updates                     96 k
 python3-babel                     noarch  2.16.0-1.fc40                     updates                    6.5 M
 python3-click-plugins             noarch  1.1.1-19.fc40                     fedora                      17 k
 python3-progressbar2              noarch  3.53.2-11.fc40                    fedora                      72 k
 python3-pygit2                    x86_64  1.14.0-1.fc40                     fedora                     286 k
 python3-rpmautospec-core          noarch  0.1.5-1.fc40                      updates                     15 k
 python3-typing-extensions         noarch  4.12.2-2.fc40                     updates                     89 k
 python3-utils                     noarch  3.7.0-3.fc40                      fedora                      69 k
 rpmdevtools                       noarch  9.6-7.fc40                        fedora                      96 k
 time                              x86_64  1.9-23.fc40                       fedora                      47 k
 xorg-x11-drv-nvidia               x86_64  3:550.67-1.fc40                   rpmfusion-nonfree          126 M
 xorg-x11-drv-nvidia-kmodsrc       x86_64  3:550.67-1.fc40                   rpmfusion-nonfree           44 M
 xorg-x11-drv-nvidia-libs          x86_64  3:550.67-1.fc40                   rpmfusion-nonfree           59 M
 zlib-ng-compat-devel              x86_64  2.1.7-2.fc40                      updates                     38 k
Installing weak dependencies:
 python3-rpmautospec               noarch  0.7.3-1.fc40                      updates                     74 k
 xorg-x11-drv-nvidia-cuda-libs     x86_64  3:550.67-1.fc40                   rpmfusion-nonfree           41 M
 xorg-x11-drv-nvidia-power         x86_64  3:550.67-1.fc40                   rpmfusion-nonfree          103 k
Skipping packages with conflicts:
(add '--best --allowerasing' to command line to force their upgrade):
 sdubby                            noarch  1.0-8.fc40                        fedora                      18 k
 sdubby                            noarch  1.0-11.fc40                       updates                     19 k
Skipping packages with broken dependencies:
 akmod-nvidia                      x86_64  3:560.35.03-1.fc40                rpmfusion-nonfree-updates   42 k
 xorg-x11-drv-nvidia               x86_64  3:560.35.03-5.fc40                rpmfusion-nonfree-updates  133 M

Transaction Summary
=============================================================================================================
Install  37 Packages
Skip      4 Packages

Total download size: 307 M
Installed size: 776 M
Is this ok [y/N]: 

Other notes and links

It is possible to remove grubby-dummy. In that case akmod-nvidia builds and seems to be functional, but the GUI agent doesn't work. It may be failing for the same reason it stops working after the third reproduction step, and it may need its own issue; I haven't figured this out yet. Keep in mind that for akmod to build you must enlarge /tmp: the default 1 GB is not enough. You can use my salt state to reproduce everything after the removal of grubby-dummy.

nvidia-driver.sls
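
The /tmp enlargement mentioned above could be sketched roughly as follows. This assumes /tmp is a tmpfs mount (the Fedora default). The 2 GiB threshold and the 4G size are arbitrary example values, not tested minimums, and the helper names `tmp_avail_kb` and `needs_more_tmp` are hypothetical.

```shell
# Hypothetical helpers; not part of Qubes or Fedora tooling.

# Print the space available on the filesystem holding $1, in KiB.
tmp_avail_kb() {
    # -P forces the POSIX single-line format, -k reports 1024-byte blocks.
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Succeed when the filesystem holding $1 has less than $2 KiB available.
needs_more_tmp() {
    [ "$(tmp_avail_kb "$1")" -lt "$2" ]
}

# Example use inside the qube, as root (left commented out because it
# changes a mount; 4G is an assumed size, adjust to your RAM):
#   if needs_more_tmp /tmp $((2 * 1024 * 1024)); then
#       mount -o remount,size=4G /tmp
#   fi
```

A remount like this only lasts until the qube restarts, so it would have to be repeated before each build.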

It should be possible to install an older driver version, since it doesn't have this dependency problem. I tried, and it only works if the driver was already installed and I haven't updated yet. This can most likely be solved by rolling back the kernel on a new system. It hardly matters, though: this is an old version of the driver and kernel anyway, and I expect them to be deprecated at some point.

Related forum posts:

RandyTheOtter avatar Nov 02 '24 21:11 RandyTheOtter

A small update: I have figured out the problems with the GUI daemon, and removing grubby-dummy seems to work, but I haven't tested it much yet.

RandyTheOtter avatar Nov 08 '24 23:11 RandyTheOtter

Why is something pulling in sdubby? That is for systemd-boot and neither Fedora nor Qubes OS uses that by default.

DemiMarie avatar Nov 09 '24 22:11 DemiMarie

Yes, this is a very interesting question. Generally, grubby (and probably sdubby too) is a rather broken concept for maintaining bootloader config, and it has caused several issues in the past. I'm not sure about sdubby, but grubby tries to parse the generated grub.cfg and edit it to add new entries based on existing ones (contrary to the huge comment at the top saying not to edit it). Some upstream discussion: https://bugzilla.redhat.com/show_bug.cgi?id=1287854 </rant>

That's why we have grubby-dummy: to avoid pulling in the real grubby package even if something tries to. The fix for this ticket should include checking what name is pulled in via deps and adding an appropriate Provides: to the grubby-dummy package (so the real grubby/sdubby is no longer pulled in). I wouldn't expect anything to break; all the places using grubby that I've seen have a fallback to a proper config generator (grub2-mkconfig).

marmarek avatar Nov 09 '24 22:11 marmarek

@DemiMarie both akmods (not akmod-nvidia) and xorg-x11-drv-nvidia depend on grubby, and nothing except systemd-udev depends on sdubby directly.

 $ repoquery -q --installed --whatrequires sdubby
systemd-udev-0:255.13-1.fc40.x86_64
 $ repoquery -q --installed --whatrequires grubby
akmods-0:0.5.8-8.fc40.noarch
systemd-udev-0:255.13-1.fc40.x86_64
xorg-x11-drv-nvidia-3:560.35.03-5.fc40.x86_64

Don't know why though.

RandyTheOtter avatar Nov 10 '24 16:11 RandyTheOtter

@DemiMarie both akmods (not akmod-nvidia) and xorg-x11-drv-nvidia depend on grubby, and nothing except systemd-udev depends on sdubby directly.

That's because grubby-dummy does not provide a "dummy" binary called grubby, and what xorg-x11-drv-nvidia-3:560.35.03-5.fc40.x86_64 requires is actually the file /usr/sbin/grubby. (The old 550 version does not require this.) Both grubby and sdubby provide /usr/sbin/grubby.

$ sudo dnf repoquery --requires xorg-x11-drv-nvidia-3:560.35.03-5.fc40.x86_64
/bin/sh
/usr/bin/sh
/usr/sbin/grubby
...

$ sudo dnf repoquery --whatprovides /usr/sbin/grubby
grubby-0:8.40-75.fc40.x86_64
sdubby-0:1.0-11.fc40.noarch
sdubby-0:1.0-8.fc40.noarch

$ sudo dnf repoquery --provides grubby
grubby = 8.40-75.fc40
grubby(x86-64) = 8.40-75.fc40 # provides executable binary

$ sudo dnf repoquery --provides grubby-dummy
grubby = 1000:9.0.0
grubby-dummy = 9.0.0-4.fc40 # no binary here
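
The distinction above (a requirement on a file path versus one on a package name) can be picked out of a requirement list mechanically. This is a small sketch: `file_requires` is a hypothetical helper, and the `example-package >= 1.0` entry is made up for illustration. The other three entries are the ones shown in the query output above.

```shell
# Hypothetical helper: keep only file-path requirements (lines starting
# with "/") from a list of RPM requirement strings read on stdin.
file_requires() {
    # "|| true" keeps the exit status at 0 when nothing matches.
    grep '^/' || true
}

# Sample input; "example-package >= 1.0" is an invented name-based
# requirement, included only to show that it gets filtered out.
printf '%s\n' /bin/sh /usr/bin/sh /usr/sbin/grubby 'example-package >= 1.0' \
    | file_requires
```

In a real qube the input would come from `dnf repoquery --requires` for the package in question. Path entries can only be satisfied by a package that ships or explicitly provides that exact path.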

ImBearChild avatar Nov 16 '24 09:11 ImBearChild

@DemiMarie both akmods (not akmod-nvidia) and xorg-x11-drv-nvidia depend on grubby, and nothing except systemd-udev depends on sdubby directly.

That's because grubby-dummy does not provide a "dummy" binary called grubby, and what xorg-x11-drv-nvidia-3:560.35.03-5.fc40.x86_64 requires is actually the file /usr/sbin/grubby. (The old 550 version does not require this.) Both grubby and sdubby provide /usr/sbin/grubby.

Can you work around this by uninstalling grubby-dummy? This uninstall should be harmless.

DemiMarie avatar Nov 16 '24 16:11 DemiMarie

Can you work around this by uninstalling grubby-dummy? This uninstall should be harmless.

If the package requires /usr/sbin/grubby specifically, not a package named grubby, it won't work. I'd need to check what xorg-x11-drv-nvidia is trying to do with it, but I guess providing a dummy binary (a symlink to /bin/true?) in the grubby-dummy package should work around the issue.
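
The dummy-binary idea can be tried outside of packaging first. This is a minimal sketch, not the actual grubby-dummy change: `stage_dummy_grubby` is a hypothetical helper and the scratch directory stands in for a package buildroot. It writes a one-line script rather than a symlink, on the assumption that a symlink would misbehave wherever /bin/true is a multi-call binary that dispatches on its invoked name (BusyBox, for example).

```shell
# Hypothetical helper: place a do-nothing grubby under the scratch root $1,
# roughly what a package's install step would put into its buildroot.
stage_dummy_grubby() {
    root=$1
    mkdir -p "$root/usr/sbin"
    # The script ignores all arguments and always reports success.
    printf '%s\n' '#!/bin/sh' 'exit 0' > "$root/usr/sbin/grubby"
    chmod 0755 "$root/usr/sbin/grubby"
}

scratch=$(mktemp -d)
stage_dummy_grubby "$scratch"

# A grubby-style invocation now succeeds without touching any boot config.
"$scratch/usr/sbin/grubby" --update-kernel=ALL --args="example"
echo "exit status: $?"
rm -rf "$scratch"
```

Whether a silent no-op is acceptable depends on what the driver's scriptlets expect grubby to have done, which still needs checking as noted above.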

marmarek avatar Nov 16 '24 16:11 marmarek

I believe this can be resolved by building an enhanced version of the dummy package (a "Super-Dummy") that explicitly provides the binary paths and capabilities required by the driver. This prevents DNF from trying to pull in conflicting packages during updates.

I have been using this workaround for a little while now without any side effects, and it successfully survived the last Nvidia driver update.

Here is the .spec file to build the package:

cat <<'EOF' > super-grubby.spec
Name:       grubby-dummy
Version:    99.0.0
Release:    2%{?dist}
Epoch:      1000
Summary:    Super Dummy for Grubby and Sdubby
License:    Public Domain
BuildArch:  noarch

# Claim to provide the packages
Provides:   grubby = %{version}
Provides:   sdubby = %{version}
Provides:   grubby-dummy = %{version}

# Claim to provide the specific binary paths (Virtual Provision)
Provides:   /usr/bin/grubby
Provides:   /usr/sbin/grubby

# Block the real packages
Obsoletes:  grubby < %{version}
Obsoletes:  sdubby < %{version}

%description
Dummy package to satisfy Nvidia driver dependencies on /usr/sbin/grubby (installed as /usr/bin/grubby).

%build
# Nothing to build

%install
# Create only /usr/bin
mkdir -p %{buildroot}/usr/bin

# Create the dummy script
echo '#!/bin/bash' > %{buildroot}/usr/bin/grubby
echo 'echo "Dummy grubby called - doing nothing."' >> %{buildroot}/usr/bin/grubby
echo 'exit 0' >> %{buildroot}/usr/bin/grubby

# Make it executable
chmod +x %{buildroot}/usr/bin/grubby

%files
/usr/bin/grubby

EOF
  1. Build the package:
rpmbuild -bb super-grubby.spec
  2. Install the enhanced dummy:

Note: You may need to remove the old grubby-dummy manually first.

sudo rpm -e --nodeps grubby-dummy

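Then install the new package (this replaces the existing Qubes dummy and prevents DNF from pulling in the conflicting packages). The file name below comes from a Fedora 43 build; the dist tag (fc43) will differ if you build on another release: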

sudo dnf install ~/rpmbuild/RPMS/noarch/grubby-dummy-99.0.0-2.fc43.noarch.rpm -y
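
After installing, it may be worth confirming that the package really advertises the file path the driver asks for. This is a sketch: `has_provide` is a hypothetical helper, and the sample input simply mirrors the Provides: lines from the spec above rather than real rpm output.

```shell
# Hypothetical helper: succeed if capability $1 appears in a provides list
# read on stdin. Lines look like "name" or "name = version"; only the
# first field (the name) is compared.
has_provide() {
    awk -v want="$1" '$1 == want { found = 1 } END { exit !found }'
}

# Sample list copied from the Provides: lines of the spec.
printf '%s\n' 'grubby = 99.0.0' 'sdubby = 99.0.0' '/usr/bin/grubby' '/usr/sbin/grubby' \
    | has_provide /usr/sbin/grubby && echo "file dependency satisfied"
```

On the real system the list would come from `rpm -q --provides grubby-dummy` piped into `has_provide /usr/sbin/grubby`.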

henrixd7 avatar Dec 10 '25 12:12 henrixd7