sdubby & grubby-dummy: conflicting dependencies when installing akmod-nvidia
Qubes OS release
4.2.3
Brief summary
It is impossible to install certain NVIDIA dGPU drivers in Fedora qubes. At least akmod-nvidia 3:560.35.03-1.fc40.x86_64 (and the fc39 build, though Fedora 39 is nearing EOL) and xorg-x11-drv-nvidia-3:560.35.03-5.fc40.x86_64 are affected.
Steps to reproduce
- Get an NVIDIA card that needs the aforementioned drivers
- Install the fedora-40-xfce template and update it
- Prepare the standalone (set your own pcidevs):
{% if grains['id'] == 'dom0' %}
nvidia-driver--create-qube:
  qvm.vm:
    - name: f40-standalone-nvdrv
    - present:
      - template: fedora-40-xfce
      - label: yellow
      - mem: 4000
      - maxmem: 0
      - vcpus: 4
      - class: StandaloneVM
    - prefs:
      - label: yellow
      - virt_mode: hvm
      - kernel:
      - mem: 4000
      - maxmem: 0
      - vcpus: 4
      - class: StandaloneVM
      - pcidevs: ['01:00.0','01:00.1']
    - features:
      - set:
        - menu-items: xfce4-terminal.desktop
{% elif grains['id'] == 'f40-standalone-nvdrv' %}
nvidia-driver--enable-repo:
  cmd.run:
    - name: dnf config-manager --enable rpmfusion-{free,nonfree}{,-updates}
{% endif %}
- Start it up (you may have to use the console; on my machine the GUI agent doesn't work at this stage)
- Try to install akmod-nvidia
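For readers who want to replay the steps, here is a rough sketch of the commands involved. It only prints them, because the first two run in dom0 and the last one runs inside the qube. The state name `nvidia-driver` is an assumption (it presumes the state above was saved as /srv/salt/nvidia-driver.sls); the qube name comes from the state itself.

```shell
#!/usr/bin/env bash
# Print the commands for the reproduction steps; nothing is executed here.
# ASSUMPTION: the Salt state is saved as /srv/salt/nvidia-driver.sls,
# so its state name is "nvidia-driver". Pass another name as $1 if not.
repro_commands() {
    local state="${1:-nvidia-driver}" qube="${2:-f40-standalone-nvdrv}"
    printf '%s\n' \
        "[dom0] sudo qubesctl state.apply ${state}" \
        "[dom0] sudo qubesctl --skip-dom0 --targets=${qube} state.apply ${state}" \
        "[${qube}] sudo dnf install akmod-nvidia"
}

repro_commands
```

The first command creates the standalone, the second applies the repo-enabling half of the state inside it, and the third is the install attempt that produces the output under "Actual behavior".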
Expected behavior
The software installs as usual, and the next step is waiting for akmod to build.
Actual behavior
Last metadata expiration check: 0:46:44 ago on Sat Nov 2 20:19:16 2024.
Dependencies resolved.
=============================================================================================================
Package Arch Version Repository Size
=============================================================================================================
Installing:
akmod-nvidia x86_64 3:550.67-1.fc40 rpmfusion-nonfree 40 k
Installing dependencies:
akmods noarch 0.5.8-8.fc40 fedora 32 k
bison x86_64 3.8.2-7.fc40 fedora 1.0 M
cmake-filesystem x86_64 3.28.2-1.fc40 fedora 18 k
egl-gbm x86_64 2:1.1.2^20240919gitb24587d-3.fc40 updates 21 k
egl-wayland x86_64 1.1.17^20241016git0cd471d-3.fc40 updates 44 k
elfutils-libelf-devel x86_64 0.192-4.fc40 updates 47 k
flex x86_64 2.6.4-16.fc40 fedora 299 k
kernel-devel x86_64 6.11.5-200.fc40 updates 21 M
kernel-devel-matched x86_64 6.11.5-200.fc40 updates 183 k
kmodtool noarch 1.1-10.fc40 fedora 16 k
libgit2 x86_64 1.7.2-4.fc40 updates 543 k
libssh2 x86_64 1.11.0-4.fc40 fedora 130 k
libzstd-devel x86_64 1.5.6-1.fc40 updates 52 k
llhttp x86_64 9.2.1-1.fc40 updates 33 k
m4 x86_64 1.4.19-9.fc40 fedora 305 k
nvidia-modprobe x86_64 3:550.67-1.fc40 rpmfusion-nonfree 32 k
nvidia-settings x86_64 3:550.67-1.fc40 rpmfusion-nonfree 1.6 M
openssl x86_64 1:3.2.2-3.fc40 updates 1.1 M
openssl-devel x86_64 1:3.2.2-3.fc40 updates 2.8 M
python3-argcomplete noarch 3.5.1-1.fc40 updates 96 k
python3-babel noarch 2.16.0-1.fc40 updates 6.5 M
python3-click-plugins noarch 1.1.1-19.fc40 fedora 17 k
python3-progressbar2 noarch 3.53.2-11.fc40 fedora 72 k
python3-pygit2 x86_64 1.14.0-1.fc40 fedora 286 k
python3-rpmautospec-core noarch 0.1.5-1.fc40 updates 15 k
python3-typing-extensions noarch 4.12.2-2.fc40 updates 89 k
python3-utils noarch 3.7.0-3.fc40 fedora 69 k
rpmdevtools noarch 9.6-7.fc40 fedora 96 k
time x86_64 1.9-23.fc40 fedora 47 k
xorg-x11-drv-nvidia x86_64 3:550.67-1.fc40 rpmfusion-nonfree 126 M
xorg-x11-drv-nvidia-kmodsrc x86_64 3:550.67-1.fc40 rpmfusion-nonfree 44 M
xorg-x11-drv-nvidia-libs x86_64 3:550.67-1.fc40 rpmfusion-nonfree 59 M
zlib-ng-compat-devel x86_64 2.1.7-2.fc40 updates 38 k
Installing weak dependencies:
python3-rpmautospec noarch 0.7.3-1.fc40 updates 74 k
xorg-x11-drv-nvidia-cuda-libs x86_64 3:550.67-1.fc40 rpmfusion-nonfree 41 M
xorg-x11-drv-nvidia-power x86_64 3:550.67-1.fc40 rpmfusion-nonfree 103 k
Skipping packages with conflicts:
(add '--best --allowerasing' to command line to force their upgrade):
sdubby noarch 1.0-8.fc40 fedora 18 k
sdubby noarch 1.0-11.fc40 updates 19 k
Skipping packages with broken dependencies:
akmod-nvidia x86_64 3:560.35.03-1.fc40 rpmfusion-nonfree-updates 42 k
xorg-x11-drv-nvidia x86_64 3:560.35.03-5.fc40 rpmfusion-nonfree-updates 133 M
Transaction Summary
=============================================================================================================
Install 37 Packages
Skip 4 Packages
Total download size: 307 M
Installed size: 776 M
Is this ok [y/N]:
Other notes and links
It is possible to delete grubby-dummy. In that case akmod-nvidia builds and seems to be functional, but the GUI agent doesn't work. It may fail for the same reason it stops working after the third reproduction step, and it may need its own issue; I haven't figured this out yet. Keep in mind that for akmod to build you must extend /tmp/: the default 1 GB is not enough. You can use my salt state to reproduce everything after the deletion of grubby-dummy.
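A minimal sketch of the /tmp check, assuming /tmp is a tmpfs mount (which is what makes a `remount,size=` resize possible). The 4096 MiB target is an assumption for illustration; the only known data point is that the default 1 GB is not enough. The function names are made up for this example.

```shell
#!/usr/bin/env bash
# Total size of the filesystem holding a path, in MiB (from POSIX df output).
fs_size_mib() {
    df -Pk "$1" | awk 'NR == 2 { print int($2 / 1024) }'
}

# Print a remount command if the filesystem is smaller than the wanted size.
# ASSUMPTION: the default target of 4096 MiB is illustrative, not measured.
suggest_tmp_resize() {
    local path="${1:-/tmp}" want_mib="${2:-4096}" have_mib
    have_mib="$(fs_size_mib "$path")"
    if [ "$have_mib" -lt "$want_mib" ]; then
        echo "sudo mount -o remount,size=${want_mib}M ${path}"
    else
        echo "${path} already has ${have_mib} MiB"
    fi
}

suggest_tmp_resize /tmp 4096
```

The script only prints the suggested command rather than running it, so it is safe to try in any qube before deciding on a size.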
It should be possible to install an older driver version, since it doesn't have this dependency problem. I tried, and it only works if the driver was already installed before and I haven't updated since. On a new system this can most likely be solved by rolling back the kernel. But who cares? This is an old version of the driver and kernel anyway, and I expect them to be deprecated at some point.
Related forum posts:
A small update: I have figured out the problems with the GUI daemon, and deleting grubby-dummy seems to work, but I haven't tested it much yet.
Why is something pulling in sdubby? That is for systemd-boot and neither Fedora nor Qubes OS uses that by default.
Yes, this is a very interesting question. Generally, grubby (and probably sdubby too) is a rather broken concept for maintaining bootloader config, and it has caused several issues in the past. I'm not sure about sdubby, but grubby tries to parse the generated grub.cfg and edit it to add new entries based on existing ones (contrary to the huge comment at the top saying not to edit it). Some upstream discussion: https://bugzilla.redhat.com/show_bug.cgi?id=1287854
</rant>
That's why we have grubby-dummy: to avoid pulling in the real grubby package even if something tries to. A fix for this ticket should include checking what name is pulled in via deps and adding the appropriate Provides: to the grubby-dummy package (so the real grubby/sdubby is no longer pulled in). I wouldn't expect anything to break; all the places using grubby that I've seen have a fallback to a proper config generator (grub2-mkconfig).
@DemiMarie both akmods (not akmod-nvidia) and xorg-x11-drv-nvidia depend on grubby, and nothing except systemd-udev depends on sdubby directly.
$ repoquery -q --installed --whatrequires sdubby
systemd-udev-0:255.13-1.fc40.x86_64
$ repoquery -q --installed --whatrequires grubby
akmods-0:0.5.8-8.fc40.noarch
systemd-udev-0:255.13-1.fc40.x86_64
xorg-x11-drv-nvidia-3:560.35.03-5.fc40.x86_64
Don't know why though.
That's because grubby-dummy does not provide a "dummy" binary called grubby, and what xorg-x11-drv-nvidia-3:560.35.03-5.fc40.x86_64 requires is actually /usr/sbin/grubby. (The old 550 version does not require this.) Both grubby and sdubby provide /usr/sbin/grubby.
$ sudo dnf repoquery --requires xorg-x11-drv-nvidia-3:560.35.03-5.fc40.x86_64
/bin/sh
/usr/bin/sh
/usr/sbin/grubby
...
$ sudo dnf repoquery --whatprovides /usr/sbin/grubby
grubby-0:8.40-75.fc40.x86_64
sdubby-0:1.0-11.fc40.noarch
sdubby-0:1.0-8.fc40.noarch
$ sudo dnf repoquery --provides grubby
grubby = 8.40-75.fc40
grubby(x86-64) = 8.40-75.fc40 # provides executable binary
$ sudo dnf repoquery --provides grubby-dummy
grubby = 1000:9.0.0
grubby-dummy = 9.0.0-4.fc40 # no binary here
Can you work around this by uninstalling grubby-dummy? This uninstall should be harmless.
If the package requires /usr/sbin/grubby specifically, not a package named grubby, it won't work. I'd need to check what xorg-x11-drv-nvidia is trying to do with it, but I guess providing a dummy binary (a symlink to /bin/true?) in the grubby-dummy package should work around the issue.
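To make the distinction concrete: RPM treats a requirement that starts with "/" as a file dependency, which is satisfied only by a package that ships that path or lists it in an explicit Provides:, while any other string is matched against capability names such as `grubby`. The sketch below tags each grubby-related requirement accordingly. The helper names are invented for this example, and the sample input is copied from the repoquery output earlier in the thread.

```shell
#!/usr/bin/env bash
# Classify one RPM requirement string:
# a leading "/" marks a file dependency, anything else a capability name.
requirement_kind() {
    case "$1" in
        /*) echo "file" ;;
        *)  echo "capability" ;;
    esac
}

# Read requirements on stdin (one per line, as printed by
# `dnf repoquery --requires PKG`) and report those that mention grubby,
# as "<kind><TAB><requirement>".
grubby_requirements() {
    local req
    while IFS= read -r req; do
        case "$req" in
            *grubby*) printf '%s\t%s\n' "$(requirement_kind "$req")" "$req" ;;
        esac
    done
}

# Sample input: the first lines of the --requires output for the 560 driver.
printf '%s\n' /bin/sh /usr/bin/sh /usr/sbin/grubby | grubby_requirements
```

On a real system the same filter can be fed with `dnf repoquery --requires xorg-x11-drv-nvidia | grubby_requirements`; for the sample input it prints a single line tagging /usr/sbin/grubby as a file dependency.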
I believe this can be resolved by building an enhanced version of the dummy package (a "Super-Dummy") that explicitly provides the binary paths and capabilities required by the driver. This prevents DNF from trying to pull in conflicting packages during updates.
I have been using this workaround for a little while now without any side effects, and it successfully survived the last Nvidia driver update.
Here is the .spec file to build the package:
cat <<'EOF' > super-grubby.spec
Name: grubby-dummy
Version: 99.0.0
Release: 2%{?dist}
Epoch: 1000
Summary: Super Dummy for Grubby and Sdubby
License: Public Domain
BuildArch: noarch
# Claim to provide the packages
Provides: grubby = %{version}
Provides: sdubby = %{version}
Provides: grubby-dummy = %{version}
# Claim to provide the specific binary paths (Virtual Provision)
Provides: /usr/bin/grubby
Provides: /usr/sbin/grubby
# Block the real packages
Obsoletes: grubby < %{version}
Obsoletes: sdubby < %{version}
%description
Dummy package to satisfy Nvidia driver dependencies for /usr/bin/grubby.
%build
# Nothing to build
%install
# Create only /usr/bin
mkdir -p %{buildroot}/usr/bin
# Create the dummy script
echo '#!/bin/bash' > %{buildroot}/usr/bin/grubby
echo 'echo "Dummy grubby called - doing nothing."' >> %{buildroot}/usr/bin/grubby
echo 'exit 0' >> %{buildroot}/usr/bin/grubby
# Make it executable
chmod +x %{buildroot}/usr/bin/grubby
%files
/usr/bin/grubby
EOF
- Build the Package:
rpmbuild -bb super-grubby.spec
- Install the Enhanced Dummy:
Note: You may need to remove the old grubby-dummy manually first.
sudo rpm -e --nodeps grubby-dummy
Then install the new package (this replaces the existing Qubes dummy and prevents DNF from pulling in the conflicting packages). The file name depends on the Fedora release you built on, here fc43:
sudo dnf install ~/rpmbuild/RPMS/noarch/grubby-dummy-99.0.0-2.fc43.noarch.rpm -y
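Because the release tag in the file name changes with the build host, it may be easier to locate the built package and check its provides before installing it. This is a sketch with hypothetical helper names; it relies only on `find`, GNU `sort -V` and `awk`, and the rpm/dnf calls are left as comments so that pasting the block changes nothing on the system.

```shell
#!/usr/bin/env bash
# Newest grubby-dummy RPM below a directory (version-sorted), or nothing.
newest_dummy_rpm() {
    find "$1" -type f -name 'grubby-dummy-*.noarch.rpm' 2>/dev/null | sort -V | tail -n 1
}

# Succeed if the capability list on stdin contains the given capability.
# Versioned provides such as "grubby = 99.0.0" match on their first word.
provides_capability() {
    awk -v want="$1" '$1 == want { found = 1 } END { exit !found }'
}

# Intended use (commented out on purpose; run by hand once reviewed):
#   rpm_file="$(newest_dummy_rpm "$(rpm --eval '%{_rpmdir}')")"
#   rpm -qp --provides "$rpm_file" | provides_capability /usr/sbin/grubby \
#       && sudo dnf install "$rpm_file"
```

Checking for /usr/sbin/grubby specifically matters here, since that path is what the 560 driver package requires.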