On dnf5 systems (e.g. Fedora 41) python3-rpm required + error message not descriptive
If one installs linux-system-roles and then attempts to use linux-system-roles.network in a playbook, the play fails with the following error:
TASK [linux-system-roles.network : Check which packages are installed] *********
fatal: [localhost]: FAILED! => {"censored": "the output has been hidden due to
the fact that 'no_log: true' was specified for this result", "changed": false}
This bug was introduced in Fedora 41 and is probably related to the change to the dnf5 package manager in 41; linux-system-roles.network worked as expected immediately after install on Fedora 40.
To reproduce on a Fedora 41 machine, see attached minimal example playbook:
lsr.network_needs_python3-rpm.yml.txt
Specifically, the "no_log" value causing the "censored" error message is set in the following file:
/usr/share/ansible/roles/linux-system-roles.network/tasks/set_facts.yml
Commenting out "no_log: true" on the last line of that file (under "Check which packages are installed") changes the error message to a more descriptive one:
TASK [linux-system-roles.network : Check which packages are installed] *********
[WARNING]: Found "rpm" but Failed to import the required Python library (rpm) on
tdc-testvm's Python /usr/bin/python3. Please read the module documentation and
install it in the appropriate location. If the required library is installed,
but Ansible is using the wrong Python interpreter, please consult the
documentation on ansible_python_interpreter
If you install python3-rpm manually via:
dnf install python3-rpm
The playbook then executes normally and all tasks succeed.
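The same workaround can be sketched as a playbook pre-task, so that remote targets get the package too. This is only an illustration: the play name, the hosts pattern, and the use of ansible.builtin.package are assumptions, not part of the attached example. It also assumes the package module itself can run on the target without python3-rpm.

```yaml
# Hypothetical playbook: install python3-rpm before the network role runs.
- name: Apply the network role on a dnf5 system (illustrative)
  hosts: all
  become: true
  pre_tasks:
    # Assumption: the generic package module works on the target even
    # though the rpm Python bindings are not installed yet.
    - name: Ensure python3-rpm is present for package_facts
      ansible.builtin.package:
        name: python3-rpm
        state: present
  roles:
    - linux-system-roles.network
```

Because pre_tasks run before roles, the package would be in place by the time the role reaches "Check which packages are installed".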
Note that I'm filing this here instead of with Red Hat / Fedora because any remote target hosts also need that package, so it should probably be auto-installed by linux-system-roles.network when required.
This looks like https://github.com/ansible/ansible/issues/84206
I'm talking with the Ansible team to see if adding auto_install_module_deps is appropriate, or if there is some other workaround. Note that this change would affect all code that uses package_facts, not just the network system role, so I'm trying to find a way to address this issue appropriately given the scope of the problem.
@felixhowe hm - on the other hand - fatal: [localhost]: - this is probably https://access.redhat.com/solutions/6726561 - in which case you'll need to use one of the workarounds:
Choose one of the options below to work around the issue:
- Create an inventory file that lists localhost with the ansible_connection=local option. For example, an inventory file with:
    localhost ansible_connection=local
  Run ansible-playbook and specify that this inventory file should be used:
    ansible-playbook -i inventory <playbook>
- Create an inventory file that lists localhost. Note that this will result in ansible-playbook connecting to the localhost over SSH with SSH key authentication, which must have previously been configured. For example, an inventory file with:
    localhost
  Run ansible-playbook and specify that this inventory file should be used:
    ansible-playbook -i inventory <playbook>
- Use implicit localhost, with the ansible_python_interpreter variable set to use platform-python. For example:
    ansible-playbook <playbook> -e 'ansible_python_interpreter=/usr/libexec/platform-python'
This looks like https://github.com/ansible/ansible/issues/84206
I think the linked issue is already fixed in the latest Fedora - I was not able to reproduce that error; the following worked:
(new, clean Fedora 41 minimal install)
dnf install ansible
cat > test.yml
- name: see whether python3-libdnf5 issue still present
  hosts: localhost
  tasks:
    - name: update all packages
      ansible.builtin.dnf:
        name: '*'
        state: latest
<ctrl+d>
ansible-playbook test.yml
and indeed, after running the above, the python3-libdnf5 package was present:
# dnf list --installed | grep python3-libdnf5
python3-libdnf5.x86_64 5.2.10.0-2.fc41 updates
on the other hand - fatal: [localhost]: - this is probably https://access.redhat.com/solutions/6726561 -
I don't think so - I tested the example playbook both on localhost and a remote one and got exactly the same behaviour on both, i.e. on the local host, installing python3-rpm also immediately fixes the issue.
I haven't tried this on RHEL yet though, only Fedora. Will try that this evening.
ok - thanks - we didn't see this in our testing because the images provided by our testing framework (https://packit.dev/docs/configuration/upstream/tests) have python3-rpm pre-installed - will need to figure out the best way to fix, and then how to create a regression test for this
@felixhowe on a minimal system - do you have any of these installed? python3-dnf python3-tracer python3-libdnf5 ?
The reason I'm asking - I'm writing a test to do dnf -y remove python3-rpm before the network role - this is what happens:
Package Arch Version Repository Size
Removing:
python3-rpm x86_64 4.20.0-1.fc41 281c6d6fc60143e1aaaaa9f7f140cd6d 175.3 KiB
Removing dependent packages:
python3-dnf noarch 4.22.0-2.fc41 b941836037824982ac2fb4a9202c2f17 2.6 MiB
python3-dnf-plugin-tracer noarch 4.1.2-3.fc41 281c6d6fc60143e1aaaaa9f7f140cd6d 8.8 KiB
Removing unused dependencies:
dnf-data noarch 4.22.0-2.fc41 b941836037824982ac2fb4a9202c2f17 38.6 KiB
hiredis x86_64 1.2.0-3.fc41 281c6d6fc60143e1aaaaa9f7f140cd6d 110.1 KiB
ima-evm-utils-libs x86_64 1.6.2-2.fc41 281c6d6fc60143e1aaaaa9f7f140cd6d 60.8 KiB
libcomps x86_64 0.1.21-4.fc41 b941836037824982ac2fb4a9202c2f17 206.2 KiB
libdnf x86_64 0.73.4-2.fc41 b941836037824982ac2fb4a9202c2f17 2.1 MiB
libfsverity x86_64 1.6-1.fc41 281c6d6fc60143e1aaaaa9f7f140cd6d 32.6 KiB
python3-dbus x86_64 1.3.2-8.fc41 281c6d6fc60143e1aaaaa9f7f140cd6d 520.7 KiB
python3-distro noarch 1.9.0-5.fc41 281c6d6fc60143e1aaaaa9f7f140cd6d 198.7 KiB
python3-dnf-plugins-extras-common noarch 4.1.2-3.fc41 281c6d6fc60143e1aaaaa9f7f140cd6d 96.8 KiB
python3-hawkey x86_64 0.73.4-2.fc41 b941836037824982ac2fb4a9202c2f17 297.4 KiB
python3-libcomps x86_64 0.1.21-4.fc41 b941836037824982ac2fb4a9202c2f17 140.8 KiB
python3-libdnf x86_64 0.73.4-2.fc41 b941836037824982ac2fb4a9202c2f17 3.8 MiB
python3-libdnf5 x86_64 5.2.10.0-2.fc41 b941836037824982ac2fb4a9202c2f17 8.1 MiB
python3-psutil x86_64 5.9.8-4.fc41 281c6d6fc60143e1aaaaa9f7f140cd6d 1.4 MiB
python3-six noarch 1.16.0-23.fc41 281c6d6fc60143e1aaaaa9f7f140cd6d 118.3 KiB
python3-tracer noarch 1.2-1.fc41 b941836037824982ac2fb4a9202c2f17 406.7 KiB
python3-unbound x86_64 1.22.0-14.fc41 b941836037824982ac2fb4a9202c2f17 522.9 KiB
rpm-plugin-systemd-inhibit x86_64 4.20.0-1.fc41 281c6d6fc60143e1aaaaa9f7f140cd6d 16.3 KiB
rpm-sign-libs x86_64 4.20.0-1.fc41 281c6d6fc60143e1aaaaa9f7f140cd6d 39.4 KiB
tracer-common noarch 1.2-1.fc41 b941836037824982ac2fb4a9202c2f17 33.5 KiB
unbound-anchor x86_64 1.22.0-14.fc41 b941836037824982ac2fb4a9202c2f17 57.5 KiB
unbound-libs x86_64 1.22.0-14.fc41 b941836037824982ac2fb4a9202c2f17 1.4 MiB
Transaction Summary:
Removing: 25 packages
That is an awful lot of dependencies (including nested/indirect) on python3-rpm . . . so I'm wondering if the process of creating a minimal system removes all of these? Or somehow removes python3-rpm and some others, and somehow leaves some of these?
dnf5 in Fedora is currently a bit... overly aggressive (I would argue to the point of being broken sometimes) with auto-removals. I often have to do --no-autoremove just to avoid having it remove half the system :)
There are no removals involved in the minimal install, however; see bottom for full explanation.
That said, yes, almost all of those packages are indeed absent from the minimal system's default state. Here's the full list of what's present/absent:
dnf-data absent
hiredis absent
ima-evm-utils-libs absent
libcomps absent
libdnf absent
libfsverity absent
python3-dbus present
python3-distro absent
python3-dnf absent
python3-dnf-plugin-tracer absent
python3-dnf-plugins-extras-common absent
python3-hawkey absent
python3-libcomps absent
python3-libdnf absent
python3-libdnf5 absent
python3-psutil absent
python3-six absent
python3-tracer absent
python3-unbound absent
rpm-plugin-systemd-inhibit absent
rpm-sign-libs absent
tracer-common absent
unbound-anchor absent
unbound-libs absent
(How I got this: copied the package name lines from your "would remove" output, put them in checkfor.txt on the minimal system, did s/ .*$// to it, then dnf install $(cat checkfor.txt), copied the output from that & compared, then marked matches as "absent" and things not listed as "present".)
Interestingly, I deliberately left python3-rpm out of checkfor.txt and one of the other things pulled it in as a dependency. Specifically, it looks like dnf install python3-dnf will (now?) result in python3-rpm also getting installed. (I'm not sure if that's helpful or not - only if by coincidence installing python3-dnf happens to fix some other issue and makes this one simpler, I guess.)
Lots more detail about what I mean by "minimal system":
The minimal system template I use is the result of using the current Fedora 41 netinstall ISO with the attached anaconda config (ks.cfg.gz - note that I've redacted some things, and gzipped it because apparently .cfg is somehow a "dangerous file type" in the Microsoft world), so there are no package removals involved in creating the template. Specifically, the creation process goes like this:
- Create a VM with two DVD drives
- Set the first one to the Fedora 41 Server ISO (Fedora-Server-netinst-x86_64-41-1.4.iso last time I regenerated it)
- Set the second one to an ISO containing only the ks.cfg file in its top level directory
- On boot, hit e for edit and add the option "inst.ks=cdrom" to the kernel options line
- Wait for the non-interactive install to complete, then shut down the VM and its disk becomes the new template.
(There is also a PXE version of the process that's fully automated, but the above version is easier to describe and they produce the same result.)
In theory, you would get exactly the same package selections if you:
- Create a blank VM with one DVD drive
- Attach the Fedora 41 ISO
- In the "Software Selection" step, choose only "Fedora Custom Operating System"
When an image is deployed from the template, it then gets configured by Ansible (either by remote, or by installing ansible on the new machine itself and pulling the relevant playbooks for use in localhost mode, depending where it's going).
In case it's useful, I've attached two outputs of dnf list --installed - one from the template's initial state (which lags behind what you'd get with a brand new install, because it only gets regenerated every month or so) and one after a dnf upgrade.
@felixhowe This is excellent - thanks! I would say though that this is an Ansible problem, not a system roles problem, since this issue affects every Ansible user wanting to use package_facts or package on a minimal dnf5 system. Ansible should add python3-rpm and the other packages to their documentation of the minimal required software on a managed node. There has to be some minimum list of packages required to be installed (and configured) on managed nodes in order for Ansible to operate. For example, sshd must be installed and configured to allow the Ansible user to use ssh public key auth - sudo must be installed and configured to allow become access - etc. If python3-rpm is not listed as such in the Ansible documentation for managed node provisioning, that seems like an Ansible issue.
That being said - we can have the system roles install the missing packages, but it doesn't seem like a function of system roles to install missing Ansible dependencies, unless the issue only affects system roles.
@richm, I've had some time to try this out on a few more systems/distros, and I now mostly agree (though maybe conditionally - see very bottom :) ) -
I would say though that this is an Ansible problem, not a system roles problem, since this issue affects every Ansible user wanting to use package_facts or package
I didn't realise until exploring this further that I/we were just "getting lucky" on older Red Hat & Fedora systems, in that package_facts is not expected to automatically install its own dependencies on targeted systems (and the documentation even implies that it won't, but could be more complete about this). It just happened that on older dnf/yum, python3-rpm was considered a dependency of dnf itself, so it was always there - even with my minimal template, on the Fedora 40 version it's there "out of the box" and difficult/impossible to remove:
# dnf remove python3-rpm
Error:
Problem: The operation would result in removing the following protected packages: dnf
If python3-rpm is not listed as such in the Ansible documentation for managed node provisioning, that seems like an Ansible issue.
Agreed - I think this "bug" becomes a documentation issue; this page:
https://docs.ansible.com/ansible/latest/collections/ansible/builtin/package_facts_module.html
currently has the following under the comments for the manager parameter:
Choices:
"apk": Alpine Linux package manager
"apt": For DEB based distros, python-apt package must be installed on targeted hosts
"auto" (default): Depending on strategy, will match the first or all package managers provided, in order
"dnf": Alias to rpm
(emphasis on apt line mine)
Ansible upstream needs to decide whether they want to include a similar note for python3-rpm for all the dnf/rpm-based distros, or whether they want to bug the dnf maintainers about having python3-rpm be considered a hard dependency of dnf again :)
However, regarding:
we can have the system roles install the missing packages, but it doesn't seem like a function of system roles to install missing Ansible dependencies, unless the issue only affects system roles.
one of the factors that led me to report this here first was that system roles does have precedent for installing things as needed; I took my cue from how it behaves when things like linux-system-roles.selinux are used for the first time on a new host - namely, there are tasks called "Install SELinux python3 tools" and "Install SELinux tool semanage" in that role.
So if upstream does decide to change the documentation to say "python3-rpm package must be installed on targeted hosts", would linux-system-roles respond to that by adding an "Install python3-rpm" task ahead of anything that uses package_facts? Or add python3-rpm as a dependency of its package? Or add a documentation note to the user, that the user would have to find the first time they use a module that needs it? (I haven't run into any other cases yet where linux-system-roles just errors out in a way that needs to be investigated.)
I don't know what the "right" answer is; this has just got me thinking about whether there's a consistent set of conventions around "just quietly do what's needed to accomplish the task" versus "modify the target systems as little as possible" - and where such a philosophy question should best be defined (I don't know that, either, and have several competing opinions just in my own head) :)
For completeness, a couple more/updated minimal test cases:
@richm, I've had some time to try this out on a few more systems/distros, and I now mostly agree (though maybe conditionally - see very bottom :) ) -
I would say though that this is an Ansible problem, not a system roles problem, since this issue affects every Ansible user wanting to use package_facts or package
I didn't realise until exploring this further that I/we were just "getting lucky" on older Red Hat & Fedora systems, in that package_facts is not expected to automatically install its own dependencies on targeted systems (and the documentation even implies that it won't, but could be more complete about this). It just happened that on older dnf/yum, python3-rpm was considered a dependency of dnf itself, so it was always there - even with my minimal template, on the Fedora 40 version it's there "out of the box" and difficult/impossible to remove:
# dnf remove python3-rpm
Error:
Problem: The operation would result in removing the following protected packages: dnf
If python3-rpm is not listed as such in the Ansible documentation for managed node provisioning, that seems like an Ansible issue.
Agreed - I think this "bug" becomes a documentation issue; this page:
https://docs.ansible.com/ansible/latest/collections/ansible/builtin/package_facts_module.html
currently has the following under the comments for the manager parameter:
Choices:
"apk": Alpine Linux package manager
"apt": For DEB based distros, python-apt package must be installed on targeted hosts
"auto" (default): Depending on strategy, will match the first or all package managers provided, in order
"dnf": Alias to rpm
(emphasis on apt line mine)
Ansible upstream needs to decide whether they want to include a similar note for python3-rpm for all the dnf/rpm-based distros, or whether they want to bug the dnf maintainers about having python3-rpm be considered a hard dependency of dnf again :)
Yes. I think they should change the docs to say that python3-rpm is a hard requirement for using package_facts on dnf5 systems.
However, regarding:
we can have the system roles install the missing packages, but it doesn't seem like a function of system roles to install missing Ansible dependencies, unless the issue only affects system roles.
one of the factors that led me to report this here first was that system roles does have precedent for installing things as needed; I took my cue from how it behaves when things like linux-system-roles.selinux are used for the first time on a new host - namely, there are tasks called "Install SELinux python3 tools" and "Install SELinux tool semanage" in that role.
That's different because we aren't trying to work around a missing dependency in Ansible itself - those tools are only needed for specific use cases of the selinux system role. Ansible itself does not need the python selinux libraries - they refactored (quite some time ago) the core code in the file and related modules to use the C libraries (which are virtually always present even on minimal systems) via the python-to-C bindings to manage SELinux policy on specified files/directories.
So if upstream does decide to change the documentation to say "python3-rpm package must be installed on targeted hosts", would linux-system-roles respond to that by adding a "Install python3-rpm" task ahead of anything that uses package_facts? Or add python3-rpm as a dependency of its package? Or add a documentation note to the user, that the user would have to find the first time they use a module that needs it? (I haven't run into any other cases yet where linux-system-roles just errors out in a way that needs to be investigated.)
I suppose you could make an argument that, since system roles provide a role, and not a module, as their public API, it is the responsibility of the role to ensure that all of the dependencies are present. The user of the network role should not have to know that the role uses the package_facts module on a dnf5 system, and that the python3-rpm library must therefore be present on the managed nodes. And if we have to add this to the README in such a way that it makes it unambiguously clear, it is just a small step to adding this to the role code.
It just sticks in my craw that all role developers, not just linux-system-roles, will have to figure out how to make this change to any role that uses package_facts and wants to manage dnf5 systems - that this is an additional burden placed on Ansible code authors because of a shortcoming in Ansible.
In the case of system roles, for the sake of consistency and future-proofing even for the roles that do not currently use package_facts, we will probably have to make this change to all of the system roles.
OTOH, I don't know of a better place to do this. Requiring all playbook authors/users to do this is a much larger burden than changing a couple of hundred roles.
IMO the best place to do this would be at image provisioning time - just as you know you need to have sshd and a handful of other dependencies installed on any image you want to manage with Ansible, this would be one more package. But
- the person doing the provisioning might not be the person doing the managing
- you might have to use "stock" images and have no way to provision, and just rely on the fact that sshd and cloud-init are virtually ubiquitous
- you still need some logic somewhere to determine if the image is for a dnf5-using system, and to install python3-rpm if so - and roles are a great way to encapsulate platform logic such as that
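As a rough sketch of what encapsulating that platform logic in a role might look like (not a decided design): the install task and its condition are illustrative assumptions, in particular that the pkg_mgr fact reports "dnf5" on these systems and that facts have already been gathered. Only the "Check which packages are installed" task name and its no_log setting come from the role as described earlier in this thread.

```yaml
# Hypothetical role tasks; assumes facts were gathered earlier in the play.
- name: Ensure python3-rpm is present on dnf5-managed hosts
  ansible.builtin.package:
    name: python3-rpm
    state: present
  # Assumption: the pkg_mgr fact is "dnf5" on systems such as Fedora 41.
  when: ansible_facts['pkg_mgr'] == 'dnf5'

- name: Check which packages are installed
  ansible.builtin.package_facts:
  no_log: true
```

Keying on the package manager fact rather than on a distribution name or version would let the same task cover other distributions if they move to dnf5, but whether that fact is reliable enough would need to be verified.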
I don't know what the "right" answer is; this has just got me thinking about whether there's a consistent set of conventions around "just quietly do what's needed to accomplish the task" versus "modify the target systems as little as possible" - and where such a philosophy question should best be defined (I don't know that, either, and have several competing opinions just in my own head) :)
I don't either - I presented a few options above, but I'm sure there are more. We're going to have to have a discussion among our team and other teams that develop Ansible roles and figure this out.
It just sticks in my craw that all role developers, not just linux-system-roles, will have to figure out how to make this change to any role that uses package_facts and wants to manage dnf5 systems - that this is an additional burden placed on Ansible code authors because of a shortcoming in Ansible.
Yes; I pretty much feel the same way about this. I suspect what will happen is this:
Red Hat will eventually move their main Enterprise distribution to dnf5. At that time, they'll run into this themselves en masse, and python3-rpm will become a dependency of dnf, and the problem will go away.
...but that will take a while, and in the meantime, all of us will have to put in "temporary" (in the "IT temporary" meaning of the word :P) workarounds, like:
In the case of system roles, for the sake of consistency and future-proofing even for the roles that do not currently use package_facts, we will probably have to make this change to all of the system roles.
And I'm likely to need to do that with a lot of our internal roles as well, peppering them with TODOs for "remove this when the situation changes". And many of those workarounds will be forgotten about by the time the situation has changed, and probably won't be cleaned up until some distant future additional change breaks something unrelated but proximate in the code ;)
Thanks for all your attention to this - honestly, I'd probably have settled for a removal of all the occurrences of no_log: true and being told to figure it out myself :P