sap_hana_preconfigure: corruption of boot
Ansible Role
sap_hana_preconfigure
OS Family
RHEL
Ansible Controller - Python version
Python 3.13.3
Ansible-core version
ansible [core 2.16.13]
Bug Description
The sap_hana_preconfigure internal logic for the minimum kernel patch level miscalculates when OS Images come directly from ISO (e.g. CCSP certified) or are hardened.
The Ansible Task "Create a list of minimum required package versions to be installed" checks the installed packages against the Ansible Role's internal variables, which hold a list of minimum patch levels (e.g. for the kernel).
If the OS Image contains a later kernel patch, but this does not show as an installed package, the Ansible Role will:
- run an Ansible Task to force install the minimum kernel patch, effectively a kernel downgrade
- run the next Ansible Task which will update all OS Packages, including the kernel, and increase beyond the original patch level
- reboot and GRUB will incorrectly try to boot the wrong kernel
This is because the Ansible Task relies on the command rpm -q --qf "%{NAME}-%{VERSION}-%{RELEASE}\n" kernel within its internal logic. Examples from various OS Images where this fails:
[root@rhel86 ~]# uname -r
4.18.0-372.141.1.el8_6.x86_64
[root@rhel86 ~]# rpm -q --qf "%{NAME}-%{VERSION}-%{RELEASE}\n" kernel
package kernel is not installed
[root@rhel88 ~]# uname -r
4.18.0-477.94.1.el8_8.x86_64
[root@rhel88 ~]# rpm -q --qf "%{NAME}-%{VERSION}-%{RELEASE}\n" kernel
package kernel is not installed
[root@rhel90 ~]# uname -r
5.14.0-70.126.1.el9_0.x86_64
[root@rhel90 ~]# rpm -q --qf "%{NAME}-%{VERSION}-%{RELEASE}\n" kernel
package kernel is not installed
[root@rhel-9-2 ~]# uname -r
5.14.0-284.110.1.el9_2.x86_64
[root@rhel-9-2 ~]# rpm -q --qf "%{NAME}-%{VERSION}-%{RELEASE}\n" kernel
package kernel is not installed
[root@rhel94 ~]# uname -r
5.14.0-427.61.1.el9_4.x86_64
[root@rhel94 ~]# rpm -q --qf "%{NAME}-%{VERSION}-%{RELEASE}\n" kernel
package kernel is not installed
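One possible way to make the lookup robust on such images is to fall back to other sources when the kernel package is absent. The sketch below is an illustration, not the role's code. It assumes these images deliver the kernel through a package such as kernel-core (the outputs above do not confirm which package owns the running kernel), and the function name detect_kernel_version is hypothetical.

```shell
# Hypothetical helper, not part of sap_hana_preconfigure.
# Prints a kernel VERSION-RELEASE string from the first source that works.
detect_kernel_version() {
    local pkg out
    # rpm -q exits non-zero when a package is missing, so test the exit
    # status instead of parsing "package ... is not installed" from stdout.
    for pkg in kernel kernel-core; do   # kernel-core is an assumption
        if out=$(rpm -q --qf '%{VERSION}-%{RELEASE}\n' "$pkg" 2>/dev/null); then
            # Several kernels may be installed; keep the highest one.
            printf '%s\n' "$out" | sort -V | tail -n 1
            return 0
        fi
    done
    # Last resort: the running kernel, minus a trailing ".<arch>" if present.
    uname -r | sed "s/\.$(uname -m)\$//"
}

detect_kernel_version
```

On the RHEL 9.2 system shown above, the last fallback alone would yield 5.14.0-284.110.1.el9_2, which is enough to notice that the minimum is already satisfied.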
Looking more closely at what Ansible executes on the system, using RHEL 9.2 as an example:
[root@rhel-9-2 ~]# uname -r
5.14.0-284.110.1.el9_2.x86_64
[root@rhel-9-2 ~]# rpm -q --qf "%{NAME}-%{VERSION}-%{RELEASE}\n" kernel
package kernel is not installed
[root@rhel-9-2 ~]# (echo "1 kernel-5.14.0-284.25.1.el9_2";rpm -q --qf "%{NAME}-%{VERSION}-%{RELEASE}\n" kernel |
awk '{printf ("2 %s\n", $0)}') |
awk '{gsub ("\\.el", ".0.0"); print}' |
sort -k 2 -k 1 -V
1 kernel-5.14.0-284.25.1.0.09_2
2 package kernel is not installed
[root@rhel-9-2 ~]# (echo "1 kernel-5.14.0-284.25.1.el9_2";rpm -q --qf "%{NAME}-%{VERSION}-%{RELEASE}\n" kernel |
awk '{printf ("2 %s\n", $0)}') |
awk '{gsub ("\\.el", ".0.0"); print}' |
sort -k 2 -k 1 -V |
awk '{gsub ("\\.0\\.0", ".el"); col1=$1; col2=$2; _nf=NF}
$1==2{latestpkg=$2}
END {
if (_nf>2) {
printf ("kernel-5.14.0-284.25.1.el9_2\n")
} else {
if (col1==1) {
printf ("kernel-5.14.0-284.25.1.el9_2\n")
}
}
}'
kernel-5.14.0-284.25.1.el9_2
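The pipeline above can be replayed without rpm by passing the rpm output in as text, which makes the failure visible on any machine. This is a simplified sketch reconstructed from the commands shown above: the function name min_pkg_to_install and its two arguments are hypothetical, the minimum version is handed to awk with -v instead of being written into the awk program, and the step that restores ".el" is left out because it does not affect the result.

```shell
# Hypothetical wrapper around the pipeline shown above.
#   $1: minimum required package, e.g. kernel-5.14.0-284.25.1.el9_2
#   $2: the text that "rpm -q --qf ... kernel" printed
# Prints the minimum package when the logic decides it must be installed.
min_pkg_to_install() {
    (echo "1 $1"; printf '%s\n' "$2" | awk '{printf ("2 %s\n", $0)}') |
        awk '{gsub ("\\.el", ".0.0"); print}' |
        sort -k 2 -k 1 -V |
        awk -v min="$1" '{col1=$1; _nf=NF}
            END {
                # More than two fields on the last line means it is not a
                # package name (here: the "not installed" message).
                if (_nf>2 || col1==1) print min
            }'
}

# Installed kernel is newer than the minimum: nothing is printed.
min_pkg_to_install kernel-5.14.0-284.25.1.el9_2 kernel-5.14.0-284.110.1.el9_2
# rpm reported that the package is missing: the minimum is printed.
min_pkg_to_install kernel-5.14.0-284.25.1.el9_2 "package kernel is not installed"
```

The second call shows the problem: the error message sorts last and has more than two fields, so the logic treats the kernel as missing and selects the minimum version for installation.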
In summary:
Boot 1 of OS Image for RHEL for SAP Solutions 9.2
- 5.14.0-284.110.1.el9_2
Run Ansible Task 1 "Create a list of minimum required package versions to be installed" using variable
- 5.14.0-284.25.1.el9_2
Run Ansible Task 2 "Ensure that the system is updated to the latest patchlevel"
- 5.14.0-284.117.1.el9_2
Below is the abbreviated stdout from Ansible and the matching GRUB entries:
Stage 1
TASK [community.sap_install.sap_hana_preconfigure : Create a list of minimum required package versions to be installed] ***
ok: [rhel-9-2] => (item=['kernel', '5.14.0-284.25.1.el9_2']) =>
pkg:
- kernel
- 5.14.0-284.25.1.el9_2
rc: 0
TASK [community.sap_install.sap_hana_preconfigure : Display the content of the minimum package list variable] *********
ok: [rhel-9-2] =>
__sap_hana_preconfigure_register_minpkglist:
results:
- ansible_loop_var: pkg
stdout: kernel-5.14.0-284.25.1.el9_2
skipped: false
TASK [community.sap_install.sap_hana_preconfigure : Install minimum packages if required] *****************************
results:
- 'Installed: kernel-core-5.14.0-284.25.1.el9_2.x86_64'
- 'Installed: kernel-modules-5.14.0-284.25.1.el9_2.x86_64'
- 'Installed: kernel-5.14.0-284.25.1.el9_2.x86_64'
- 'Installed: kernel-modules-core-5.14.0-284.25.1.el9_2.x86_64'
[root@rhel-9-2 ~]# grubby --info=ALL
index=0
kernel="/boot/vmlinuz-5.14.0-284.110.1.el9_2.x86_64"
args="ro console=tty0 console=ttyS0,115200n8 no_timer_check net.ifnames=0"
root="UUID=d370e124-ea83-46ea-a7ef-67f12dd8bb3c"
initrd="/boot/initramfs-5.14.0-284.110.1.el9_2.x86_64.img"
title="Red Hat Enterprise Linux (5.14.0-284.110.1.el9_2.x86_64) 9.2 (Plow)"
id="9a8aa6d3d32c63426d70ef1043ac48ec-5.14.0-284.110.1.el9_2.x86_64"
index=1
kernel="/boot/vmlinuz-5.14.0-284.25.1.el9_2.x86_64"
args="ro console=tty0 console=ttyS0,115200n8 no_timer_check net.ifnames=0"
root="UUID=d370e124-ea83-46ea-a7ef-67f12dd8bb3c"
initrd="/boot/initramfs-5.14.0-284.25.1.el9_2.x86_64.img"
title="Red Hat Enterprise Linux (5.14.0-284.25.1.el9_2.x86_64) 9.2 (Plow)"
id="b05a6b63e39f418ab21979742c470d27-5.14.0-284.25.1.el9_2.x86_64"
Stage 2
TASK [community.sap_install.sap_hana_preconfigure : Ensure that the system is updated to the latest patchlevel] *******
results:
- 'Installed: kernel-modules-5.14.0-284.117.1.el9_2.x86_64'
- 'Installed: python3-jinja2-2.11.3-5.el9_2.noarch'
- 'Installed: kernel-modules-core-5.14.0-284.117.1.el9_2.x86_64'
- 'Installed: libsoup-2.72.0-8.el9_2.4.x86_64'
- 'Installed: python3-perf-5.14.0-284.117.1.el9_2.x86_64'
- 'Installed: libgcrypt-1.10.0-10.el9_2.1.x86_64'
- 'Installed: webkit2gtk3-jsc-2.48.1-3.el9_2.x86_64'
- 'Installed: kernel-5.14.0-284.117.1.el9_2.x86_64'
- 'Installed: kernel-core-5.14.0-284.117.1.el9_2.x86_64'
- 'Removed: python3-jinja2-2.11.3-4.el9_2.1.noarch'
- 'Removed: webkit2gtk3-jsc-2.46.6-2.el9_2.x86_64'
- 'Removed: libsoup-2.72.0-8.el9_2.3.x86_64'
- 'Removed: python3-perf-5.14.0-284.110.1.el9_2.x86_64'
- 'Removed: libgcrypt-1.10.0-10.el9_2.x86_64'
[root@rhel-9-2 ~]# grubby --info=ALL
index=0
kernel="/boot/vmlinuz-5.14.0-284.110.1.el9_2.x86_64"
args="ro console=tty0 console=ttyS0,115200n8 no_timer_check net.ifnames=0"
root="UUID=d370e124-ea83-46ea-a7ef-67f12dd8bb3c"
initrd="/boot/initramfs-5.14.0-284.110.1.el9_2.x86_64.img"
title="Red Hat Enterprise Linux (5.14.0-284.110.1.el9_2.x86_64) 9.2 (Plow)"
id="9a8aa6d3d32c63426d70ef1043ac48ec-5.14.0-284.110.1.el9_2.x86_64"
index=1
kernel="/boot/vmlinuz-5.14.0-284.117.1.el9_2.x86_64"
args="ro console=tty0 console=ttyS0,115200n8 no_timer_check net.ifnames=0"
root="UUID=d370e124-ea83-46ea-a7ef-67f12dd8bb3c"
initrd="/boot/initramfs-5.14.0-284.117.1.el9_2.x86_64.img"
title="Red Hat Enterprise Linux (5.14.0-284.117.1.el9_2.x86_64) 9.2 (Plow)"
id="b05a6b63e39f418ab21979742c470d27-5.14.0-284.117.1.el9_2.x86_64"
index=2
kernel="/boot/vmlinuz-5.14.0-284.25.1.el9_2.x86_64"
args="ro console=tty0 console=ttyS0,115200n8 no_timer_check net.ifnames=0 $tuned_params"
root="UUID=d370e124-ea83-46ea-a7ef-67f12dd8bb3c"
initrd="/boot/initramfs-5.14.0-284.25.1.el9_2.x86_64.img $tuned_initrd"
title="Red Hat Enterprise Linux (5.14.0-284.25.1.el9_2.x86_64) 9.2 (Plow)"
id="b05a6b63e39f418ab21979742c470d27-5.14.0-284.25.1.el9_2.x86_64"
index=3
kernel="/boot/vmlinuz-0-rescue-b05a6b63e39f418ab21979742c470d27"
args="ro console=tty0 console=ttyS0,115200n8 no_timer_check net.ifnames=0"
root="UUID=d370e124-ea83-46ea-a7ef-67f12dd8bb3c"
initrd="/boot/initramfs-0-rescue-b05a6b63e39f418ab21979742c470d27.img"
title="Red Hat Enterprise Linux (0-rescue-b05a6b63e39f418ab21979742c470d27) 9.2 (Plow)"
id="b05a6b63e39f418ab21979742c470d27-0-rescue"
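To see which entry GRUB would boot after Stage 2, the default kernel can be compared with the newest installed one. The sketch below is an illustration under assumptions: it relies on grubby --default-kernel and on the kernel="..." lines of grubby --info=ALL as shown above, while the helper names newest_kernel_path and check_default_kernel are hypothetical and not part of the role.

```shell
# Hypothetical helpers, not part of sap_hana_preconfigure.

# Reads "grubby --info=ALL" output on stdin and prints the path of the
# highest-versioned kernel, ignoring the rescue entry.
newest_kernel_path() {
    sed -n 's/^kernel="\(.*\)"$/\1/p' |
        grep -v -e '-rescue-' |
        sort -V |
        tail -n 1
}

# Compares the GRUB default with the newest kernel (needs grubby and root).
check_default_kernel() {
    local default newest
    default=$(grubby --default-kernel) || return 2
    newest=$(grubby --info=ALL | newest_kernel_path)
    if [ "$default" = "$newest" ]; then
        echo "default is newest: $default"
    else
        echo "default is $default but newest is $newest" >&2
        return 1
    fi
}
```

Fed with the Stage 2 listing above, newest_kernel_path selects the 5.14.0-284.117.1.el9_2 entry. If the default turns out to be a different entry, grubby --set-default with the desired kernel path is one way to correct it.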
Bug reproduction
Install from ISO or from a Cloud IaaS provider image; RHEL 9.2 is a good test case for this issue.
Community participation
Unfortunately I am not in a position to help with the bug fix
Do we want to support RHEL systems on which no package named kernel is installed, or can we use the current behavior of the role to detect such systems? If yes, is anyone aware of documentation (and can provide a link) which confirms that RHEL systems without the package kernel are fully supported in SAP environments?
Is there a reason why rpm -q --qf "%{NAME}-%{VERSION}-%{RELEASE}\n" kernel is preferred to the currently running kernel using uname -r ?
I think there was no specific reason but this code turned out to be fulfilling the requirements and was tested extensively. Before changing this code, which triggers additional testing, we need to find out if it is necessary to change this code. There is also an alternative solution available for SLES (which I would prefer) but again, let's first be sure that changing the code is really necessary. Worst case would be that we change the code and afterwards get informed that a RHEL for SAP system without the package kernel is unsupported.
Well, there are 2 parts to this issue:
- The current logic is good enough, but an additional check should run to make sure that we are not accidentally downgrading the kernel. It could be as simple as comparing against uname -r or the Ansible Facts ansible_kernel value, and triggering an error when the code proposes installing the minimum kernel version but the running kernel is already higher. This avoids the accidental kernel downgrade.
- Perhaps the update * logic should be altered so that we do not upgrade the kernel? Something like a dry-run of update *, parsing the list of packages to update, and removing kernel from that list?
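Both ideas can be sketched in a few lines of shell. This is only an illustration of the proposal, not code from the role: the function names version_gt, guard_kernel_downgrade and drop_kernel_packages are hypothetical, and the comparison assumes that both versions carry the same dist tag (such as el9_2) so that plain sort -V ordering is sufficient.

```shell
# Hypothetical guard, not part of sap_hana_preconfigure.

# True when version $1 is strictly newer than version $2 (sort -V order).
version_gt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$1" ]
}

# $1: minimum kernel VERSION-RELEASE, e.g. 5.14.0-284.25.1.el9_2
# $2: running kernel; defaults to uname -r minus a trailing ".<arch>"
guard_kernel_downgrade() {
    local minimum running
    minimum=$1
    running=${2:-$(uname -r | sed "s/\.$(uname -m)\$//")}
    if version_gt "$running" "$minimum"; then
        echo "ERROR: running kernel $running is newer than minimum $minimum;" \
             "refusing to install the older kernel" >&2
        return 1
    fi
    echo "OK: running kernel $running does not exceed minimum $minimum"
}

# Reads package names (one per line) and drops kernel and kernel-* entries,
# mirroring the idea of removing kernel from the list of updates.
drop_kernel_packages() {
    grep -v -E '^kernel(-|$)'
}
```

With the values from this report, guard_kernel_downgrade 5.14.0-284.25.1.el9_2 5.14.0-284.110.1.el9_2 returns non-zero, so the downgrade would be stopped before the install task runs. For the second idea, dnf also accepts an --exclude option (for example --exclude='kernel*'), which may be simpler than filtering a dry-run list by hand.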