RHEL 9 support

Open Obihoernchen opened this issue 1 year ago • 16 comments

This issue tracks the current status of Enterprise Linux 9 support in xCAT. It covers RHEL9 and all EL9 distros such as AlmaLinux and Rocky Linux.

Known issues

  • [x] https://github.com/xcat2/xcat-core/pull/7444
    • [x] https://github.com/xcat2/xcat-core/issues/7453
  • [x] https://github.com/xcat2/xcat-core/pull/7447
  • [ ] initscripts dependency missing. See: https://github.com/xcat2/xcat-core/issues/7343
    • [ ] https://github.com/xcat2/xcat-core/issues/7341
  • [x] https://github.com/xcat2/xcat-core/pull/7419
  • [x] https://github.com/xcat2/xcat-core/pull/7420

Feel free to report additional issues.

Obihoernchen avatar Jan 29 '24 14:01 Obihoernchen

Hi @Obihoernchen

I know this is about RHEL9, but I would like to add that EL9-based distros are also having issues. genimage doesn't work on Rocky 9 (and I assume it doesn't work for OL and Alma either).

What I did to fix it: I created symbolic links to the following files, named to follow the naming pattern of each distribution, in the respective distro folders (/opt/xcat/share/xcat/netboot/<distro>, /opt/xcat/share/xcat/install/<distro>).

/opt/xcat/share/xcat/netboot/rh/compute.rhels9.x86_64.pkglist
/opt/xcat/share/xcat/netboot/rh/compute.rhels9.x86_64.postinstall
/opt/xcat/share/xcat/netboot/rh/service.rhels9.x86_64.exlist
/opt/xcat/share/xcat/netboot/rh/service.rhels9.x86_64.otherpkgs.pkglist
/opt/xcat/share/xcat/netboot/rh/service.rhels9.x86_64.pkglist
/opt/xcat/share/xcat/netboot/rh/service.rhels9.x86_64.postinstall
/opt/xcat/share/xcat/install/rh/compute.rhels9.pkglist
/opt/xcat/share/xcat/install/rh/compute.rhels9.tmpl
/opt/xcat/share/xcat/install/rh/service.rhels9.pkglist
/opt/xcat/share/xcat/install/rh/service.rhels9.tmpl
/opt/xcat/share/xcat/install/rh/service.rhels9.x86_64.otherpkgs.pkglist
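
The linking step described above could be scripted roughly as follows. This is only a sketch: the distro directory name ("rocky") and the file-name tag ("rocky9") are assumptions about the naming pattern and should be checked against what genimage actually looks for on your system. The helper names plan_links and create_links are made up for illustration.

```python
from pathlib import Path

RH_TAG = "rhels9"  # tag used in the rh/ file names listed above


def plan_links(rh_files, distro, tag):
    """Return (link_path, target_path) pairs, one per rhels9 file.

    distro: sibling directory name, e.g. "rocky" (assumption)
    tag:    replacement for "rhels9" in the file name, e.g. "rocky9" (assumption)
    """
    plan = []
    for name in rh_files:
        target = Path(name)
        # .../netboot/rh/compute.rhels9.x86_64.pkglist
        #   -> .../netboot/<distro>/compute.<tag>.x86_64.pkglist
        link_dir = target.parent.parent / distro
        link_name = target.name.replace(RH_TAG, tag, 1)
        plan.append((link_dir / link_name, target))
    return plan


def create_links(plan):
    """Create the planned symlinks, skipping any path that already exists."""
    for link, target in plan:
        link.parent.mkdir(parents=True, exist_ok=True)
        if not (link.exists() or link.is_symlink()):
            link.symlink_to(target)


if __name__ == "__main__":
    files = [
        "/opt/xcat/share/xcat/netboot/rh/compute.rhels9.x86_64.pkglist",
        "/opt/xcat/share/xcat/install/rh/compute.rhels9.tmpl",
    ]
    # Dry run: print what would be linked without touching the filesystem.
    for link, target in plan_links(files, "rocky", "rocky9"):
        print(f"{link} -> {target}")
```

Printing the plan first makes it easy to review the names before calling create_links as root.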

lbgracioso avatar Feb 08 '24 11:02 lbgracioso

@lbgracioso that's fine. This issue is about all EL9 distros. I'll clarify this.

Good catch. I'll create the missing links. Usually I use rhels9 for all EL distros, that's why I didn't notice :D

Obihoernchen avatar Feb 08 '24 11:02 Obihoernchen

ibpostscripts.tar.gz

This contains two bash scripts that we're using on RHEL9.2:

  1. ipoib sets the IP address for the first IB interface with link. It presumes the boot net is 172.20.0.0/16 and IPoIB is 172.25.0.0/16.
  2. ibpkey looks for a configured IB partition PKEY # 1 and creates a subinterface, requesting an IP from the DHCP server on the master node. We have MOFED installed; the only real dependency is the use of ibdev2netdev.
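
The scripts in the tarball are bash and are not reproduced in this thread. As a rough illustration of the address mapping the first script implies, here is a sketch that assumes the IPoIB address keeps the host part of the boot address and only swaps the network (172.20.x.y becomes 172.25.x.y). That mapping rule and the function name ipoib_address are assumptions, not taken from the scripts.

```python
import ipaddress

# Networks as stated in the comment above.
BOOT_NET = ipaddress.ip_network("172.20.0.0/16")
IPOIB_NET = ipaddress.ip_network("172.25.0.0/16")


def ipoib_address(boot_ip):
    """Map a boot-network address to an IPoIB address with the same host part.

    Assumption: the host bits are carried over unchanged.
    """
    addr = ipaddress.ip_address(boot_ip)
    if addr not in BOOT_NET:
        raise ValueError(f"{boot_ip} is not in boot network {BOOT_NET}")
    host_bits = int(addr) - int(BOOT_NET.network_address)
    ipoib = IPOIB_NET.network_address + host_bits
    return f"{ipoib}/{IPOIB_NET.prefixlen}"


if __name__ == "__main__":
    print(ipoib_address("172.20.3.17"))  # 172.25.3.17/16
```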

ddj-brown avatar Feb 24 '24 19:02 ddj-brown

@Obihoernchen ~As per the GitHub workflow configuration, the PR test checks the changes in the Ubuntu 20.04 environment, which may not catch issues in RHEL and derivative environments. Can the PR testing be extended to also include one of the RHEL or derivative environments?~ Never mind. Only Ubuntu is available as a runner as per the docs, unless the runners are self-hosted.

samveen avatar Feb 26 '24 13:02 samveen

Yes, unfortunately :( But the testing in the background is much bigger and already covers all the different OSs. It also includes many more tests with real OS deployment etc.

Obihoernchen avatar Feb 26 '24 13:02 Obihoernchen

Is this change only going to be for RHEL 9.x+ or earlier versions as well?

ZAM- avatar Mar 19 '24 17:03 ZAM-

@ZAM- The RHEL/CentOS 7 family and the RHEL/CentOS/Rocky 8 family are already very well supported. This ticket is to track extending that support to include RHEL9 and its derivatives.

samveen avatar Mar 20 '24 09:03 samveen

@Obihoernchen Hi Markus, one question we have, and I think @ZAM- was alluding to it, is how we approach fixes related to networking. By that I mean that a lot of the tools in xCAT were written to use the now deprecated ifcfg commands and directory structure, e.g. ifcfg files vs. keyfiles. My assumption is that we're not replacing the ifcfg nomenclature but rather adding the keyfile nomenclature alongside it. I also assume that we would check whether the host is RHEL9. If you have already addressed this somewhere, do you have an example?

rlcto avatar Mar 21 '24 17:03 rlcto

@rlcto @ZAM- Yes, xCAT often uses nmcli already, but some parts (for instance nics.nicextraparams and, I think, IPv6) rely on ifcfg files. In my opinion it would be best to use nmcli for everything and not rely on files at all.

To keep it simple, I guess the current logic for EL<8 can stay as is; just add a special case for EL>=9 with full nmcli support if needed. So far 90% is working with the current code already (on EL9); just some minor parts are not using nmcli yet, or use nmcli but still rely on ifcfg files, too.

OS detection logic is included in the network scripts already. For instance: https://github.com/xcat2/xcat-core/blob/master/xCAT/postscripts/configeth#L706

Perhaps it is also easier to simply recreate the logic currently applied to the ifcfg files for the new keyfiles. But I would assume using nmcli should be easier.
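
For readers who want a feel for the "special case for EL>=9" check, here is a minimal sketch that decides it from the contents of /etc/os-release. It is not the detection logic used in configeth (that is bash and lives at the link above); the function names and the rule "ID or ID_LIKE mentions rhel" are assumptions, and distros that do not list rhel in ID_LIKE would need an extra entry.

```python
def parse_os_release(text):
    """Parse os-release style KEY=value lines into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        info[key] = value.strip().strip('"').strip("'")
    return info


def is_el9_or_newer(info):
    """True if the host looks like an EL distro with major version >= 9.

    Assumption: EL distros report "rhel" in ID or ID_LIKE.
    """
    ids = {info.get("ID", "")} | set(info.get("ID_LIKE", "").split())
    if "rhel" not in ids:
        return False
    try:
        major = int(info.get("VERSION_ID", "").split(".")[0])
    except ValueError:
        return False
    return major >= 9


if __name__ == "__main__":
    with open("/etc/os-release") as fh:
        print(is_el9_or_newer(parse_os_release(fh.read())))
```

A network script could use such a check to skip writing ifcfg files and take an nmcli-only path on EL9+.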

Obihoernchen avatar Mar 21 '24 18:03 Obihoernchen

@Obihoernchen in what you're envisioning as far as using nmcli for EL9, would the idea be that the user-provided settings would follow the current syntax/setting names/etc., or would the idea be that the user configs would directly specify nmcli arguments?

alexrichert avatar Apr 09 '24 23:04 alexrichert

@alexrichert In my opinion everything in the nics table should stay as is. 90% of it is using nmcli already. But there are some exceptions. For instance nics.nicextraparams relies on ifcfg files. For this nics column, users should just be able to input any nmcli connection key-value pair into the field, and the network scripts should just pass this 1:1 to nmcli, I guess. Furthermore, even when using nmcli the scripts still write ifcfg files. This shouldn't happen on EL9+.

There might be other instances not using nmcli yet though. For now I only know about nics.nicextraparams.
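
A rough sketch of the 1:1 pass-through idea: take the extra parameters for one NIC as whitespace-separated key=value pairs and turn them into an `nmcli connection modify` argument list. The input format shown here is a simplification and an assumption (the real nics.nicextraparams syntax has per-NIC prefixes and separators that are not handled), and nmcli_modify_argv is a made-up helper name.

```python
import shlex


def nmcli_modify_argv(connection, extraparams):
    """Build an argv list for `nmcli connection modify` from key=value pairs.

    extraparams: e.g. "802-3-ethernet.mtu=9000 ipv6.method=disabled"
    (assumed simplified format; values with spaces may be shell-quoted).
    """
    argv = ["nmcli", "connection", "modify", connection]
    for token in shlex.split(extraparams):
        key, sep, value = token.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {token!r}")
        # Pass the pair through unchanged; nmcli validates the property name.
        argv += [key, value]
    return argv


if __name__ == "__main__":
    argv = nmcli_modify_argv("eth1", "802-3-ethernet.mtu=9000 ipv6.method=disabled")
    print(" ".join(shlex.quote(a) for a in argv))
    # To apply it on a managed node one could run: subprocess.run(argv, check=True)
```

Building a list rather than a shell string avoids quoting problems when the result is handed to subprocess.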

Obihoernchen avatar Apr 10 '24 23:04 Obihoernchen

If you want to render something as a file or amend a file you could consider using keyfile format output before issuing a nmcli con reload command to make it active on a managed host. The use of a keyfile will help with extraparams logic but it will, as with ifcfg, rely on sane user input.
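
As an illustration of rendering keyfile output, here is a small sketch that writes the INI-style text with Python's configparser. The helper name render_keyfile, the chosen sections, and the example addresses are assumptions for illustration; the extra argument stands in for user-supplied extraparams and is passed through without validation, so sane input is still required.

```python
import configparser
import io


def render_keyfile(con_id, ifname, address, gateway=None, extra=None):
    """Return NetworkManager keyfile text for a static IPv4 ethernet profile.

    address: CIDR string such as "192.0.2.10/24"
    extra:   optional {section: {key: value}} overrides (hypothetical hook
             for user-provided parameters)
    """
    cp = configparser.ConfigParser(interpolation=None)
    cp.optionxform = str  # keep key names exactly as given
    cp["connection"] = {"id": con_id, "type": "ethernet", "interface-name": ifname}
    addr = address if gateway is None else f"{address},{gateway}"
    cp["ipv4"] = {"method": "manual", "address1": addr}
    for section, values in (extra or {}).items():
        if not cp.has_section(section):
            cp.add_section(section)
        for key, value in values.items():
            cp.set(section, key, str(value))
    buf = io.StringIO()
    cp.write(buf, space_around_delimiters=False)
    return buf.getvalue()


if __name__ == "__main__":
    print(render_keyfile("eth1", "eth1", "192.0.2.10/24", "192.0.2.1",
                         extra={"ethernet": {"mtu": 9000}}))
```

On a managed host the text would typically be saved as a root-owned file with mode 0600 under /etc/NetworkManager/system-connections/ (for example eth1.nmconnection) before running nmcli con reload.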

NetworkManager has been recommended by Red Hat since the introduction of RHEL8 and should, in my opinion, be the go-to network management utility in EL8+, but I know some are familiar with ifcfg and reluctant to move away from it.

  1. https://www.redhat.com/en/blog/rhel-9-networking-say-goodbye-ifcfg-files-and-hello-keyfiles
  2. https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/configuring_and_managing_networking/assembly_networkmanager-connection-profiles-in-keyfile-format_configuring-and-managing-networking#proc_manually-creating-a-networkmanager-profile-in-keyfile-format_assembly_networkmanager-connection-profiles-in-keyfile-format

ocfmatt avatar Apr 11 '24 07:04 ocfmatt

Is installing RHEL 9 already possible using xcat? Or do these issues still prevent an automated installation? We are still on CentOS 7 and want to switch to RockyLinux 9, but are waiting for xcat support. Thanks a lot!

LukeLR avatar Jul 17 '24 11:07 LukeLR

Yes it's already possible with 2.16.5, but you might hit one of the confignetwork related issues mentioned above. But overall stateless/stateful installation is working just fine.

Obihoernchen avatar Jul 17 '24 12:07 Obihoernchen

We have our cluster of 400+ nodes booted off a rhel9 image from the xcat server. We replaced some of the postscripts to deal with NetworkManager, etc. There was a bit of a learning curve but nothing insurmountable.

  • setting static address on boot nic after getting dhcp
  • setting ipoib address based on boot nic address
  • adding a separate IB fabric key and requesting dhcp address on that subnet

I'm willing to share the scripts, let me know.

ddj-brown avatar Jul 17 '24 12:07 ddj-brown