Add support to GPU operator for RHEL 9

Open kimminw00 opened this issue 1 year ago • 3 comments

We are requesting GPU Operator support for Red Hat Enterprise Linux (RHEL) 9. Our organization is planning to upgrade to RHEL 9 (or Rocky 9), and we need GPU Operator to be compatible with this version. Currently, GPU Operator only supports up to RHEL 8. We believe that supporting RHEL 9 (or Rocky 9) will benefit not only our organization but also other users who are planning to upgrade to the latest version of RHEL.

kimminw00 avatar Oct 14 '24 05:10 kimminw00

@kimminw00, what makes you say that only RHEL 8 is supported? Are you running vanilla Kubernetes on RHEL 9? If yes, do you expect the operator to deploy the drivers or do you install them as RPMs on RHEL 9?

fabiendupont avatar Oct 17 '24 07:10 fabiendupont

We initially believed that RHEL 9 was not supported, based on the information provided on the NVIDIA website. However, we are currently planning to upgrade our Linux version and would like to confirm compatibility. Our expectation is that the operator will handle driver deployment.

kimminw00 avatar Oct 17 '24 09:10 kimminw00

We support upstream K8s with RHEL8 only for the moment (GPU operator will deploy the driver container as well as the rest of the operands).

We intend to support RHEL9 in CY 2025 but we have not set the exact release vehicle yet (we are still sorting out the priorities of other features we need to implement).

francisguillier avatar Oct 18 '24 22:10 francisguillier

> @kimminw00, what makes you say that only RHEL 8 is supported? Are you running vanilla Kubernetes on RHEL 9? If yes, do you expect the operator to deploy the drivers or do you install them as RPMs on RHEL 9?

Does this answer imply that the GPU Operator works on RHEL 9-based distros like Alma 9 and Rocky 9? Can you clarify this answer a bit, please?

ddesmond avatar Nov 01 '24 18:11 ddesmond

IBM Cloud needs RHEL 9 support as well. Our managed OpenShift offering supports both RHEL 8 and RHEL 9 worker nodes.

RHEL 8 support on GPU operator was in part driven by IBM Cloud requirements. References below:

  • https://github.com/NVIDIA/gpu-operator/issues/291#issuecomment-1153349213
  • https://github.com/NVIDIA/gpu-operator/issues/358

IBM Cloud will need currency support from RHEL 8 to RHEL 9.

hasueki avatar Mar 13 '25 18:03 hasueki

Is there any update on RHEL9/RKE2 support for GPU Operator? The latest documentation I see doesn't have it listed, although @francisguillier stated it is slated for CY2025 with no exact date yet.

Also, has anyone been able to get GPU Operator running on RHEL9 successfully? For clarification, does an OS being "not supported" mean that it is "not compatible" and cannot be configured at all? Or rather that it has not been tested and you will not receive support for configuring GPU Operator on that OS?

cody-waits-lmi avatar Apr 15 '25 15:04 cody-waits-lmi

One of our teams also has a need for the GPU Operator to be RHEL9 compatible. What is the current status?

Also, adding to @cody-waits-lmi's comment: has anyone gotten this to work on any RHEL9-compatible system?

dcontiveros-nf avatar Apr 17 '25 19:04 dcontiveros-nf

Why is the Operator supported on RHEL 9 on VMs with GPU passthrough but not yet with NVIDIA vGPU? Is there a timeline for when vGPU will be supported on RHEL 9?

mahmoud-mahdi avatar Oct 21 '25 07:10 mahmoud-mahdi

GPU Operator support for RHEL9 has been available since the v24.3.1 release (for all configurations: Bare Metal / VM with GPU passthrough / VM with vGPU).

francisguillier avatar Nov 14 '25 19:11 francisguillier

Hello @francisguillier, thank you for your response, but this is not stated in the documentation.

Image

Is it just a documentation problem?

mahmoud-mahdi avatar Nov 17 '25 06:11 mahmoud-mahdi

Yes, @mahmoud-mahdi, it is just a doc issue. We will fix it.

francisguillier avatar Nov 17 '25 19:11 francisguillier