anoopsinghnegi
We are also facing the same issue. With SELinux **Enforcing**, the gpu-operator driver-daemon pod fails: it cannot install the nvidia module and reports a "permission denied" error. With SELinux **disabled** or **permissive**, gpu-operator successfully...
@shivamerla - We tried the solution on our RHEL 8.8 setup, following your private test image. You added `chcon -t modules_object_t /usr/lib/modules/${KERNEL_VERSION}/kernel/drivers/video/` to change the file context, but...
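For anyone retracing this by hand, here is a minimal sketch of how the label could be inspected and applied. The `chcon` type and target path are taken from the comment above. The helper names, the `uname -r` default, and the dry-run printing are assumptions made for illustration and are not part of the operator image.

```shell
#!/bin/sh
# Hypothetical helper: build the module directory path quoted in the thread.
# Falls back to the running kernel when no version is passed.
module_dir() {
    kernel_version="${1:-$(uname -r)}"
    printf '/usr/lib/modules/%s/kernel/drivers/video/\n' "$kernel_version"
}

# Hypothetical helper: print (dry run) the commands to run as root.
# Nothing on the system is changed by this function.
print_relabel_steps() {
    dir="$(module_dir "$1")"
    # Show the current SELinux mode (Enforcing, Permissive, or Disabled).
    printf 'getenforce\n'
    # Show the current SELinux context of the directory itself.
    printf 'ls -Zd %s\n' "$dir"
    # Apply the type used in the test image.
    printf 'chcon -t modules_object_t %s\n' "$dir"
}
```

Calling `print_relabel_steps` with no argument targets the running kernel. Printing the commands instead of executing them keeps the sketch safe to run on a node where the current context should be reviewed first.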
@shivamerla, it's working: the driver loaded successfully with SELinux enforcing using image `quay.io/shivamerla/driver:535.104.05-rhel8.8`. Thanks for the fix.
@shivamerla - any update on this issue? Even the latest version of gpu-operator, v23.9.1, is failing with SELinux enforcing.
@kentrussell - thanks. The captured logs are below; dmesg shows many errors for amdgpu.
```
[root@amd-gpu]# dmesg | grep -i amdgpu
[ 602.598995] [drm] amdgpu kernel modesetting...
```
@kentrussell, this VM instance is a GPU-enabled virtual machine (NGads series) from Azure. OK, we will try reinstalling the package.
It didn't work; same result after reinstallation.
```
uninstall => amdgpu-uninstall
install => amdgpu-install --usecase=rocm
```
[root@amd-gpu ~]# rpm -qa|grep amdgpu-dkms
amdgpu-dkms-firmware-6.3.6.60002-1718217.el8.noarch
amdgpu-dkms-6.3.6.60002-1718217.el8.noarch
@nartmada - any update on this? We are blocked because of this issue.
@nartmada - we upgraded ROCm to 6.1.0, but **rocm-smi** is still returning "No AMD GPUs specified".
[root@amd-gpu-2 ~]# rpm -qa|grep amdgpu-dkms
amdgpu-dkms-6.7.0.60100-1756574.el8.noarch
amdgpu-dkms-firmware-6.7.0.60100-1756574.el8.noarch
[root@amd-gpu-2 ~]# rpm -qa | grep rocm...
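To compare the installed driver package before and after an upgrade like the one above, a small filter over the `rpm -qa` output may help. This is only a sketch: `dkms_version` is a hypothetical helper name, and it assumes package names follow the `amdgpu-dkms-<version>.<arch>` pattern seen in the listings in this thread.

```shell
#!/bin/sh
# Hypothetical helper: read an `rpm -qa`-style package list on stdin and
# print the version-release of the amdgpu-dkms package.
# The firmware package (amdgpu-dkms-firmware-...) is skipped because the
# pattern requires a digit right after "amdgpu-dkms-".
dkms_version() {
    # Capture everything after the package name, dropping the trailing
    # architecture suffix (for example ".noarch").
    sed -n 's/^amdgpu-dkms-\([0-9].*\)\.[^.]*$/\1/p'
}
```

A typical call would be `rpm -qa | dkms_version`. It only reports which package is installed; it does not show whether the kernel module was actually built and loaded.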