ROCK-Kernel-Driver
Auto generated dracut configuration will break when there are multiple kernel versions available
Hi,
I am using the rock-dkms package on Fedora 31 from the official repository ( http://repo.radeon.com/rocm/yum/rpm ).
The pre-build.sh installed by the latest rpm package looks like the one below. (It differs from the current version in this repo: I couldn't find the quoted lines here, but I don't know where else to file this bug report.)

In the above file, the last two lines add a dracut config file that modifies the firmware path during the initrd build.
This logic breaks if multiple kernel versions are installed on one system.
If there are multiple kernels, it creates one /etc/dracut.conf.d/amdgpu-<KERNEL_VERSION>.conf file per kernel, and dracut includes all of these files during the initrd build.
As a result, the firmware path becomes a non-existent path (because it is the concatenation of two firmware paths).
So, the solution is simple. We only need to create a single file,
/etc/dracut.conf.d/amdgpu.conf, for all the kernels. The content of the file should be as follows:
add_drivers+=" amdgpu"
fw_dir+="/lib/firmware/$kernel"
"$kernel" is a variable that dracut sets at run time to the version number of the kernel whose initrd is being built.
This way, rock-dkms will not break if multiple kernel versions exist on a single system.
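To illustrate the failure mode described above, here is a minimal shell sketch. It is not the actual pre-build.sh or dracut code: the kernel version strings, the append helper, and the broken/fixed variable names are made up for illustration, and the per-kernel path format is an assumption modelled on the fw_dir line above.

```shell
# Hypothetical sketch: mimic dracut reading every *.conf file in turn.
# append() has the same effect as the config-file syntax fw_dir+="..."
append() { fw_dir="${fw_dir}$1"; }

# Case 1: one conf file per installed kernel (the reported behaviour).
# Both files are read during a single initrd build, so both appends happen.
fw_dir=""
for ver in 5.5.8-200.fc31.x86_64 5.5.10-200.fc31.x86_64; do  # made-up versions
    append "/lib/firmware/$ver"
done
broken=$fw_dir

# Case 2: a single conf file that refers to $kernel, which is expanded
# only for the kernel whose initrd is being built.
kernel=5.5.10-200.fc31.x86_64
fw_dir=""
append "/lib/firmware/$kernel"
fixed=$fw_dir

echo "per-kernel files: $broken"
echo "single file:      $fixed"
```

In the first case the two paths are glued together into one string that names no real directory; in the second case only one path is ever appended.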
Hi Harish, thank you for reporting this problem and the suggested solution. The engineer who works on our packaging script agrees with your analysis and implemented the fix you suggested. It will probably not make it into the next release (ROCm 3.5), which is already too far along in the process. It should make it into ROCm 3.6.
Hi @fxkamd,
Thanks for the confirmation and update.
I respect the release cycle of this project, but I would like to kindly draw your attention to the end-user experience caused by this bug:
- Most Linux users do not delete old kernels when they update. Usually everyone keeps more than one kernel.
- When this bug happens, the system display does not work. It gets stuck at a blank screen, which is hard for normal users to debug.
(I was able to troubleshoot this issue because I had a spare PC and ssh'ed into the broken system from it.)
So, I suggest releasing the fix for this issue as early as possible.
Hi Harish,
The problem you reported has been in the driver for a long time and has never come to our attention before. It affects not only the ROCm driver but also the Linux Pro graphics driver. If the problem were common, we would have expected to see bug reports about it before. We also found that your problem was not straightforward to reproduce. You didn't provide exact steps for reproducing it, but we suspect you used the DKMS command line in ways that are not typical for end users who just install our package using yum.
In our opinion the risk of a late fix outweighs the benefit in this case. If you provide steps to reproduce that are less obscure than what we came up with, maybe you'll change our mind.
Best regards, Felix
Closing off as resolved, since it's been addressed in the packaging scripts