fix(nvmf): move /etc/nvme/host{nqn,id} requirement to hostonly
This pull request changes the 95nvmf module /etc/nvme/host{nqn,id} files inclusion.
Changes
When creating initramfs for universal boot image such as an installer, we can't include any machine-specific IDs. Let's move the check for /etc/nvme/hostnqn and /etc/nvme/hostid files presence to the hostonly section to avoid unsatisfied requirements.
Checklist
- [x] I have tested it locally
- [x] I have reviewed and updated any documentation if relevant
- [ ] ~~I am providing new code and test(s) for it~~
Cc: @mwilck @johnmeneghini @lnykryn
This is my attempt to fix the universal-image scenario. We're facing this issue specifically in the installer, may apply to any kind of pre-build images (e.g. cloud) or layered images with immutable base.
For obvious reasons we can't include any machine-specific IDs and that's where we're facing unsatisfied requirements, leading to the nvmf module not being included.
For the NBFT case where HostNQN and HostID is configured, no files are needed. For the rest of the cases we'll need to rely on nvme-cli/libnvme auto-generated IDs from e.g. DMI records. An alternative would be to generate new IDs before making first connection attempt, subject to discussion.
LGTM.
Then both files should be marked as hostonly, i.e.,
inst_simple -H <file>on the following lines:https://github.com/dracutdevs/dracut/blob/6acfecae572fb457115b276b5b64d9424ad5187b/modules.d/95nvmf/module-setup.sh#L133-L134
Done.
I haven't tested this change yet, we'll be doing extensive testing in coming weeks with different combinations of Fabrics tech and HostNQN/HostID (non)presence. For now I just wanted to have more eyes to evaluate the changes.
This causes both files to be written to the internal
/usr/lib/dracut/hostonly-fileswhen the initrd is built in hostonly mode, so these files (and all files that are only needed in a hostonly initrd) can be removed at boot by passing therd.hostonly=0option in the kernel command line (seeman dracut.cmdline.7).
That doesn't make much sense for NVMe hostnqn and hostid. Without a hostnqn/hostid, no NVMe-oF connections are possible. If the user wants to set different values, she can do so today by using rd.nvmf.hostnqn= and rd.nvmf.hostid= parameters.
On 9/22/23 09:33, Martin Wilck wrote:
That doesn't make much sense for NVMe hostnqn and hostid. Without a hostnqn/hostid, no NVMe connections are possible. If the user wants to set different values, she can do so today by using |rd.nvmf.hostnqn=| and |rd.nvmf.hostid=| parameters.
The upstream nvme-cli code today will create the hostnqn/hostid if none is provided. I think this is ideally what we want.
https://github.com/linux-nvme/nvme-cli/blob/a2587cbd2eea1d0506d052356abe94cb27cb9884/fabrics.c#L733
So as long as the runtime environment provides a dmi/product_uuid, you don't need a /etc/nvme/hostnqn\id file.
sudo cat /sys/class/dmi/id/product_uuid 66b848cc-25ac-11b2-a85c-ff377e9ee3de
Ideally we want all supported systems to be complaint with TP-4126.
From TP-4126 NVMe-oF Boot HostNQN and HostID: this is requirement for the host.
The pre-OS boot environment and OS environment should use a fixed platform UUID to create a HostNQN and HostID. The implementation should use the System UUID found in the SMBIOS table. The System Management BIOS (SMBIOS) Reference Specification is described in DSP0134. The SMBIOS table is typically available to pre-OS firmware and Expansion ROM Firmware in the pre-OS boot environment as well as to the OS environment.
I think rd.nvmf.hostnqn/hostid should be a manual override.
Right. I didn't think about the product UUID automatism. With that, it would actually be possible to clone a system and use rd.hostonly=0 for booting (assuming that the target subsystem is configured to serve namespaces to the new host's NQN, of course).
Great, I was hoping to have this kind of discussion!
More things to consider:
- not all distributions are making hostonly initramfs during installation
- we might still want to include /etc/nvme/host* files in the initramfs when available but at the same time play nice when they're missing (the universal image scenario)
- kernel 6.5 and related nvme-cli TP-4126 fixes changed how things work (considering corner cases like no DMI UUID records)
- I still want to have these three scenarios tested: NVMe/FC, NBFT with HostNQN set in the boot attempt and NBFT without HostNQN set. This will take some time.
NBFT without HostNQN set
I'm pretty certain that won't work, at least not with the current firmware.
On 9/22/23 11:06, Tomáš Bžatek wrote:
I still want to have this tested three scenarios: NVMe/FC, NBFT with HostNQN set in the boot attempt and NBFT without HostNQN set. This will take some time.
I agree that we want to test at least these three scenarios. Not sure if NBFT boot w/out HostNQN is will work with the existing code... but in theory it should be possible with v6.5...
Let's make sure we don't merge this until all of this has been tested.
This has been thoroughly tested on various system deployments for the past two months. Ready for review.
LGTM.
Rebased. Anything else missing for review?
This issue is being marked as stale because it has not had any recent activity. It will be closed if no further activity occurs. If this is still an issue in the latest release of Dracut and you would like to keep it open please comment on this issue within the next 7 days. Thank you for your contributions.