longhorn-manager
fix: allow setting the nvme disk driver manually when auto-detection fails to detect it
Which issue(s) this PR fixes:
Issue longhorn/longhorn#11127
What this PR does / why we need it:
I had issues using NVMe drives on a Talos cluster with a recent kernel. After some digging, I found that the driver name reported on this kernel does not match the expected "vfio_pci". Running the "auto-detection" driver commands manually:
nsenter --mount=/proc/1/ns/mnt --ipc=/proc/1/ns/ipc --net=/proc/1/ns/net env PCI_ALLOWED=0000:02:00.0 bash /usr/src/spdk/scripts/setup.sh
0000:01:00.0 (144d a80a): Skipping denied controller at 0000:01:00.0
0000:01:00.0 (144d a80a): Active devices: mount@nvme0n1:nvme0n1p5, so not binding PCI dev
0000:02:00.0 (144d a80a): nvme -> vfio-pci
INFO: Requested 1024 hugepages but 1024 already allocated
nsenter --mount=/proc/1/ns/mnt --ipc=/proc/1/ns/ipc --net=/proc/1/ns/net env PCI_ALLOWED=0000:02:00.0 bash /usr/src/spdk/scripts/setup.sh disk-status 0000:02:00.0
0000:01:00.0 (144d a80a): Skipping denied controller at 0000:01:00.0
0000:01:00.0 (144d a80a): Active devices: mount@nvme0n1:nvme0n1p5, so not binding PCI dev
{"bdf":"0000:02:00.0","type":"NVMe","driver":"vfio-pci","vendor":"144d","numa":"unknown","device":"-","block_devices":"-"}
uname -a
Linux instance-manager-b2105364cf7e819d6e95b009ebbcd205 6.12.31-talos #1 SMP Tue Jun 3 10:47:32 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
instance-manager-b2105364cf7e819d6e95b009ebbcd205:/ #
The issue comes from the module name reported by SPDK on the Talos kernel: it is vfio-pci, not the expected vfio_pci defined here:
https://github.com/longhorn/longhorn-spdk-engine/blob/main/pkg/spdk/disk/types.go#L56
https://github.com/longhorn/go-common-libs/blob/main/types/file.go#L20
I solved this by editing the CRD manually and allowing the "nvme" disk driver to be set manually.
Special notes for your reviewer:
This may not be the best solution, and it doesn't fix the auto-detection code (maybe check both the - and _ variants in the if?), but it offers the ability to control the driver manually.
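For reference, the idea of accepting both the - and _ spellings could be sketched roughly as below. This is only an illustration, not the code from longhorn-spdk-engine or go-common-libs: the names expectedVFIODriver, normalizeDriverName and isVFIODriver are hypothetical, and the sketch assumes the detected driver arrives as a plain string such as the "driver" field in the disk-status output above.

```go
package main

import (
	"fmt"
	"strings"
)

// expectedVFIODriver is a hypothetical stand-in for the driver name
// that the auto-detection code compares against.
const expectedVFIODriver = "vfio_pci"

// normalizeDriverName trims surrounding whitespace and maps hyphens to
// underscores, so "vfio-pci" and "vfio_pci" compare as equal.
func normalizeDriverName(name string) string {
	return strings.ReplaceAll(strings.TrimSpace(name), "-", "_")
}

// isVFIODriver reports whether the detected driver name refers to the
// vfio PCI driver, regardless of which separator the kernel reports.
func isVFIODriver(name string) bool {
	return normalizeDriverName(name) == expectedVFIODriver
}

func main() {
	for _, name := range []string{"vfio-pci", "vfio_pci", "nvme"} {
		fmt.Printf("%s -> %t\n", name, isVFIODriver(name))
	}
}
```

Normalizing before comparing keeps a single expected constant, instead of listing every spelling in the if condition.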
Additional documentation or context
Thanks @Hugome for the contribution. Could you create a BUG ticket in https://github.com/longhorn/longhorn/issues and add the ticket number to https://github.com/longhorn/longhorn-manager/pull/3848#issue-3146102233? Thank you.
cc @c3y1huang @shuo-wu
Yes no problem, it is done :+1: Thanks
Hello @Hugome
Can you execute bash k8s/generate_code.sh and add the changed files to the PR? Thanks.
Hello @derekbit
Done and it helped fix a typo issue :+1:
Thanks
Thanks @Hugome!