longhorn-manager
longhorn-manager copied to clipboard
fix: check if environment settings are achieved
Which issue(s) this PR fixes:
Issue # longhorn/longhorn#7571
What this PR does / why we need it:
Check if the environment settings are achieved, ref: environment_check.sh
Special notes for your reviewer:
Additional documentation or context
The check is enforced when starting the Longhorn Manager daemon. If any condition fails, the Longhorn Manager will enter a loop of Starting and CrashLoopBackOff.
For users who only require RWO volumes, the NFS client is unnecessary. Then, the failure doesn't make sense.
Instead of performing checks only during startup, we can introduce conditions in nodes.longhorn.io to record the status of necessary packages, utilities, and demons. These conditions can be periodically checked by the node controller. This approach will help users stay informed and diagnose the Longhorn system effectively.
WDYT @c3y1huang @shuo-wu @PhanLe1010 @innobead
Instead of performing checks only during startup, we can introduce conditions in nodes.longhorn.io to record the status of necessary packages, utilities, and demons. These conditions can be periodically checked by the node controller. This approach will help users stay informed and diagnose the Longhorn system effectively.
I agree with this design! We already check the mountProagation and record it inside the node condition:
status:
autoEvicting: false
conditions:
- lastProbeTime: ""
lastTransitionTime: "2024-06-13T02:22:19Z"
message: ""
reason: ""
status: "True"
type: MountPropagation
We can do something similar for this new EnviromentCheck condition
Hi @derekbit, @PhanLe1010,
I have updated the PR to use the node conditions to record the environment setting. Could you review the PR when you are available?
cc @ChanYiLin for helping review it if having time. Thank you.
There is an exsting environment check for open-iscsi in the app/daemon.go. We already have a condition in node.longhorn.io for it. Do we still need to error out if open-iscsi is missing while staring up the daemon?
UPDATE: after discussing with @mantissahz , the existing environmentCheck function will be retained and called in longhorn-manager's subcommand daemon and pre-upgrade, and the function only checks the critical component open-iscsi. For other packages, utilties, and multipathd, they will be periodcally checked in the node controller.
cc @PhanLe1010 @shuo-wu @innobead WDYT?
Ther kernel check function is not easy to read and maintain. What we need to check is. the four kernels
| Distro | Broken Version |
|---|---|
| Vanilla kernel | 6.5.6 |
| Ubuntu | 5.15.0-94 |
| Ubuntu | 6.5.0-21 |
| Ubuntu | 6.5.0-1014-aws |
Can we check it use string compare?
As discussed with @mantissahz, we decide to remove kernel version check because
- The current implementation is not easy to maintain. We need to figure out a better way to do it.
- We are unable to include all problematic kernel versions in the check.
In general, LGTM.