longhorn-manager
fix(replica-scheduling): power off replica node should not rebuild new replica on same node (backport #2918)
Which issue(s) this PR fixes:
Issue longhorn/longhorn#1992
What this PR does / why we need it:
- Before creating a new Replica CR, verify that at least one suitable node has a disk with enough space to accommodate the replica. This prevents creating Replica CRs that cannot be scheduled.
- During node disk candidate selection, exclude nodes whose InstanceManager is not in a running state. This prevents creating redundant Replica CRs on nodes that have not fully recovered after a power-off.
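
The two checks above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual longhorn-manager code: the `Node`, `Disk`, and `InstanceManagerState` types and the `candidateNodes` and `shouldCreateReplica` helpers are hypothetical names invented for this example, and the real scheduler applies additional criteria that are omitted here.

```go
package main

import "fmt"

// InstanceManagerState is a hypothetical, simplified stand-in for the
// instance manager state tracked per node.
type InstanceManagerState string

const (
	StateRunning InstanceManagerState = "running"
	StateStopped InstanceManagerState = "stopped"
)

// Disk and Node are illustrative types, not the real Longhorn CRs.
type Disk struct {
	Path           string
	AvailableBytes int64
}

type Node struct {
	Name                 string
	InstanceManagerState InstanceManagerState
	Disks                []Disk
}

// candidateNodes returns the names of nodes whose instance manager is
// running and that have at least one disk with room for the replica.
func candidateNodes(nodes []Node, replicaSize int64) []string {
	var names []string
	for _, n := range nodes {
		// Skip nodes that have not fully recovered (e.g. after a power-off).
		if n.InstanceManagerState != StateRunning {
			continue
		}
		for _, d := range n.Disks {
			if d.AvailableBytes >= replicaSize {
				names = append(names, n.Name)
				break
			}
		}
	}
	return names
}

// shouldCreateReplica reports whether a new replica object is worth
// creating, i.e. whether at least one candidate node exists.
func shouldCreateReplica(nodes []Node, replicaSize int64) bool {
	return len(candidateNodes(nodes, replicaSize)) > 0
}

func main() {
	const gib = int64(1) << 30
	nodes := []Node{
		{Name: "node-1", InstanceManagerState: StateRunning, Disks: []Disk{{Path: "/data", AvailableBytes: 50 * gib}}},
		{Name: "node-2", InstanceManagerState: StateStopped, Disks: []Disk{{Path: "/data", AvailableBytes: 100 * gib}}},
		{Name: "node-3", InstanceManagerState: StateRunning, Disks: []Disk{{Path: "/data", AvailableBytes: 5 * gib}}},
	}
	fmt.Println(candidateNodes(nodes, 10*gib))      // prints "[node-1]"
	fmt.Println(shouldCreateReplica(nodes, 10*gib)) // prints "true"
}
```

In this sketch, `node-2` is skipped even though it has the most free space, because its instance manager is not running; that mirrors the power-off case this PR addresses.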
Special notes for your reviewer:
None
Additional documentation or context:
None
This is an automatic backport of pull request #2918 done by [Mergify](https://mergify.com).
Ref https://github.com/longhorn/longhorn/issues/8859
TODO:
- [ ] Backport https://github.com/longhorn/longhorn-manager/pull/2939
- [ ] Backport https://github.com/longhorn/longhorn-manager/pull/2941
- [ ] Backport https://github.com/longhorn/longhorn-manager/pull/2942
- [ ] Backport https://github.com/longhorn/longhorn-manager/pull/2950