
Install Descheduler, fix startup readywait

Open andrewd-zededa opened this issue 1 year ago • 2 comments

This PR makes a few changes to the cluster-init.sh install/boot path of HV=kubevirt eve, as a base for upcoming cluster work.

Descheduler will be used for eve-app rebalancing during cluster node reboots/upgrades in an upcoming PR. After a node has recovered from an outage, the descheduler evicts pods whose current node does not match their preferred-affinity node. The native Kubernetes scheduler then runs again and places each evicted pod back on the node where it requested placement.
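For readers unfamiliar with the descheduler, the eviction rule described above could be expressed roughly as follows. This is only a sketch, not the policy shipped in this PR: it assumes a descheduler release that uses the `descheduler/v1alpha2` policy format and whose `RemovePodsViolatingNodeAffinity` plugin accepts the preferred affinity type. The helper name `write_descheduler_policy` and the profile name are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: write a minimal descheduler policy to the given path.
# The policy enables only the node-affinity plugin, so pods running on a node
# that does not match their *preferred* node affinity become eviction candidates.
write_descheduler_policy() {
    policy_file="$1"
    cat > "$policy_file" <<'EOF'
apiVersion: "descheduler/v1alpha2"
kind: "DeschedulerPolicy"
profiles:
  - name: eve-app-rebalance   # hypothetical profile name
    pluginConfig:
      - name: "RemovePodsViolatingNodeAffinity"
        args:
          nodeAffinityType:
            - "preferredDuringSchedulingIgnoredDuringExecution"
    plugins:
      deschedule:
        enabled:
          - "RemovePodsViolatingNodeAffinity"
EOF
}
```

After eviction the default scheduler re-places the pod, which is why a pod with a preferred affinity ends up back on its requested node once that node is healthy.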

Longhorn daemonsets take some time to become ready (~5-10 minutes on some systems) after the initial install request with 'kubectl apply'. It is important to wait at install time and block all_components_initialized until all Longhorn daemonsets are ready. This is a foundation for an upcoming PR that snapshots the single-node k3s sqlite db under /var/lib. That db snapshot is used to facilitate converting a cluster node back to a single-node system.
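A readiness wait of this kind might look like the sketch below. It is illustrative rather than the code in cluster-init.sh: the function names, the timeout and the poll interval are hypothetical, and `longhorn-system` is assumed to be the namespace Longhorn was installed into. A daemonset counts as ready when its `numberReady` equals its `desiredNumberScheduled`.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the namespace has at least one DaemonSet
# and every DaemonSet reports numberReady == desiredNumberScheduled.
daemonsets_ready() {
    ns="$1"
    # Emit one line per DaemonSet: "<desired> <ready>"
    kubectl -n "$ns" get daemonsets \
        -o jsonpath='{range .items[*]}{.status.desiredNumberScheduled} {.status.numberReady}{"\n"}{end}' |
    awk 'NF < 2 || $1 != $2 { bad = 1 } END { exit (NR == 0 || bad) ? 1 : 0 }'
}

# Hypothetical helper: poll until all DaemonSets are ready or the timeout expires.
# Usage: wait_for_daemonsets <namespace> [timeout_seconds] [interval_seconds]
wait_for_daemonsets() {
    ns="$1"
    timeout="${2:-900}"
    interval="${3:-10}"
    waited=0
    until daemonsets_ready "$ns"; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
    return 0
}
```

A caller could then gate the initialized marker on the result, for example `wait_for_daemonsets longhorn-system 900 10 || exit 1` before all_components_initialized is set.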

Fix: Close a small timing window that could cause the import of external-boot-image to fail:

  • Wait for containerd before importing.
  • Tighter error checking on import.
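The two points above can be sketched as follows. This is a minimal illustration, not the code from this PR: the function names, the socket path argument, the retry count and the `k8s.io` namespace are assumptions. It relies only on `ctr version` (which fails until the daemon answers) and `ctr images import`.

```shell
#!/bin/sh
# Hypothetical helper: poll containerd until it answers, up to <tries> attempts.
wait_for_containerd() {
    sock="$1"
    tries="${2:-60}"
    while [ "$tries" -gt 0 ]; do
        # 'ctr version' queries the server, so it fails while containerd is down
        if ctr --address "$sock" version >/dev/null 2>&1; then
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}

# Hypothetical helper: import an image archive, failing loudly at each step.
# Usage: import_image <containerd_socket> <archive> [tries]
import_image() {
    sock="$1"
    tarball="$2"
    max_tries="${3:-60}"
    if [ ! -f "$tarball" ]; then
        echo "image archive not found: $tarball" >&2
        return 1
    fi
    if ! wait_for_containerd "$sock" "$max_tries"; then
        echo "containerd did not become ready on $sock" >&2
        return 1
    fi
    if ! ctr --address "$sock" -n k8s.io images import "$tarball"; then
        echo "failed to import $tarball" >&2
        return 1
    fi
    return 0
}
```

Checking the return code of the import itself, rather than assuming success, is what keeps a failed import from being silently skipped so the caller can retry or abort.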

andrewd-zededa avatar Oct 16 '24 17:10 andrewd-zededa

Rebased on master, addressed all review comments.

andrewd-zededa avatar Oct 18 '24 21:10 andrewd-zededa

@deitch updated PR description to add context.

andrewd-zededa avatar Oct 21 '24 14:10 andrewd-zededa

@OhmSpectator I've tried to address all your review requests, thank you for reviewing.

andrewd-zededa avatar Nov 01 '24 15:11 andrewd-zededa