OpenShift icon indicating copy to clipboard operation
OpenShift copied to clipboard

Consider including MachineHealthchecks as part of the cluster provisioning.

Open erzhan46 opened this issue 3 years ago • 2 comments

Consider including MachineHealthcheck as part of the cluster provisioning. Machine healthcheck API can help automatically remediate some issues with machines. See the following for more information: https://docs.openshift.com/container-platform/4.6/machine_management/deploying-machine-health-checks.html

erzhan46 avatar May 17 '21 14:05 erzhan46

Hi, can you provide some additional details around what action/operations you would like MachineHealthcheck to help with?

jboutaud avatar May 18 '21 14:05 jboutaud

In general - this capability will contribute to the 'Managed' nature of the ARO by introducing the 'self-healing' at Machine API level. Problem detection utilizes kubernetes node-problem-detector. Default remediation - machine (VM) deletion letting Machine API to create a new Machine. Examples: Machine is in phase 'Failed', Machine has no corresponding Node. See the following for more details: https://github.com/wking/openshift-enhancements/blob/master/enhancements/machine-api/machine-health-checking.md

erzhan46 avatar May 18 '21 14:05 erzhan46