machine-controller-manager
Check instance reachability status in machine-controller-manager when checking whether a new machine has joined a machine deployment
How to categorize this issue? /area control-plane /kind enhancement /priority 3
What would you like to be added:
In AWS, an instance is sometimes running but not reachable. The AWS CLI has a command to check this reachability status:
aws ec2 describe-instance-status --instance-ids i-01e71990bfe658adc
{
    "InstanceStatuses": [
        {
            "AvailabilityZone": "eu-central-1a",
            "InstanceId": "i-01e71990bfe658adc",
            "InstanceState": {
                "Code": 16,
                "Name": "running"
            },
            "InstanceStatus": {
                "Details": [
                    {
                        "ImpairedSince": "2022-06-21T06:28:00+00:00",
                        "Name": "reachability",
                        "Status": "failed"
                    }
                ],
                "Status": "impaired"
            },
            "SystemStatus": {
                "Details": [
                    {
                        "Name": "reachability",
                        "Status": "passed"
                    }
                ],
                "Status": "ok"
            }
        }
    ]
}
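This instance is running but not reachable: the instance status check reports "impaired" because its reachability check failed, even though the system status check passed.
Is it possible to add a check in MCM for whether the instance is reachable?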
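To illustrate what such a check would look at, here is a minimal Go sketch that decodes the JSON printed by the command above and lists instances that are running but fail a reachability check. It is not MCM code. The type names and the function UnreachableRunningInstances are made up for this example, and treating every status other than "passed" as unreachable is an assumption that a real check would probably need to refine (for example, for instances that are still initializing).

```go
package main

import (
	"encoding/json"
	"fmt"
)

// statusDetail mirrors one entry of a "Details" array in the CLI output.
type statusDetail struct {
	Name   string `json:"Name"`
	Status string `json:"Status"`
}

// statusSummary mirrors the "InstanceStatus" and "SystemStatus" objects.
type statusSummary struct {
	Details []statusDetail `json:"Details"`
	Status  string         `json:"Status"`
}

// instanceStatus keeps only the fields this sketch needs.
type instanceStatus struct {
	InstanceID    string `json:"InstanceId"`
	InstanceState struct {
		Name string `json:"Name"`
	} `json:"InstanceState"`
	InstanceStatus statusSummary `json:"InstanceStatus"`
	SystemStatus   statusSummary `json:"SystemStatus"`
}

type describeOutput struct {
	InstanceStatuses []instanceStatus `json:"InstanceStatuses"`
}

// reachable reports whether the summary contains a "reachability" detail
// whose status is "passed". Assumption: any other value, or a missing
// detail, counts as not reachable.
func (s statusSummary) reachable() bool {
	for _, d := range s.Details {
		if d.Name == "reachability" {
			return d.Status == "passed"
		}
	}
	return false
}

// UnreachableRunningInstances (hypothetical helper) returns the IDs of
// instances that are in state "running" but fail the instance-level or
// system-level reachability check.
func UnreachableRunningInstances(raw []byte) ([]string, error) {
	var out describeOutput
	if err := json.Unmarshal(raw, &out); err != nil {
		return nil, fmt.Errorf("decoding describe-instance-status output: %w", err)
	}
	var ids []string
	for _, st := range out.InstanceStatuses {
		if st.InstanceState.Name != "running" {
			continue
		}
		if !st.InstanceStatus.reachable() || !st.SystemStatus.reachable() {
			ids = append(ids, st.InstanceID)
		}
	}
	return ids, nil
}
```

Given the output shown above, the helper would report i-01e71990bfe658adc, because the instance-level reachability detail is "failed" while the instance state is "running".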
Why is this needed:
To better understand the process of a machine joining the cluster. For example, sometimes a machine is created and then, after 20 minutes, MCM deletes it and creates another one.
CC @dguendisch
@neo-liang-sap Label area/todo does not exist.
Yes, we will work on adding such a feature. Some research is required first to see whether other providers also expose this kind of networking information for an instance directly.
Post Grooming discussion
We need to enhance the driver method GetMachineStatus to also perform checks such as the reachability check mentioned above, and enhance GetMachineStatusResponse to contain the result of the check.
Then we should update the error in the machine status to reflect that result, so that it propagates up to the status of the higher-level controllers and is shown in the dashboard for the user to see.
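One possible shape for this, as a rough Go sketch only: the struct below is a trimmed-down stand-in for the driver's response type, not the real definition. The fields Reachable and ReachabilityDetail and the helper ReachabilityMessage are hypothetical names introduced here for discussion, and nothing in this thread has settled on them.

```go
package main

import "fmt"

// GetMachineStatusResponse is a simplified stand-in for the driver
// response type. ProviderID and NodeName are included for context;
// the last two fields are the hypothetical additions.
type GetMachineStatusResponse struct {
	ProviderID string
	NodeName   string

	// Reachable is nil when the provider cannot report reachability,
	// so that providers without such a check keep today's behaviour.
	Reachable *bool
	// ReachabilityDetail carries a provider-specific explanation.
	ReachabilityDetail string
}

// ReachabilityMessage (hypothetical helper) returns the text that could
// be recorded in the machine status, or "" when there is nothing to report.
func ReachabilityMessage(resp GetMachineStatusResponse) string {
	if resp.Reachable == nil || *resp.Reachable {
		return ""
	}
	return fmt.Sprintf("instance %q is running but not reachable: %s",
		resp.ProviderID, resp.ReachabilityDetail)
}
```

Using a pointer for Reachable keeps "unknown" distinct from "not reachable", which matters because it is still open whether other providers expose this information at all.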