VMSS in Failed status - Referencing non existing LB. Kubernetes 1.10.4

Open theogq opened this issue 6 years ago • 8 comments

I have created a k8s cluster in Azure portal using Kubicorn. We used one VMSS for the master and a second VMSS for the nodes.

The VMSS for the nodes shows a Failed status in the Azure portal, with the following error message:

Resource /subscriptions/***/resourceGroups/***/providers/Microsoft.Network/loadBalancers/K8S-LOADBALANCER referenced by resource /subscriptions/***/resourceGroups/***/providers/Microsoft.Compute/virtualMachineScaleSets/k8sclustereastusdctest-node was not found. Please make sure that the referenced resource exists, and that both resources are in the same region.

There is no load balancer named K8S-LOADBALANCER in the RG (or in any RG that I use).

The instances are in status Failed (Running), but I cannot stop or upgrade an instance because of the error message above.

I never created an LB with that name, and I cannot find a way to remove the reference from the VMSS.
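To see which load balancer resources the scale set actually references, one option is to query its network profile with the Azure CLI. This is only a minimal sketch: the function name `list_vmss_lb_refs` is hypothetical, and it assumes the references live under the VMSS network profile's `loadBalancerBackendAddressPools`, which is where the later error message in this thread points.

```shell
#!/bin/sh
# Hypothetical helper (not part of the Azure CLI): print the ID of every
# load balancer backend pool referenced by a scale set's network profile.
# Arguments: $1 = resource group name, $2 = VMSS name.
list_vmss_lb_refs() {
  az vmss show \
    --resource-group "$1" \
    --name "$2" \
    --query "virtualMachineProfile.networkProfile.networkInterfaceConfigurations[].ipConfigurations[].loadBalancerBackendAddressPools[].id" \
    --output tsv
}
```

For example, `list_vmss_lb_refs <RESOURCE_GROUP_NAME> k8sclustereastusdctest-node` would list one resource ID per line. Any ID whose load balancer or backend pool no longer exists is a candidate for the dangling reference.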

In kubectl there are two LBs that were created with Helm charts:

`kubectl get service`

`NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE`

`lb-****-logging LoadBalancer 10.99.250.72 7171:30353/TCP,7272:32666/TCP,5140:31988/TCP 9d`

`****-loadbalancer LoadBalancer 10.104.217.232 80:31317/TCP 6d`

These two services show up under an LB with the name kubernetes.

Is there a way for me to remove the reference from the VMSS so that it leaves the failed status?

theogq avatar Nov 21 '18 17:11 theogq

Could you try this CLI command:

az vmss update-instances -g <RESOURCE_GROUP_NAME> --name k8sclustereastusdctest-node

andyzhangx avatar Nov 22 '18 00:11 andyzhangx

If that does not work, I think you need to file a ticket with the Azure VMSS team.

andyzhangx avatar Nov 22 '18 03:11 andyzhangx

The command fails with the same error message:

az vmss update-instances -g <RESOURCE_GROUP_NAME> --name k8sclustereastusdctest-node --instance-ids 1
Deployment failed. Correlation ID: db77ea36-ceaf-4cc0-a59d-47902514d35b. Resource /subscriptions/****/resourceGroups/<RESOURCE_GROUP_NAME>/providers/Microsoft.Network/loadBalancers/K8S-LOADBALANCER referenced by resource /subscriptions/****/resourceGroups/<RESOURCE_GROUP_NAME>/providers/Microsoft.Compute/virtualMachineScaleSets/k8sclustereastusdctest-node was not found. Please make sure that the referenced resource exists, and that both resources are in the same region.

I created a ticket for the VMSS. They forwarded me to the k8s team, who asked me to create an issue here.

theogq avatar Nov 22 '18 11:11 theogq

What about creating the K8S-LOADBALANCER LB manually and then checking again?

andyzhangx avatar Nov 22 '18 13:11 andyzhangx

I created a new load balancer in the Azure UI with the name K8S-LOADBALANCER, but I cannot reference the VMSS as a backend pool. I get the following message: One basic SKU load balancer can only be associated with one virtual machine scale set at any point of time

theogq avatar Nov 22 '18 14:11 theogq

Even when I ignore this message and I select the VMSS I get the following error {"code":"DeploymentFailed","message":"At least one resource deployment operation failed. Please list deployment operations for details. Please see https://aka.ms/arm-debug for usage details.","details":[{"code":"Conflict","message":"{\r\n \"status\": \"Failed\",\r\n \"error\": {\r\n \"code\": \"ResourceDeploymentFailure\",\r\n \"message\": \"The resource operation completed with terminal provisioning state 'Failed'.\",\r\n \"details\": [\r\n {\r\n \"code\": \"DeploymentFailed\",\r\n \"message\": \"At least one resource deployment operation failed. Please list deployment operations for details. Please see https://aka.ms/arm-debug for usage details.\",\r\n \"details\": [\r\n {\r\n \"code\": \"BadRequest\",\r\n \"message\": \"{\\r\\n \\\"error\\\": {\\r\\n \\\"details\\\": [],\\r\\n \\\"code\\\": \\\"InvalidResourceReference\\\",\\r\\n \\\"message\\\": \\\"Resource /subscriptions/SUBSCRIPTION/resourceGroups/RG/providers/Microsoft.Network/loadBalancers/k8s-loadbalancer/backendAddressPools/nfsaas-ap1 referenced by resource /subscriptions/SUBSCRIPTION/resourceGroups/RG/providers/Microsoft.Compute/virtualMachineScaleSets/k8sclustereastusdctest-node was not found. Please make sure that the referenced resource exists, and that both resources are in the same region.\\\"\\r\\n }\\r\\n}\"\r\n }\r\n ]\r\n }\r\n ]\r\n }\r\n}"}]} (Code:BadRequest)

theogq avatar Nov 22 '18 14:11 theogq

It looks like I managed to solve the issue where the VMSS was in failed status.

I manually created a backendAddressPools entry named nfsaas-ap1 pointing to the VMSS, and that appears to have solved my issue.
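The thread does not say exactly how the pool was created (the earlier steps used the Azure UI), so the following is only a sketch of what an Azure CLI equivalent might look like. The function name `recreate_backend_pool` is hypothetical, and it assumes a load balancer with the referenced name already exists in the same resource group and region as the VMSS.

```shell
#!/bin/sh
# Hypothetical helper: recreate the backend pool that the VMSS still
# references, then print the scale set's current provisioning state.
# Arguments: $1 = resource group, $2 = load balancer name,
#            $3 = backend pool name, $4 = VMSS name.
recreate_backend_pool() {
  rg=$1 lb=$2 pool=$3 vmss=$4

  # Create an empty backend address pool with the name the VMSS expects.
  az network lb address-pool create \
    --resource-group "$rg" \
    --lb-name "$lb" \
    --name "$pool" || return 1

  # Show the provisioning state. In this thread it changed to Succeeded
  # only after an operation on the scale set (a VM restart) went through.
  az vmss show \
    --resource-group "$rg" \
    --name "$vmss" \
    --query provisioningState \
    --output tsv
}
```

With the names from this thread, the call would be `recreate_backend_pool <RESOURCE_GROUP_NAME> K8S-LOADBALANCER nfsaas-ap1 k8sclustereastusdctest-node`.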

I was able to restart a VM from the VMSS and now the status for the VMSS changed to Succeeded

theogq avatar Nov 22 '18 15:11 theogq

cheers! This info is valuable.

andyzhangx avatar Nov 23 '18 05:11 andyzhangx