Fail to attach volume

Open jbucar opened this issue 7 years ago • 3 comments

What keywords did you search in tectonic-installer issues before filing this one?

pvc azure attach provisioning

Is this a BUG REPORT or FEATURE REQUEST?

Choose one: BUG REPORT or FEATURE REQUEST

BUG REPORT

Versions

  • Tectonic version (release or commit hash): 1.7.5
  • Terraform version (terraform version): 0.10
  • Platform (aws|azure|openstack|metal): azure

What happened?

Today, all our resources that have a dynamic volume (azure-disk) are in CrashLoopBackOff status. The events show:

AttachVolume.Attach failed for volume "XXXXX" : failed to get azure instance id for node "YYYY-worker-N"

Anything else we need to know?

The terraform.tfvars settings are:

tectonic_azure_etcd_vm_size = "Standard_D2s_v3"
tectonic_azure_location = "westus2"
tectonic_azure_master_vm_size = "Standard_D2s_v3"
tectonic_azure_worker_storage_type = "Premium_LRS"
tectonic_azure_worker_vm_size = "Standard_D8s_v3"
tectonic_calico_network_policy = false
tectonic_cl_channel = "stable"
tectonic_cluster_cidr = "10.2.0.0/16"
tectonic_etcd_count = "0"
tectonic_experimental = true
tectonic_master_count = "3"
tectonic_service_cidr = "10.3.0.0/16"
tectonic_vanilla_k8s = false
tectonic_worker_count = "5"

jbucar, Nov 09 '17 15:11

More info about this bug: we had to restart all masters, and now we get the following error:

AttachVolume.Attach failed for volume "pvc-XXXX" : Attach volume "kubernetes-dynamic-pvc-XXXX" to instance "YYYY-worker-N" failed with compute.VirtualMachinesClient#CreateOrUpdate: Failure responding to request: StatusCode=409 -- Original Error: autorest/azure: Service returned an error. Status=409 Code="ConflictingUserInput" Message="Disk '/subscriptions/SSSS/resourceGroups/GGGG/providers/Microsoft.Compute/disks/kubernetes-dynamic-pvc-XXXX' cannot be attached as the disk is already owned by VM '/subscriptions/SSSS/resourceGroups/GGGG/providers/Microsoft.Compute/virtualMachines/YYYYY-worker-N'."

I can fix this error using:

az vm disk detach -g GGGG --vm-name YYYY-worker-N --name kubernetes-dynamic-pvc-XXXX

This error occurs very frequently.
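
Since the 409 message already names the disk and the VM that owns it, the detach workaround can be derived from the error text. The following is only a sketch, assuming the message keeps the format quoted above; the function name `detach_command` and the regular expressions are illustrative and not part of any tool. It builds the argument list but does not run it.

```python
import re
import shlex

# Azure resource IDs as they appear in the 409 "ConflictingUserInput" message.
# Assumption: the message format matches the one quoted in this thread.
_DISK_RE = re.compile(
    r"Disk '/subscriptions/[^/]+/resourceGroups/[^/]+"
    r"/providers/Microsoft\.Compute/disks/(?P<disk>[^/']+)'"
)
_VM_RE = re.compile(
    r"owned by VM '/subscriptions/[^/]+/resourceGroups/(?P<group>[^/]+)"
    r"/providers/Microsoft\.Compute/virtualMachines/(?P<vm>[^/']+)'"
)


def detach_command(error_message):
    """Return the `az vm disk detach` argv for a 409 attach error, or None."""
    disk = _DISK_RE.search(error_message)
    vm = _VM_RE.search(error_message)
    if disk is None or vm is None:
        return None  # not the "already owned by VM" error
    return [
        "az", "vm", "disk", "detach",
        "-g", vm.group("group"),      # resource group of the owning VM
        "--vm-name", vm.group("vm"),
        "--name", disk.group("disk"),
    ]


if __name__ == "__main__":
    sample = (
        "Disk '/subscriptions/SSSS/resourceGroups/GGGG/providers/"
        "Microsoft.Compute/disks/kubernetes-dynamic-pvc-XXXX' cannot be "
        "attached as the disk is already owned by VM '/subscriptions/SSSS/"
        "resourceGroups/GGGG/providers/Microsoft.Compute/virtualMachines/"
        "YYYY-worker-N'."
    )
    cmd = detach_command(sample)
    # Print the command for review instead of executing it.
    print(" ".join(shlex.quote(part) for part in cmd))
```

Printing the command rather than executing it leaves room to check that the disk really is stale on that VM before detaching.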

jbucar, Nov 09 '17 20:11

The log from kubelet:

Nov 17 20:48:05 WORKER_X kubelet-wrapper[872]: E1117 20:48:05.235079 872 nestedpendingoperations.go:262] Operation for ""kubernetes.io/azure-disk//subscriptions/SUBSCRIPTION_XXX/resourceGroups/RESOURCEGROUP_XXX/providers/Microsoft.Compute/disks/DISK_XXXX"" failed. No retries permitted until 2017-11-17 20:50:07.235056168 +0000 UTC (durationBeforeRetry 2m2s). Error: Volume not attached according to node status for volume "DISK_XXXX" (UniqueName: "kubernetes.io/azure-disk//subscriptions/SUBSCRIPTION_XXX/resourceGroups/RESOURCEGROUP_XXX/providers/Microsoft.Compute/disks/DISK_XXXX") pod "YYYY-2" (UID: "10fcd16c-cbd7-11e7-9461-000d3afd5b78")
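
The `durationBeforeRetry 2m2s` in this log line looks like a capped exponential backoff. If so, the operation had already failed several times in a row. A minimal sketch of the arithmetic follows, assuming a 500ms initial delay that doubles after each failure up to a 2m2s ceiling. These constants are recalled from the upstream Kubernetes source and are not confirmed anywhere in this thread, so check them against your version.

```python
# Assumed constants (not confirmed in this thread; check your Kubernetes version).
INITIAL_DELAY_S = 0.5   # 500ms after the first failure
MAX_DELAY_S = 122.0     # 2m2s ceiling, as seen in the log line


def delay_after(failures):
    """Seconds to wait before the next retry after `failures` consecutive failures."""
    if failures < 1:
        raise ValueError("failures must be >= 1")
    # Double the delay on every additional failure, then clamp to the ceiling.
    return min(INITIAL_DELAY_S * 2 ** (failures - 1), MAX_DELAY_S)


def failures_to_reach_cap():
    """Smallest number of consecutive failures at which the delay hits the ceiling."""
    n = 1
    while delay_after(n) < MAX_DELAY_S:
        n += 1
    return n


if __name__ == "__main__":
    for n in range(1, failures_to_reach_cap() + 1):
        print(n, delay_after(n))
```

Under these assumptions the delay reaches 2m2s on the ninth consecutive failure, so a log line showing that value suggests the volume had been stuck for some minutes.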

jbucar, Nov 21 '17 17:11

Also happens to us.

ghost, Mar 12 '18 12:03