azurefile-csi-driver
Unable to mount Azure File Share as inline volume
What happened: Following the instructions on this page: https://docs.microsoft.com/en-us/azure/aks/azure-files-volume, pods will not start because of MountVolume.SetUp failures.
kubectl describe says:
MountVolume.SetUp failed for volume "elided" : rpc error: code = InvalidArgument desc = failed to get account name from csi-d08c45243d7cf6b57eb37e8432ba6327930301019e89b1a80bd7e329ee988a6a
Kubelet log says:
Sep 04 01:17:14 aks-nodepool1-51635614-vmss000008 kubelet[3426]: E0904 01:17:14.767450 3426 nestedpendingoperations.go:335] Operation for "{volumeName:kubernetes.io/csi/af8c4ef0-015b-42d6-8b4f-6ff47556966e-elided podName:af8c4ef0-015b-42d6-8b4f-xxxxxxxxxxxx nodeName:}" failed. No retries permitted until 2022-09-04 01:17:46.767423942 +0000 UTC m=+3146510.790991837 (durationBeforeRetry 32s). Error: MountVolume.SetUp failed for volume "elided" (UniqueName: "kubernetes.io/csi/af8c4ef0-015b-42d6-8b4f-6ff47556966e-elided") pod "i2ksource-768d59c487-rq7zj" (UID: "af8c4ef0-015b-42d6-8b4f-6ff47556966e") : rpc error: code = InvalidArgument desc = failed to get account name from csi-51e5b168c4c31742f07e9aacd5c8be54e7728bf284fa9fa158ce6e031f750934
What you expected to happen: Expected the Azure file share to mount into the 3 pods that request it.
How to reproduce it: Add this to the volumes array in the PodSpec:
volumes:
  - csi:
      driver: file.csi.azure.com
      readOnly: false
      volumeAttributes:
        secretName: mysecret-1
        shareName: elided
    name: myvolname
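For completeness, here is a minimal sketch of a full Pod using that inline volume; the pod name, image, and mount path are placeholders, not taken from the original spec:

apiVersion: v1
kind: Pod
metadata:
  name: inline-azurefile-test          # placeholder name
spec:
  containers:
    - name: app
      image: mcr.microsoft.com/oss/nginx/nginx:1.15.5-alpine   # placeholder image
      volumeMounts:
        - name: myvolname
          mountPath: /mnt/azure        # placeholder mount path
  volumes:
    - name: myvolname
      csi:
        driver: file.csi.azure.com
        readOnly: false
        volumeAttributes:
          secretName: mysecret-1
          shareName: elided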
Create this secret:
apiVersion: v1
kind: Secret
type: Opaque
metadata:
  name: mysecret-1
data:
  azurestorageaccountkey: ---it's a secret---
  azurestorageaccountname: mystorageaccountname
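Note that values under data: must be base64-encoded, so the plain values above are illustrative only. An equivalent sketch using stringData, which accepts plain text and is encoded by the API server:

apiVersion: v1
kind: Secret
type: Opaque
metadata:
  name: mysecret-1
stringData:
  azurestorageaccountname: mystorageaccountname
  azurestorageaccountkey: ---it's a secret---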
Anything else we need to know?: I have tried multiple permutations of options:
- Tried readOnly both false and true (didn't expect a difference, but the docs show readOnly=false).
- Tried supplying mountOptions (for when readOnly is true) to set file and dir mode to 0755; see the sketch after this list.
- Tried adding resourceGroup to volumeAttributes, in case it mattered, since it does for PV mode.
- Tried with the share in a different resource group and in the same resource group as the cluster.
- Confirmed that the share is reachable by running nc against storage-account-name.file.core.windows.net on port 445.
- The cluster is in Central US and the storage account is in West US. Is this an issue?
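For illustration, a sketch of one of those permutations (readOnly with mount options and an explicit resource group); the resource group value is a placeholder, and the comma-separated mountOptions attribute follows the linked AKS docs page:

volumes:
  - csi:
      driver: file.csi.azure.com
      readOnly: true
      volumeAttributes:
        secretName: mysecret-1
        shareName: elided
        resourceGroup: elided-rg                     # placeholder resource group
        mountOptions: dir_mode=0755,file_mode=0755   # per the readOnly attempt above
    name: myvolname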
Environment:
- CSI Driver version: mcr.microsoft.com/oss/kubernetes-csi/azurefile-csi:v1.20.0
- Kubernetes version (use kubectl version): 1.22.6
- OS (e.g. from /etc/os-release): Node is Ubuntu 18.04.6 LTS
- Kernel (e.g. uname -a): Linux aks-nodepool1-51635614-vmss000008 5.4.0-1083-azure #87~18.04.1-Ubuntu SMP Fri Jun 3 13:19:07 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
- Install tools:
- Others:
What is the pod namespace? The secret namespace should be the same as the pod namespace.
both are in default
Please provide driver logs from the agent node: https://github.com/kubernetes-sigs/azurefile-csi-driver/blob/master/docs/csi-debug.md#case2-volume-mountunmount-failed
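Roughly, per that guide: find the csi-azurefile-node pod running on the affected node, then pull the azurefile container's log, e.g.

kubectl get pods -n kube-system -l app=csi-azurefile-node -o wide
kubectl logs <csi-azurefile-node pod on that node> -c azurefile -n kube-system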
Ah. Looked at the logs for one of the azurefile pods in kube-system:
E0904 01:48:03.407026 1 utils.go:81] GRPC error: rpc error: code = InvalidArgument desc = failed to get account name from csi-d08c45243d7cf6b57eb37e8432ba6327930301019e89b1a80bd7e329ee988a6a
I0904 01:48:55.147770 1 utils.go:76] GRPC call: /csi.v1.Node/NodePublishVolume
I0904 01:48:55.147790 1 utils.go:77] GRPC request: {"target_path":"/var/lib/kubelet/pods/b741a44b-9af5-4549-8456-26b86e3333b0/volumes/kubernetes.io~csi/elided/mount","volume_capability":{"AccessType":{"Mount":{}},"access_mode":{"mode":7}},"volume_context":{"csi.storage.k8s.io/ephemeral":"true","csi.storage.k8s.io/pod.name":"i2ksource-768d59c487-9fg4x","csi.storage.k8s.io/pod.namespace":"default","csi.storage.k8s.io/pod.uid":"b741a44b-9af5-4549-8456-26b86e3333b0","csi.storage.k8s.io/serviceAccount.name":"default","secretName":"mysecret-1","shareName":"elided"},"volume_id":"csi-967bc7df0745118486621ab441e8b015c7b0bb7fb7b283cc92cb84b847fccd2e"}
I0904 01:48:55.147876 1 nodeserver.go:68] NodePublishVolume: ephemeral volume(csi-967bc7df0745118486621ab441e8b015c7b0bb7fb7b283cc92cb84b847fccd2e) mount on /var/lib/kubelet/pods/b741a44b-9af5-4549-8456-26b86e3333b0/volumes/kubernetes.io~csi/elided/mount, VolumeContext: map[csi.storage.k8s.io/ephemeral:true csi.storage.k8s.io/pod.name:i2ksource-768d59c487-9fg4x csi.storage.k8s.io/pod.namespace:default csi.storage.k8s.io/pod.uid:b741a44b-9af5-4549-8456-26b86e3333b0 csi.storage.k8s.io/serviceAccount.name:default getaccountkeyfromsecret:true secretName:mysecret-1 secretnamespace:default shareName:elided storageaccount:]
W0904 01:48:55.147901 1 azurefile.go:575] parsing volumeID(csi-967bc7df0745118486621ab441e8b015c7b0bb7fb7b283cc92cb84b847fccd2e) return with error: error parsing volume id: "csi-967bc7df0745118486621ab441e8b015c7b0bb7fb7b283cc92cb84b847fccd2e", should at least contain two #
E0904 01:48:55.147922 1 utils.go:81] GRPC error: rpc error: code = InvalidArgument desc = failed to get account name from csi-967bc7df0745118486621ab441e8b015c7b0bb7fb7b283cc92cb84b847fccd2e
Is this a useful diagnostic?
Output from kubectl describe pod for the above azurefile pod:
Name: csi-azurefile-node-489gm
Namespace: kube-system
Priority: 2000001000
Priority Class Name: system-node-critical
Node: aks-nodepool1-51635614-vmss000008/10.224.0.5
Start Time: Mon, 08 Aug 2022 23:04:04 -0600
Labels: app=csi-azurefile-node
controller-revision-hash=5bc9876b75
pod-template-generation=3
Annotations: <none>
Status: Running
IP: 10.224.0.5
IPs:
IP: 10.224.0.5
Controlled By: DaemonSet/csi-azurefile-node
Containers:
liveness-probe:
Container ID: containerd://6588b07244d733d995b68616b6b454e9a6c81984e65532861c00861eb9182ca5
Image: mcr.microsoft.com/oss/kubernetes-csi/livenessprobe:v2.6.0
Image ID: sha256:f97059c7b56f8e771da0f91e81e495d8f722125c1342148db5970be7b4c16570
Port: <none>
Host Port: <none>
Args:
--csi-address=/csi/csi.sock
--probe-timeout=3s
--health-port=29613
--v=2
State: Running
Started: Mon, 08 Aug 2022 23:04:05 -0600
Ready: True
Restart Count: 0
Limits:
memory: 100Mi
Requests:
cpu: 10m
memory: 20Mi
Environment:
KUBERNETES_PORT_443_TCP_ADDR: dme-test-c-dme-az-rg-ec9c07-1023384a.hcp.centralus.azmk8s.io
KUBERNETES_PORT: tcp://dme-test-c-dme-az-rg-ec9c07-1023384a.hcp.centralus.azmk8s.io:443
KUBERNETES_PORT_443_TCP: tcp://dme-test-c-dme-az-rg-ec9c07-1023384a.hcp.centralus.azmk8s.io:443
KUBERNETES_SERVICE_HOST: dme-test-c-dme-az-rg-ec9c07-1023384a.hcp.centralus.azmk8s.io
Mounts:
/csi from socket-dir (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-wnccb (ro)
node-driver-registrar:
Container ID: containerd://5a533603428ae9600d0b59a2f3d18381665252dba5760b5f09ece1be9c240859
Image: mcr.microsoft.com/oss/kubernetes-csi/csi-node-driver-registrar:v2.5.0
Image ID: sha256:9f080292bc0292635062a424791004c647437fd6cb107580beeb946673d399c8
Port: <none>
Host Port: <none>
Args:
--csi-address=$(ADDRESS)
--kubelet-registration-path=$(DRIVER_REG_SOCK_PATH)
--v=2
State: Running
Started: Mon, 08 Aug 2022 23:04:05 -0600
Ready: True
Restart Count: 0
Limits:
memory: 100Mi
Requests:
cpu: 10m
memory: 20Mi
Liveness: exec [/csi-node-driver-registrar --kubelet-registration-path=$(DRIVER_REG_SOCK_PATH) --mode=kubelet-registration-probe] delay=60s timeout=30s period=10s #success=1 #failure=3
Environment:
ADDRESS: /csi/csi.sock
DRIVER_REG_SOCK_PATH: /var/lib/kubelet/plugins/file.csi.azure.com/csi.sock
KUBERNETES_PORT_443_TCP_ADDR: dme-test-c-dme-az-rg-ec9c07-1023384a.hcp.centralus.azmk8s.io
KUBERNETES_PORT: tcp://dme-test-c-dme-az-rg-ec9c07-1023384a.hcp.centralus.azmk8s.io:443
KUBERNETES_PORT_443_TCP: tcp://dme-test-c-dme-az-rg-ec9c07-1023384a.hcp.centralus.azmk8s.io:443
KUBERNETES_SERVICE_HOST: dme-test-c-dme-az-rg-ec9c07-1023384a.hcp.centralus.azmk8s.io
Mounts:
/csi from socket-dir (rw)
/registration from registration-dir (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-wnccb (ro)
azurefile:
Container ID: containerd://4fb70f32da1f88e54717f2aa4a0c7d638483affc7d528c9d0865b695f6b8b080
Image: mcr.microsoft.com/oss/kubernetes-csi/azurefile-csi:v1.20.0
Image ID: mcr.microsoft.com/oss/kubernetes-csi/azurefile-csi@sha256:a7e8252e01acad4bb1e987ed5e873bfec1602226c36a58ebeb6e7d99133eb549
Port: 29613/TCP
Host Port: 29613/TCP
Args:
--v=2
--endpoint=$(CSI_ENDPOINT)
--nodeid=$(KUBE_NODE_NAME)
--metrics-address=0.0.0.0:29615
--enable-get-volume-stats=true
--mount-permissions=0777
State: Running
Started: Mon, 08 Aug 2022 23:04:07 -0600
Ready: True
Restart Count: 0
Limits:
memory: 400Mi
Requests:
cpu: 10m
memory: 20Mi
Liveness: http-get http://:healthz/healthz delay=30s timeout=10s period=30s #success=1 #failure=5
Environment:
AZURE_CREDENTIAL_FILE: /etc/kubernetes/azure.json
CSI_ENDPOINT: unix:///csi/csi.sock
KUBE_NODE_NAME: (v1:spec.nodeName)
KUBERNETES_PORT_443_TCP_ADDR: dme-test-c-dme-az-rg-ec9c07-1023384a.hcp.centralus.azmk8s.io
KUBERNETES_PORT: tcp://dme-test-c-dme-az-rg-ec9c07-1023384a.hcp.centralus.azmk8s.io:443
KUBERNETES_PORT_443_TCP: tcp://dme-test-c-dme-az-rg-ec9c07-1023384a.hcp.centralus.azmk8s.io:443
KUBERNETES_SERVICE_HOST: dme-test-c-dme-az-rg-ec9c07-1023384a.hcp.centralus.azmk8s.io
Mounts:
/csi from socket-dir (rw)
/dev from device-dir (rw)
/etc/kubernetes/ from azure-cred (rw)
/var/lib/kubelet/ from mountpoint-dir (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-wnccb (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
socket-dir:
Type: HostPath (bare host directory volume)
Path: /var/lib/kubelet/plugins/file.csi.azure.com
HostPathType: DirectoryOrCreate
mountpoint-dir:
Type: HostPath (bare host directory volume)
Path: /var/lib/kubelet/
HostPathType: DirectoryOrCreate
registration-dir:
Type: HostPath (bare host directory volume)
Path: /var/lib/kubelet/plugins_registry/
HostPathType: DirectoryOrCreate
azure-cred:
Type: HostPath (bare host directory volume)
Path: /etc/kubernetes/
HostPathType: DirectoryOrCreate
device-dir:
Type: HostPath (bare host directory volume)
Path: /dev
HostPathType: Directory
kube-api-access-wnccb:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: :NoExecute op=Exists
:NoSchedule op=Exists
CriticalAddonsOnly op=Exists
Events: <none>
Can't find any cifs or nfs mounts.
Here's an elided azure.json:
{
"cloud": "AzurePublicCloud",
"tenantId": "elided--but it is current,
"subscriptionId": "elided--but it is correct",
"aadClientId": "msi",
"aadClientSecret": "msi",
"resourceGroup": "MC_dme-az-rg_dme-test-cluster_centralus",
"location": "centralus",
"vmType": "vmss",
"subnetName": "aks-subnet",
"securityGroupName": "aks-agentpool-75853172-nsg",
"vnetName": "aks-vnet-75853172",
"vnetResourceGroup": "",
"routeTableName": "aks-agentpool-75853172-routetable",
"primaryAvailabilitySetName": "",
"primaryScaleSetName": "aks-nodepool1-51635614-vmss",
"cloudProviderBackoffMode": "v2",
"cloudProviderBackoff": true,
"cloudProviderBackoffRetries": 6,
"cloudProviderBackoffDuration": 5,
"cloudProviderRateLimit": true,
"cloudProviderRateLimitQPS": 10,
"cloudProviderRateLimitBucket": 100,
"cloudProviderRateLimitQPSWrite": 10,
"cloudProviderRateLimitBucketWrite": 100,
"useManagedIdentityExtension": true,
"userAssignedIdentityID": "2156c44d-0396-49a8-8aed-882eb8f77def",
"useInstanceMetadata": true,
"loadBalancerSku": "Standard",
"disableOutboundSNAT": false,
"excludeMasterFromStandardLB": true,
"providerVaultName": "",
"maximumLoadBalancerRuleCount": 250,
"providerKeyName": "k8s",
"providerKeyVersion": ""
}
Does the PV example work? https://docs.microsoft.com/en-us/azure/aks/azure-files-volume#mount-file-share-as-a-persistent-volume
And could you provide full logs of the driver on that node? Thanks.
Yes, if I convert the example to use a persistent volume and a persistent volume claim, and add the pvc to the mounts of the pods that need it, it works.
Do you need logs for the driver in this case?
yes, thanks.
Thanks for your help. Here's the end of the azurefile container driver log, where the PV mounting worked. I've elided the name of the share and the name of our storage account. I'd be happy to send you the unedited log through a secure channel if that helps.
I0904 05:17:42.627915 1 utils.go:76] GRPC call: /csi.v1.Node/NodeStageVolume
I0904 05:17:42.627934 1 utils.go:77] GRPC request: {"secrets":"***stripped***","staging_target_path":"/var/lib/kubelet/plugins/kubernetes.io/csi/pv/elided-pv/globalmount","volume_capability":{"AccessType":{"Mount":{"mount_flags":["dir_mode=0755","file_mode=0755","uid=1001","gid=0"]}},"access_mode":{"mode":3}},"volume_context":{"resourceGroup":"dme-az-rg","shareName":"elided"},"volume_id":"elided-vh"}
W0904 05:17:42.628038 1 azurefile.go:575] parsing volumeID(elided-vh) return with error: error parsing volume id: "elided-vh", should at least contain two #
I0904 05:17:42.628078 1 nodeserver.go:302] cifsMountPath(/var/lib/kubelet/plugins/kubernetes.io/csi/pv/elided-pv/globalmount) fstype() volumeID(elided-vh) context(map[resourceGroup:dme-az-rg shareName:elided]) mountflags([dir_mode=0755 file_mode=0755 uid=1001 gid=0]) mountOptions([dir_mode=0755 file_mode=0755 uid=1001 gid=0 actimeo=30 mfsymlinks]) volumeMountGroup()
I0904 05:17:43.329548 1 nodeserver.go:332] volume(elided-vh) mount //mystorageaccount.file.core.windows.net/elided on /var/lib/kubelet/plugins/kubernetes.io/csi/pv/elided-pv/globalmount succeeded
I0904 05:17:43.329581 1 utils.go:83] GRPC response: {}
I0904 05:17:43.339908 1 utils.go:76] GRPC call: /csi.v1.Node/NodePublishVolume
I0904 05:17:43.339924 1 utils.go:77] GRPC request: {"staging_target_path":"/var/lib/kubelet/plugins/kubernetes.io/csi/pv/elided-pv/globalmount","target_path":"/var/lib/kubelet/pods/c581331b-7727-403d-9bd1-075971b99929/volumes/kubernetes.io~csi/elided-pv/mount","volume_capability":{"AccessType":{"Mount":{"mount_flags":["dir_mode=0755","file_mode=0755","uid=1001","gid=0"]}},"access_mode":{"mode":3}},"volume_context":{"csi.storage.k8s.io/ephemeral":"false","csi.storage.k8s.io/pod.name":"i2kweb-6cc8ccf4bc-4qz24","csi.storage.k8s.io/pod.namespace":"default","csi.storage.k8s.io/pod.uid":"c581331b-7727-403d-9bd1-075971b99929","csi.storage.k8s.io/serviceAccount.name":"default","resourceGroup":"dme-az-rg","shareName":"elided"},"volume_id":"elided-vh"}
I0904 05:17:43.340401 1 nodeserver.go:109] NodePublishVolume: mounting /var/lib/kubelet/plugins/kubernetes.io/csi/pv/elided-pv/globalmount at /var/lib/kubelet/pods/c581331b-7727-403d-9bd1-075971b99929/volumes/kubernetes.io~csi/elided-pv/mount with mountOptions: [bind]
I0904 05:17:43.345423 1 nodeserver.go:116] NodePublishVolume: mount /var/lib/kubelet/plugins/kubernetes.io/csi/pv/elided-pv/globalmount at /var/lib/kubelet/pods/c581331b-7727-403d-9bd1-075971b99929/volumes/kubernetes.io~csi/elided-pv/mount successfully
I0904 05:17:43.345448 1 utils.go:83] GRPC response: {}
I0904 05:17:45.440557 1 utils.go:76] GRPC call: /csi.v1.Node/NodePublishVolume
I0904 05:17:45.440577 1 utils.go:77] GRPC request: {"staging_target_path":"/var/lib/kubelet/plugins/kubernetes.io/csi/pv/elided-pv/globalmount","target_path":"/var/lib/kubelet/pods/4aadf3a3-b2cc-4e88-b02b-3aaf1cd4cba9/volumes/kubernetes.io~csi/elided-pv/mount","volume_capability":{"AccessType":{"Mount":{"mount_flags":["dir_mode=0755","file_mode=0755","uid=1001","gid=0"]}},"access_mode":{"mode":3}},"volume_context":{"csi.storage.k8s.io/ephemeral":"false","csi.storage.k8s.io/pod.name":"i2ksource-68b4db79bf-knc9g","csi.storage.k8s.io/pod.namespace":"default","csi.storage.k8s.io/pod.uid":"4aadf3a3-b2cc-4e88-b02b-3aaf1cd4cba9","csi.storage.k8s.io/serviceAccount.name":"default","resourceGroup":"dme-az-rg","shareName":"elided"},"volume_id":"elided-vh"}
I0904 05:17:45.441159 1 nodeserver.go:109] NodePublishVolume: mounting /var/lib/kubelet/plugins/kubernetes.io/csi/pv/elided-pv/globalmount at /var/lib/kubelet/pods/4aadf3a3-b2cc-4e88-b02b-3aaf1cd4cba9/volumes/kubernetes.io~csi/elided-pv/mount with mountOptions: [bind]
I0904 05:17:45.449300 1 nodeserver.go:116] NodePublishVolume: mount /var/lib/kubelet/plugins/kubernetes.io/csi/pv/elided-pv/globalmount at /var/lib/kubelet/pods/4aadf3a3-b2cc-4e88-b02b-3aaf1cd4cba9/volumes/kubernetes.io~csi/elided-pv/mount successfully
I0904 05:17:45.449320 1 utils.go:83] GRPC response: {}
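For reference, a PV/PVC pair roughly matching the mount options and volume context in the log above might look like the sketch below; the PV/PVC names, volumeHandle, capacity, and share name are placeholders or elided, and access_mode 3 in the log corresponds to multi-node read-only:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: elided-pv
spec:
  capacity:
    storage: 5Gi                       # placeholder size
  accessModes:
    - ReadOnlyMany                     # access_mode 3 (MULTI_NODE_READER_ONLY) in the log
  persistentVolumeReclaimPolicy: Retain
  mountOptions:
    - dir_mode=0755
    - file_mode=0755
    - uid=1001
    - gid=0
  csi:
    driver: file.csi.azure.com
    volumeHandle: elided-vh            # must be unique cluster-wide; elided as in the log
    volumeAttributes:
      resourceGroup: dme-az-rg
      shareName: elided
    nodeStageSecretRef:
      name: mysecret-1
      namespace: default
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: elided-pvc
spec:
  accessModes:
    - ReadOnlyMany
  storageClassName: ""                 # bind directly to the pre-created PV
  volumeName: elided-pv
  resources:
    requests:
      storage: 5Gi                     # placeholder size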
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle stale
- Mark this issue or PR as rotten with /lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Reopen this issue with /reopen
- Mark this issue as fresh with /remove-lifecycle rotten
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".
In response to this:
/close not-planned
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.