MountVolume.MountDevice Failure: CSI Driver "csi.trident.netapp.io" Not Found in Registered CSI Drivers

Open: sonic2825 opened this issue 9 months ago • 1 comment

Description

I’m encountering an issue with the Trident CSI driver in my Kubernetes cluster where a pod fails to mount a volume due to the CSI driver csi.trident.netapp.io not being found in the list of registered CSI drivers. Below is a summary of the environment, error details, and relevant outputs.

Environment

  • Kubernetes Version: Rancher RKE2 1.27.8
  • Trident Version: 24.02.0
  • Kubernetes Cluster: Running on nodes like pk8s1103 (IP: 10.2.201.103)
  • Trident Namespace: stna-csr-space
  • Pod Namespace: devops
  • Current Date: March 04, 2025
  • Trident Pods: All Trident-related pods in stna-csr-space are running (details below)
  • StorageClass: stna-prod
  • PersistentVolumeClaim: jenkins-atm-data (bound to volume pvc-094e163e-9615-4670-ad53-1e2af742ebfe)

Issue

The pod jenkins-atm-58c77cc6c7-t2dtg in the devops namespace is running but has recurring events indicating a failure to mount its volume. The specific error is:

MountVolume.MountDevice failed for volume "pvc-094e163e-9615-4670-ad53-1e2af742ebfe" : kubernetes.io/csi: attacher.MountDevice failed to create newCsiDriverClient: driver name csi.trident.netapp.io not found in the list of registered CSI drivers

This suggests that the Trident CSI driver (csi.trident.netapp.io) is either not properly registered or not functioning as expected on the node pk8s1103.
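One way to cross-check this from the cluster side is to look at the node's CSINode object, which should list the CSI drivers that have registered with the kubelet on that node. The following is a minimal sketch, assuming kubectl access to the cluster; the helper names driver_in_list, registered_drivers, and check_node are hypothetical and not part of kubectl or Trident.

```shell
# Sketch only: helper names are hypothetical; node and driver names are from this report.
DRIVER="csi.trident.netapp.io"
NODE="pk8s1103"

# driver_in_list DRIVER "LIST": succeed if DRIVER is an exact entry in the
# space-separated LIST of driver names.
driver_in_list() {
  for name in $2; do
    [ "$name" = "$1" ] && return 0
  done
  return 1
}

# registered_drivers NODE: print the driver names recorded in the node's
# CSINode object (cluster-scoped, named after the node). Requires kubectl access.
registered_drivers() {
  kubectl get csinode "$1" -o jsonpath='{.spec.drivers[*].name}'
}

# check_node NODE: report whether $DRIVER appears in the node's CSINode object.
check_node() {
  if driver_in_list "$DRIVER" "$(registered_drivers "$1")"; then
    echo "$DRIVER is listed in CSINode/$1"
  else
    echo "$DRIVER is NOT listed in CSINode/$1"
  fi
}
```

Running check_node "$NODE" on a machine with cluster access prints one of the two messages. If the driver is missing from CSINode/pk8s1103, that would be consistent with the kubelet error above, though it does not by itself explain why registration failed.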

Details

Pod Description (jenkins-atm-58c77cc6c7-t2dtg)

Name:             jenkins-atm-58c77cc6c7-t2dtg
Namespace:        devops
Node:             pk8s1103/10.2.201.103
Start Time:       Fri, 31 Jan 2025 19:49:32 +0800
Status:           Running
IP:               10.42.223.28
Controlled By:    ReplicaSet/jenkins-atm-58c77cc6c7
Containers:
  container-0:
    Image:      oa-k8s-harbor.sha-dc01.metax-tech.com/pub/jenkins:2.462.2-lts-jdk11
    State:      Running
    Started:    Fri, 31 Jan 2025 19:49:36 +0800
    Ready:      True
Volumes:
  vol-7ymzc:
    Type:       PersistentVolumeClaim
    ClaimName:  jenkins-atm-data
Events:
  Warning  FailedMount  8m33s (x3109 over 4d21h)  kubelet  Unable to attach or mount volumes: unmounted volumes=[vol-7ymzc]...
  Warning  FailedMount  12s (x3476 over 4d21h)    kubelet  MountVolume.MountDevice failed for volume "pvc-094e163e-9615-4670-ad53-1e2af742ebfe"...

PVC Description (jenkins-atm-data)

Name:          jenkins-atm-data
Namespace:     devops
StorageClass:  stna-prod
Status:        Bound
Volume:        pvc-094e163e-9615-4670-ad53-1e2af742ebfe
Capacity:      10Gi
Access Modes:  RWO
Used By:       jenkins-atm-58c77cc6c7-t2dtg

Trident Pods (stna-csr-space namespace)

NAME                                  READY   STATUS    RESTARTS   AGE
trident-controller-66c675d957-jsqsz   6/6     Running   0          3d7h
trident-node-linux-2r6zl              2/2     Running   0          3d7h
trident-node-linux-68fvp              2/2     Running   0          3d9h
trident-node-linux-cnzsb              2/2     Running   0          3d9h
trident-node-linux-l7jtx              2/2     Running   0          3d9h
trident-node-linux-mlq8k              2/2     Running   0          3d7h
trident-node-linux-t92mj              2/2     Running   0          3d9h
trident-node-linux-tctwl              2/2     Running   0          3d9h
trident-node-linux-zr46x              2/2     Running   0          3d9h
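
The listing above can also be checked mechanically rather than by eye. This is a minimal sketch, assuming the default column layout of kubectl get pods (NAME, READY, STATUS, ...); the function name not_ready_pods is hypothetical.

```shell
# not_ready_pods: read a `kubectl get pods` table on stdin and print the names
# of pods whose READY column is not N/N or whose STATUS is not "Running".
# The header row and lines with fewer than three columns are skipped.
not_ready_pods() {
  awk 'NR > 1 && NF >= 3 {
    split($2, ready, "/")
    if (ready[1] != ready[2] || $3 != "Running") print $1
  }'
}
```

Usage: kubectl get pods -n stna-csr-space | not_ready_pods. For the listing above it prints nothing, since every pod is fully ready and Running. Note that a ready trident-node pod does not by itself prove that the driver is registered with the kubelet on its node.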

Trident Logs

Running ./tridentctl logs -n stna-csr-space | grep 1e2af742ebfe returned no output, so the Trident logs contain no entries for the volume ID suffix 1e2af742ebfe (the last segment of pvc-094e163e-9615-4670-ad53-1e2af742ebfe).
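
Because the error is raised by the kubelet on pk8s1103, the logs of the Trident pods scheduled on that node may be more relevant than a grep for the volume ID. A minimal sketch, assuming kubectl access; the function names volume_suffix and node_pod_logs are hypothetical, and only standard kubectl flags are used.

```shell
# Values taken from this report.
NAMESPACE="stna-csr-space"
NODE="pk8s1103"
VOLUME="pvc-094e163e-9615-4670-ad53-1e2af742ebfe"

# volume_suffix NAME: print the last dash-separated segment of a volume name,
# i.e. the short ID used in the grep above.
volume_suffix() {
  printf '%s\n' "${1##*-}"
}

# node_pod_logs NAMESPACE NODE: print timestamped logs from all containers of
# every pod in NAMESPACE that is scheduled on NODE. Requires kubectl access.
node_pod_logs() {
  for pod in $(kubectl get pods -n "$1" --field-selector "spec.nodeName=$2" -o name); do
    kubectl logs -n "$1" "$pod" --all-containers --timestamps
  done
}
```

Usage: node_pod_logs "$NAMESPACE" "$NODE" | grep -i regist narrows the output to registration-related lines, and grep "$(volume_suffix "$VOLUME")" narrows it to this volume.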

Observations

  1. The PVC jenkins-atm-data is correctly bound to the PV pvc-094e163e-9615-4670-ad53-1e2af742ebfe.
  2. All Trident pods in the stna-csr-space namespace appear healthy and running.
  3. The error suggests the CSI driver csi.trident.netapp.io is not registered on the node pk8s1103, despite the PVC being provisioned with the stna-prod StorageClass, which uses csi.trident.netapp.io.
  4. The pod has been experiencing this issue repeatedly for over 4 days (4d21h).

Request

Could you help identify why the csi.trident.netapp.io driver is not registered on the node? Is this a misconfiguration, a bug, or an issue with the Trident installation specific to Rancher RKE2 1.27.8 and Trident 24.02.0? Any troubleshooting steps or fixes would be greatly appreciated.

Steps to Reproduce

  1. Deploy a pod in the devops namespace using a PVC bound to a volume provisioned by the stna-prod StorageClass.
  2. Observe the pod events for FailedMount errors related to csi.trident.netapp.io.

Additional Info

  • Command run from: root@pk8s1109:/home/ladmin/NetappCSI/trident-installer

Thank you for your assistance!

sonic2825, Mar 04 '25 09:03

@sonic2825

Can you upgrade to the latest version of Trident (25.02) and check whether you are still hitting this issue?

VinayKumarHavanur, Mar 06 '25 05:03