aws-efs-csi-driver Could not start amazon-efs-mount-watchdog

What happened?

I have two regions: the main one, us-east-1, and a replica, us-west-1, with VPC peering configured between them. EFS is deployed in us-east-1. It has two inbound rules for port 2049: one for 172.240.0.0/16 (the VPC in the main region) and one for 172.241.0.0/16 (the VPC in the replica region).
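As a quick sanity check on the inbound rules, here is a minimal sketch that tests whether a client address falls inside one of the two allowed CIDR blocks. The CIDRs come from the description above; the sample client addresses and the helper name `is_allowed` are hypothetical and only for illustration.

```python
import ipaddress

# CIDR blocks taken from the two inbound rules described above.
ALLOWED_CIDRS = ["172.240.0.0/16", "172.241.0.0/16"]


def is_allowed(client_ip: str, cidrs=ALLOWED_CIDRS) -> bool:
    """Return True if client_ip belongs to at least one of the CIDR blocks."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in ipaddress.ip_network(cidr) for cidr in cidrs)


if __name__ == "__main__":
    # Hypothetical node addresses: one in the replica VPC, one outside both VPCs.
    for ip in ("172.241.10.5", "10.0.0.5"):
        print(ip, is_allowed(ip))
```

This only checks the address ranges; it does not confirm that routing over the VPC peering connection actually works.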

The EKS cluster in us-east-1 connects to EFS without problems. The EKS cluster in us-west-1 can't connect to EFS; the efs-csi-node pods show:

Mounting command: mount
Mounting arguments: -t efs -o tls fs-029eae49f385be828:/test /var/lib/kubelet/pods/ac6e24ee-46bf-41ba-31a8-b4cb3ad211da/volumes/kubernetes.io~csi/efs-tm2tb/mount
Output: Could not start amazon-efs-mount-watchdog, unrecognized init system "aws-efs-csi-dri"
b'mount.nfs4: mount system call failed'
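A possible reading of the first output line, offered as an assumption rather than something confirmed in this thread: Linux keeps at most 15 characters of a process name in /proc/<pid>/comm, so if the mount helper takes the init system name from the PID 1 process name, a container whose PID 1 is aws-efs-csi-driver would be reported as "aws-efs-csi-dri". Under that reading the watchdog line would be a side effect of running in a container, and the mount.nfs4 line would be the actual failure. The sketch below only illustrates the truncation; `comm_name` is a hypothetical helper.

```python
# Linux stores at most 15 visible characters of a process name in /proc/<pid>/comm.
COMM_MAX_VISIBLE = 15


def comm_name(executable_name: str) -> str:
    """Return the name as it would appear after truncation to 15 characters."""
    return executable_name[:COMM_MAX_VISIBLE]


if __name__ == "__main__":
    print(comm_name("aws-efs-csi-driver"))  # prints "aws-efs-csi-dri"
```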

The efs-csi-node configuration contains:

node:
  hostAliases:
    "fs-029eae49f385be828":
      ip: 172.240.116.107
      region: us-west-1

The pod that is trying to mount the volume reports:

Unable to attach or mount volumes: unmounted volumes=[persistent-storage-efs], unattached volumes=[persistent-storage-efs cachedir nltk-data kube-api-access-wzdv8]: timed out waiting for the condition

BTW, I spun up an Ubuntu pod and successfully mounted the EFS from it with the command below, so the replica EKS cluster does have access to the EFS:

mount -t nfs4 -o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport 172.240.116.107:/ efs

What you expected to happen?

I expected my deployment in the replica region to connect to the EFS.

Anything else we need to know?:

StorageClass config

storageClasses: 
- name: efs-sc
  parameters:
    provisioningMode: efs-ap
  reclaimPolicy: Delete
  volumeBindingMode: Immediate

Environment

Kubernetes version (use kubectl version): 1.21
Driver version: 2.2.4

Sep 05 '22 14:09 OuFinx

I checked again and got the same result. What am I doing wrong?

Sep 13 '22 07:09 OuFinx

I have the same problem. I followed the installation instructions (1, 2), deployed a pod, and it failed to connect with the same error. In my case, though, everything is in the same region and in the same VPC. I use EKS with managed nodes, deployed with Helm chart version 2.3.2.

Nov 17 '22 19:11 tropnikovvl

I got the same error message.
I fixed it by adding the following permissions to the policy of the IAM role that is assumed by the EFS node pods:

elasticfilesystem:ClientMount
elasticfilesystem:ClientWrite
elasticfilesystem:ClientRootAccess
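For anyone who wants to see the shape of such a policy, here is a minimal sketch that builds an IAM policy document containing the three actions listed above. The helper name `build_efs_client_policy`, the account ID, and the file system ID in the example ARN are placeholders, not values from this thread; scope the resource to your own file system.

```python
import json

# The three EFS client actions listed in the comment above.
EFS_CLIENT_ACTIONS = [
    "elasticfilesystem:ClientMount",
    "elasticfilesystem:ClientWrite",
    "elasticfilesystem:ClientRootAccess",
]


def build_efs_client_policy(file_system_arn: str) -> dict:
    """Build an IAM policy document that allows the EFS client actions on one file system."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": EFS_CLIENT_ACTIONS,
                "Resource": file_system_arn,
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder ARN: replace the account ID and file system ID with your own.
    arn = "arn:aws:elasticfilesystem:us-east-1:111122223333:file-system/fs-EXAMPLE"
    print(json.dumps(build_efs_client_policy(arn), indent=2))
```

The printed JSON can be attached to the role as an inline or managed policy; how the role is bound to the node pods depends on your setup and is not covered here.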

Dec 15 '22 11:12 nichoio

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Mark this issue as fresh with /remove-lifecycle stale
Close this issue with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

Mar 22 '23 18:03 k8s-triage-robot

/lifecycle rotten

Apr 21 '23 18:04 k8s-triage-robot

/kind support

May 15 '23 14:05 RyanStan

Hi @OuFinx, are you still facing the issue?

May 18 '23 13:05 mskanth972

/close not-planned

Jun 17 '23 14:06 k8s-triage-robot

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Jun 17 '23 14:06 k8s-ci-robot

/reopen

Oct 02 '23 14:10 sdrabblescripta

@sdrabblescripta: You can't reopen an issue/PR unless you authored it or you are a collaborator.

Oct 02 '23 14:10 k8s-ci-robot

/remove-lifecycle rotten

Oct 02 '23 14:10 sdrabblescripta
