aws-efs-csi-driver
Unable to deploy efs-csi-controller to Fargate to support Karpenter-provisioned EKS cluster
/kind bug
What happened?
- I am using Terraform to manage AWS resources.
- I tried to deploy, via Terraform, an EKS cluster with no nodes, but with the EFS CSI Add-On (and others). Nodes to be provisioned by Karpenter. The Karpenter controller itself is deployed to Fargate.
- Karpenter provisions EC2 nodes on demand to run Kubernetes Pods.
- I want the Pods (on EC2, provisioned by Karpenter) to have access to EFS.
- Terraform fails to deploy the EKS cluster because the EFS Add-On never becomes ready (reports status as "Degraded"). I believe this is similar to EBS CSI ISSUE #1801: the controller pods need to be running for the Add-On to report being healthy, but they have no place to run.
- I added a Fargate profile targeting the label `app = "efs-csi-controller"` so that the EFS controller would be launched on Fargate (see the Terraform sketch after this list).
- The Add-On still would not become healthy because the communication sockets were not created/available, and it still reported status as "Degraded".
- After Karpenter was deployed, it started nodes, and the `efs-csi-node` DaemonSet successfully deployed to the EC2 nodes, but the `efs-csi-controller` Pods were still in CrashLoopBackOff and the Add-On still reported status as "Degraded".
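For reference, this is roughly how the Add-On and the Fargate profile are declared, as a minimal sketch only: the resource names, the pod execution role, and the subnet IDs are placeholders rather than my actual configuration.

```hcl
# Minimal sketch of the relevant Terraform. Resource names, the pod
# execution role, and the subnet IDs are placeholders.
resource "aws_eks_addon" "efs_csi" {
  cluster_name  = aws_eks_cluster.this.name
  addon_name    = "aws-efs-csi-driver"
  addon_version = "v1.5.8-eksbuild.1"
}

# Fargate profile intended to catch the controller Pods in kube-system
# via their app=efs-csi-controller label.
resource "aws_eks_fargate_profile" "efs_csi_controller" {
  cluster_name           = aws_eks_cluster.this.name
  fargate_profile_name   = "efs-csi-controller"
  pod_execution_role_arn = aws_iam_role.fargate_pod_execution.arn
  subnet_ids             = var.private_subnet_ids

  selector {
    namespace = "kube-system"
    labels = {
      app = "efs-csi-controller"
    }
  }
}
```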
What you expected to happen?
The controller pods would be deployed to Fargate and work without the Node component, and the Add-On would report status as "Active". As EC2 Nodes were provisioned, controller Pods would work from Fargate while Node Pods worked properly on EC2 Nodes.
How to reproduce it (as minimally and precisely as possible)?
See "What happened" above.
Anything else we need to know?:
The failure that is reported to Kubernetes comes from the `efs-plugin` container exiting with an error. IMHO it should not try to run on Fargate, and probably should not be deployed as part of the controller for this reason.
Environment
- Kubernetes version (use `kubectl version`): v1.27.4-eks-2d98532
- Driver version: v1.5.8-eksbuild.1
Please also attach debug logs to help us better diagnose
Log excerpts (each one just keeps repeating the quoted excerpt):
efs-csi-controller csi-provisioner
W0816 04:26:59.779601 1 connection.go:183] Still connecting to unix:///var/lib/csi/sockets/pluginproxy/csi.sock
efs-csi-controller liveness-probe
W0816 04:27:00.989300 1 connection.go:173] Still connecting to unix:///csi/csi.sock
efs-csi-controller efs-plugin
I0816 05:54:46.413768 1 config_dir.go:63] Mounted directories do not exist, creating directory at '/etc/amazon/efs'
I0816 05:54:46.418766 1 metadata.go:63] getting MetadataService...
I0816 05:54:52.757469 1 metadata.go:71] retrieving metadata from Kubernetes API
F0816 05:54:52.773395 1 driver.go:56] could not get metadata: did not find aws instance ID in node providerID string
I also have the same issue. I would like to run the controllers on Fargate, and have them attach EFS volumes to actual nodes that are then provisioned by Karpenter.
#1195 isn't sufficient for Fargate support. The latest EKS addon v1.7.6-eksbuild.1 sets `securityContext.privileged: true` for controller pods. This isn't supported by Fargate nodes.
Please reopen.
/reopen
@z0rc: You can't reopen an issue/PR unless you authored it or you are a collaborator.
In response to this:
/reopen
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
@Nuru could you reopen the ticket please?
/reopen
It looks like the changes in #1195 were necessary, but not sufficient.
@Nuru: Reopened this issue.
In response to this:
/reopen
It looks like the changes in #1195 were necessary, but not sufficient.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Just ran into the same situation: I can't deploy the add-on because kube-system is a Fargate namespace. Same context: Karpenter + Fargate cluster. I will switch to the manual installation mode (roughly the Helm-based approach sketched below), but that seems like a waste of time. Allowing the controllers to run on Fargate would be great, thanks.
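For anyone taking the same workaround, here is a minimal Terraform sketch of a manual installation using the upstream Helm chart; the helm provider wiring and the chart values shown are assumptions and should be checked against the chart documentation.

```hcl
# Hypothetical sketch: install the driver from the upstream Helm chart
# instead of the EKS Add-On. Assumes the helm provider is already
# configured against the cluster.
resource "helm_release" "aws_efs_csi_driver" {
  name       = "aws-efs-csi-driver"
  namespace  = "kube-system"
  repository = "https://kubernetes-sigs.github.io/aws-efs-csi-driver/"
  chart      = "aws-efs-csi-driver"

  # Example override only; verify available values in the chart's values.yaml.
  set {
    name  = "controller.serviceAccount.create"
    value = "true"
  }
}
```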
We're facing the same issue as the previous commenter.