terraform-aws-eks-blueprints
terraform-aws-eks-blueprints copied to clipboard
gavinbunney/kubectl provider used in vpc-cni-custom-networking is behaving inconsistently
Description
There is a bug in gavinbunney/kubectl provider which is used in vpc-cni-custom-networking example. The bug seems to be happening only on K8s version 1.27. Also, it looks like that provider is not maintained anymore (example 1, example 2, example 3).
I was following vpc-cni-custom-networking example and I've made no changes to the relevant code, gavinbunney/kubectl provider and kubectl_manifest resources.
I've created a new EKS v1.27 cluster with terraform apply successfully. Then I ran kubectl get eniconfigs.crd.k8s.amazonaws.com to inspect created ENIConfigs and the manifests looked okay. Then I ran terraform plan to check if terraform state is in sync. Terraform state was not in sync, terraform wanted to create those ENIConfigs. I've attached terraform debug logs which state that ENIConfigs resources are being removed from the state because they couldn't be found. Trying to import back those resources fails. After some time terraform plan will produce no changes again (~20,30 minutes after the last interaction with eniconfigs.crd.k8s.amazonaws.com over kubectl).
I've found a similar solution which is using hashicorp/helm provider and an empty generic helm chart. I can open PR with this change for vpc-cni-custom-networking example if it makes sense to switch providers.
Old code:
resource "kubectl_manifest" "eni_config" {
for_each = zipmap(var.azs, var.pods_subnet_ids)
yaml_body = yamlencode({
apiVersion = "crd.k8s.amazonaws.com/v1alpha1"
kind = "ENIConfig"
metadata = {
name = each.key
}
spec = {
securityGroups = [
module.eks.cluster_primary_security_group_id,
module.eks.node_security_group_id,
]
subnet = each.value
}
})
}
New code:
resource "helm_release" "eni_configs" {
name = "eni-configs"
repository = "https://dysnix.github.io/charts"
chart = "raw"
version = "v0.3.2"
values = [
yamlencode({
resources = [
for az, subnet_id in zipmap(var.azs, var.pods_subnet_ids) :
{
apiVersion = "crd.k8s.amazonaws.com/v1alpha1"
kind = "ENIConfig"
metadata = {
name = az
}
spec = {
securityGroups = [
module.eks.cluster_primary_security_group_id,
module.eks.node_security_group_id,
]
subnet = subnet_id
}
}
]
})
]
}
- [x ] ✋ I have searched the open/closed issues and my issue is not listed.
Versions
-
Module version [Required]:
I'm using only examples from this repository. Here is permalink: https://github.com/aws-ia/terraform-aws-eks-blueprints/tree/a3138bd8ee2dba924e01a3fc265b84402a0246bc/examples/vpc-cni-custom-networking -
Terraform version:
Terraform v1.5.2
on darwin_arm64
- Provider version(s):
Terraform v1.5.2
on darwin_arm64
+ provider registry.terraform.io/gavinbunney/kubectl v1.14.0
+ provider registry.terraform.io/hashicorp/aws v5.3.0
+ provider registry.terraform.io/hashicorp/cloudinit v2.3.2
+ provider registry.terraform.io/hashicorp/helm v2.10.1
+ provider registry.terraform.io/hashicorp/kubernetes v2.21.1
+ provider registry.terraform.io/hashicorp/time v0.9.1
+ provider registry.terraform.io/hashicorp/tls v4.0.4
Reproduction Code [Required]
terraform applykubectl get eniconfigs.crd.k8s.amazonaws.comterraform plan-> kubectl resources are dropped from the state
Expected behaviour
terraform plan is consistent every time and ENIConfigs do not disappear from the state.
Actual behaviour
terraform plan is not consistent every time and ENIConfigs disappear from the state.
Terminal Output Screenshot(s)
Terraform plan debug logs (just the interesting part):
2023-07-04T18:42:27.207+0200 [DEBUG] provider.terraform-provider-kubectl_v1.14.0: 2023/07/04 18:42:27 [WARN] kubernetes resource (/apis/crd.k8s.amazonaws.com/v1alpha1/eniconfigs/eu-central-1b) not found, removing from state
2023-07-04T18:42:27.207+0200 [DEBUG] provider.terraform-provider-kubectl_v1.14.0: 2023/07/04 18:42:27 [WARN] kubernetes resource (/apis/crd.k8s.amazonaws.com/v1alpha1/eniconfigs/eu-central-1c) not found, removing from state
2023-07-04T18:42:27.207+0200 [DEBUG] provider.terraform-provider-kubectl_v1.14.0: 2023/07/04 18:42:27 [WARN] kubernetes resource (/apis/crd.k8s.amazonaws.com/v1alpha1/eniconfigs/eu-central-1a) not found, removing from state
2023-07-04T18:42:27.207+0200 [WARN] Provider "registry.terraform.io/gavinbunney/kubectl" produced an unexpected new value for module.eks_euc1.kubectl_manifest.eni_config["eu-central-1a"] during refresh.
2023-07-04T18:42:27.207+0200 [WARN] Provider "registry.terraform.io/gavinbunney/kubectl" produced an unexpected new value for module.eks_euc1.kubectl_manifest.eni_config["eu-central-1c"] during refresh.
2023-07-04T18:42:27.207+0200 [WARN] Provider "registry.terraform.io/gavinbunney/kubectl" produced an unexpected new value for module.eks_euc1.kubectl_manifest.eni_config["eu-central-1b"] during refresh.
2023-07-04T18:42:27.209+0200 [DEBUG] provider.terraform-provider-kubectl_v1.14.0: 2023/07/04 18:42:27 [DEBUG] eu-central-1b Unstructed YAML: map[apiVersion:crd.k8s.amazonaws.com/v1alpha1 kind:ENIConfig metadata:map[name:eu-central-1b] spec:map[securityGroups:[sg-0329cc8efeae8b0cf sg-0efb5545ec1b733ef] subnet:subnet-073fc7974ff152aa8]]
2023-07-04T18:42:27.209+0200 [DEBUG] provider.terraform-provider-kubectl_v1.14.0: 2023/07/04 18:42:27 [DEBUG] eu-central-1c Unstructed YAML: map[apiVersion:crd.k8s.amazonaws.com/v1alpha1 kind:ENIConfig metadata:map[name:eu-central-1c] spec:map[securityGroups:[sg-0329cc8efeae8b0cf sg-0efb5545ec1b733ef] subnet:subnet-0200bdef6e567c74f]]
2023-07-04T18:42:27.209+0200 [DEBUG] provider.terraform-provider-kubectl_v1.14.0: 2023/07/04 18:42:27 [DEBUG] eu-central-1b Unstructed YAML: map[apiVersion:crd.k8s.amazonaws.com/v1alpha1 kind:ENIConfig metadata:map[name:eu-central-1b] spec:map[securityGroups:[sg-0329cc8efeae8b0cf sg-0efb5545ec1b733ef] subnet:subnet-073fc7974ff152aa8]]
2023-07-04T18:42:27.209+0200 [DEBUG] provider.terraform-provider-kubectl_v1.14.0: 2023/07/04 18:42:27 [DEBUG] eu-central-1c Unstructed YAML: map[apiVersion:crd.k8s.amazonaws.com/v1alpha1 kind:ENIConfig metadata:map[name:eu-central-1c] spec:map[securityGroups:[sg-0329cc8efeae8b0cf sg-0efb5545ec1b733ef] subnet:subnet-0200bdef6e567c74f]]
2023-07-04T18:42:27.209+0200 [DEBUG] provider.terraform-provider-kubectl_v1.14.0: 2023/07/04 18:42:27 [DEBUG] eu-central-1a Unstructed YAML: map[apiVersion:crd.k8s.amazonaws.com/v1alpha1 kind:ENIConfig metadata:map[name:eu-central-1a] spec:map[securityGroups:[sg-0329cc8efeae8b0cf sg-0efb5545ec1b733ef] subnet:subnet-0c3c1b5745dbdad8c]]
2023-07-04T18:42:27.209+0200 [DEBUG] provider.terraform-provider-kubectl_v1.14.0: 2023/07/04 18:42:27 [DEBUG] eu-central-1a Unstructed YAML: map[apiVersion:crd.k8s.amazonaws.com/v1alpha1 kind:ENIConfig metadata:map[name:eu-central-1a] spec:map[securityGroups:[sg-0329cc8efeae8b0cf sg-0efb5545ec1b733ef] subnet:subnet-0c3c1b5745dbdad8c]]
2023-07-04T18:42:27.210+0200 [WARN] Provider "registry.terraform.io/gavinbunney/kubectl" produced an invalid plan for module.eks_euc1.kubectl_manifest.eni_config["eu-central-1c"], but we are tolerating it because it is using the legacy plugin SDK.
2023-07-04T18:42:27.210+0200 [WARN] Provider "registry.terraform.io/gavinbunney/kubectl" produced an invalid plan for module.eks_euc1.kubectl_manifest.eni_config["eu-central-1a"], but we are tolerating it because it is using the legacy plugin SDK.
2023-07-04T18:42:27.210+0200 [WARN] Provider "registry.terraform.io/gavinbunney/kubectl" produced an invalid plan for module.eks_euc1.kubectl_manifest.eni_config["eu-central-1b"], but we are tolerating it because it is using the legacy plugin SDK.
2023-07-04T18:42:27.211+0200 [DEBUG] provider: plugin process exited: path=.terraform/providers/registry.terraform.io/gavinbunney/kubectl/1.14.0/darwin_arm64/terraform-provider-kubectl_v1.14.0 pid=16928
2023-07-04T18:42:27.834+0200 [DEBUG] Resource state not found for node "module.eks_euc1.kubectl_manifest.eni_config[\"eu-central-1c\"]", instance module.eks_euc1.kubectl_manifest.eni_config["eu-central-1c"]
2023-07-04T18:42:27.834+0200 [DEBUG] Resource state not found for node "module.eks_euc1.kubectl_manifest.eni_config[\"eu-central-1a\"]", instance module.eks_euc1.kubectl_manifest.eni_config["eu-central-1a"]
2023-07-04T18:42:27.834+0200 [DEBUG] Resource state not found for node "module.eks_euc1.kubectl_manifest.eni_config[\"eu-central-1b\"]", instance module.eks_euc1.kubectl_manifest.eni_config["eu-central-1b"]
Additional context
I too was severely hit by this. the recommended approach is to move away from that provider and instead use manifests. i tried but had trouble to get ENIConfig to deploy
I too was severely hit by this. the recommended approach is to move away from that provider and instead use manifests. i tried but had trouble to get ENIConfig to deploy
Have you tried helm_release? I've tested it on a new cluster and everything works as expected. Unlike official kubernetes, helm provider doesn't require CRDs to be available during plan time.
If this is an okay approach I would be happy to make a PR with changes.
helm_release is somewhat limited (you either go with any-resource helm chart or you face the limitations around templating) and has some other bugs linked to it as well (sporadic random changes for example). Original linked issue has been adressed in my fork (https://github.com/alekc/terraform-provider-kubectl/releases/tag/v2.0.2), you can replace provider definition and have a go, it should be completely compatible (and as a side effect several other bugs like wait for deletion has been fixed and released as well).
@fbozic I've been experimenting with helm_release and it has worked out quite well. There might be hidden issues I have not uncovered but so far so good. I am inclined to removing kubectl and replacing with helm_release and a local chart. Let me know if you are still open to a PR with a caveat that it would have to done across the board for all examples we have today in this repo.
@askulkarni2 I'll try to open a PR next week once I'm back from PTO.
Just wondering... Anything preventing to use the kubernetes_manifest from the official kubernetes provider https://registry.terraform.io/providers/hashicorp/kubernetes/latest/docs/resources/manifest ? 🤔
Just wondering... Anything preventing to use the kubernetes_manifest from the official kubernetes provider https://registry.terraform.io/providers/hashicorp/kubernetes/latest/docs/resources/manifest ? 🤔
CRD has to be on a cluster during the terraform plan. That is not the case with ENIConfig during the first terraform run.
Oh right. Thanks @fbozic.
We will be working on updating the examples provided here shortly. The approach we will be using is as follows:
- If the manifest is used simply as a demonstration, it will be converted into standard YAML file and the pattern's instruction updated to reflect a
kubectl apply -f <file>.yaml - If the manifest is required for the pattern, it will be converted into a local Helm chart so that the
helm_releaseresource can be utilized to provision. To be fair, this is a workaround, but it has shown to work fairly well and it fits in with the EKS Blueprints module which deploys via thehelm_releaseresource
@bryantbiggs I've started working on a fix for this in the forked repo. Are you going to pick this up and should I stop working on that?
We ended up just applying the cni configs in the inputs for the EKS addon https://github.com/defenseunicorns/terraform-aws-eks/blob/main/main.tf#L79-L121
additional logic here to create them conditionally but the idea is you can feed them into cluster_addons.vpc_cni.configuration_values.eniconfig instead of doing it from terraform