terraform-aws-eks-blueprints
[Bug]: Issue initialising ArgoCD haproxy: layer7 timeout
Welcome to Amazon EKS Blueprints!
- [X] Yes, I've searched similar issues on GitHub and didn't find any.
Amazon EKS Blueprints Release version
4.0.2
What is your environment, configuration and the example used?
Terraform Version: 1.1.7
main.tf

...

module "eks" {
  source = "github.com/aws-ia/terraform-aws-eks-blueprints?ref=v4.0.2"

  cluster_name    = local.name
  cluster_version = "1.21"

  vpc_id             = module.vpc.vpc_id
  private_subnet_ids = module.vpc.private_subnets

  # IPV6
  cluster_ip_family = "ipv6"

  # EKS MANAGED NODE GROUPS
  managed_node_groups = {
    mg_5 = {
      node_group_name = "managed-ondemand"
      instance_types  = ["m5.large"]
      min_size        = "2"
      subnet_ids      = module.vpc.private_subnets
    }
  }
}
module "eks-blueprints-kubernetes-addons" {
source = "github.com/aws-ia/terraform-aws-eks-blueprints?ref=v4.0.2/modules/kubernetes-addons"
eks_cluster_id = module.eks.eks_cluster_id
enable_ipv6 = true # Enable Ipv6 network. Attaches new VPC CNI policy to the IRSA role
# EKS Managed Add-ons
enable_amazon_eks_vpc_cni = true
enable_amazon_eks_coredns = true
enable_amazon_eks_kube_proxy = true
# K8s Add-ons
enable_argocd = true
enable_aws_load_balancer_controller = true
enable_prometheus = true
depends_on = [module.eks.managed_node_groups]
}
As you can see, the cluster is set up to use IPv6 networking internally.
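For context, here is a minimal sketch of the VPC configuration such a cluster would sit on (not included in the report above; the names, CIDRs, and AZs are placeholders, and the attributes assume the terraform-aws-modules/vpc module v3.x, which the blueprints examples use):

module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~> 3.0"

  name = local.name
  cidr = "10.0.0.0/16"

  azs             = ["eu-west-1a", "eu-west-1b"]
  private_subnets = ["10.0.1.0/24", "10.0.2.0/24"]

  # Request an Amazon-provided IPv6 CIDR block for the VPC
  enable_ipv6 = true

  # Assign IPv6 addresses to instances launched in the private subnets
  private_subnet_assign_ipv6_address_on_creation = true
  private_subnet_ipv6_prefixes                   = [0, 1]
}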
What did you do and What did you see instead?
Applied the infrastructure as expected. All resources are created successfully; however, the ArgoCD haproxy pods go into a persistent CrashLoopBackOff and never initialise.
Below are the logs from one of the backing-off pods.
$ kubectl -n argocd logs argo-cd-redis-ha-haproxy-75fb577466-24qzl
[NOTICE] 115/094553 (1) : New worker #1 (7) forked
[WARNING] 115/094554 (7) : Server check_if_redis_is_master_1/R0 is DOWN, reason: Layer7 timeout, info: " at step 5 of tcp-check (expect string 'fdf8:cee4:5e1::f893')", check duration: 1001ms. 2 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
[WARNING] 115/094554 (7) : Server check_if_redis_is_master_1/R1 is DOWN, reason: Layer7 timeout, info: " at step 5 of tcp-check (expect string 'fdf8:cee4:5e1::f893')", check duration: 1001ms. 1 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
[WARNING] 115/094554 (7) : Server check_if_redis_is_master_1/R2 is DOWN, reason: Layer7 timeout, info: " at step 5 of tcp-check (expect string 'fdf8:cee4:5e1::f893')", check duration: 1000ms. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
[ALERT] 115/094554 (7) : backend 'check_if_redis_is_master_1' has no server available!
[WARNING] 115/094554 (7) : Server check_if_redis_is_master_2/R0 is DOWN, reason: Layer7 timeout, info: " at step 5 of tcp-check (expect string 'fdf8:cee4:5e1::92d')", check duration: 1000ms. 2 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
[WARNING] 115/094554 (7) : Server check_if_redis_is_master_2/R1 is DOWN, reason: Layer7 timeout, info: " at step 5 of tcp-check (expect string 'fdf8:cee4:5e1::92d')", check duration: 1000ms. 1 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
[WARNING] 115/094554 (7) : Server check_if_redis_is_master_2/R2 is DOWN, reason: Layer7 timeout, info: " at step 5 of tcp-check (expect string 'fdf8:cee4:5e1::92d')", check duration: 1001ms. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
[ALERT] 115/094554 (7) : backend 'check_if_redis_is_master_2' has no server available!
[WARNING] 115/094554 (7) : Server bk_redis_master/R1 is DOWN, reason: Layer7 timeout, info: " at step 5 of tcp-check (expect string 'role:master')", check duration: 1000ms. 2 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
[WARNING] 115/094554 (7) : Server bk_redis_master/R2 is DOWN, reason: Layer7 timeout, info: " at step 5 of tcp-check (expect string 'role:master')", check duration: 1000ms. 1 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
[WARNING] 115/094604 (1) : Exiting Master process...
[WARNING] 115/094604 (7) : Stopping proxy health_check_http_url in 0 ms.
[WARNING] 115/094604 (7) : Stopping backend check_if_redis_is_master_0 in 0 ms.
[WARNING] 115/094604 (7) : Stopping backend check_if_redis_is_master_1 in 0 ms.
[WARNING] 115/094604 (7) : Stopping backend check_if_redis_is_master_2 in 0 ms.
[WARNING] 115/094604 (7) : Stopping frontend ft_redis_master in 0 ms.
[WARNING] 115/094604 (7) : Stopping backend bk_redis_master in 0 ms.
[WARNING] 115/094604 (7) : Stopping frontend metrics in 0 ms.
[WARNING] 115/094604 (7) : Stopping frontend GLOBAL in 0 ms.
[WARNING] 115/094604 (7) : Proxy health_check_http_url stopped (FE: 0 conns, BE: 0 conns).
[WARNING] 115/094604 (7) : Proxy check_if_redis_is_master_0 stopped (FE: 0 conns, BE: 0 conns).
[WARNING] 115/094604 (7) : Proxy check_if_redis_is_master_1 stopped (FE: 0 conns, BE: 0 conns).
[WARNING] 115/094604 (7) : Proxy check_if_redis_is_master_2 stopped (FE: 0 conns, BE: 0 conns).
[WARNING] 115/094604 (7) : Proxy ft_redis_master stopped (FE: 0 conns, BE: 0 conns).
[WARNING] 115/094604 (7) : Proxy bk_redis_master stopped (FE: 0 conns, BE: 0 conns).
[WARNING] 115/094604 (7) : Proxy metrics stopped (FE: 0 conns, BE: 0 conns).
[WARNING] 115/094604 (7) : Proxy GLOBAL stopped (FE: 0 conns, BE: 0 conns).
[ALERT] 115/094604 (1) : Current worker #1 (7) exited with code 0 (Exit)
[WARNING] 115/094604 (1) : All workers exited. Exiting... (0)
Additional Information
No response
As a temporary workaround, I have disabled Redis HA.
argocd-values.yaml
redis-ha:
  enabled: false
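For reference, a values override like this can be passed through to the ArgoCD add-on via its helm_config input (a minimal sketch; argocd_helm_config follows the standard helm_config pattern the kubernetes-addons module uses, and the file path is an assumption):

module "eks-blueprints-kubernetes-addons" {
  # ... as in the configuration above ...

  enable_argocd = true
  argocd_helm_config = {
    # Hypothetical path; point this at the argocd-values.yaml shown above
    values = [file("${path.module}/argocd-values.yaml")]
  }
}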
Confirmed reproducible; this may be related to https://github.com/argoproj/argo-helm/issues/1203. Judging by the logs, the haproxy tcp-checks expect the announce services' ClusterIPs (here IPv6 ULA addresses) in the Redis response and every check times out at step 5, so the failure appears IPv6-specific.
This issue has been automatically marked as stale because it has been open 30 days with no activity. Remove stale label or comment or this issue will be closed in 10 days
Issue closed due to inactivity.