
Calico kube-controllers pod failing on RHEL 9, but working on RHEL 7

Open saku3071 opened this issue 1 year ago • 2 comments


Calico version: 3.26.1

We are moving from the CIS AWS AMI for RHEL 7 to the CIS AWS AMI for RHEL 9, using the same EKS bootstrap code. However, the Calico kube-controllers pods fail on RHEL 9 only.

The ens-addon and CoreDNS add-on were failing too because of this.


Pod logs on EKS:

2024-06-19 12:58:09.101 [INFO][1] main.go 107: Loaded configuration from environment config=&config.Config{LogLevel:"info", WorkloadEndpointWorkers:1, ProfileWorkers:1, PolicyWorkers:1, NodeWorkers:1, Kubeconfig:"", DatastoreType:"kubernetes"}
W0619 12:58:09.102541       1 client_config.go:618] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
2024-06-19 12:58:09.102 [INFO][1] main.go 131: Ensuring Calico datastore is initialized
2024-06-19 12:58:39.132 [ERROR][1] client.go 295: Error getting cluster information config ClusterInformation="default" error=Get "https://172.20.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default": dial tcp 172.20.0.1:443: i/o timeout
2024-06-19 12:58:39.133 [INFO][1] main.go 138: Failed to initialize datastore error=Get "https://172.20.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default": dial tcp 172.20.0.1:443: i/o timeout
2024-06-19 12:59:09.121 [ERROR][1] client.go 295: Error getting cluster information config ClusterInformation="default" error=Get "https://172.20.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default": context deadline exceeded
2024-06-19 12:59:09.121 [INFO][1] main.go 138: Failed to initialize datastore error=Get "https://172.20.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default": context deadline exceeded
2024-06-19 12:59:09.122 [FATAL][1] main.go 151: Failed to initialize Calico datastore

RHEL 9 iptables version: 1.8 (legacy)

Does Calico work on RHEL 9? Are there any specific iptables rules or known issues? Or does it only work with a specific iptables version, like 1.46 on RHEL 7?
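To compare the iptables version and backend across nodes, a small parser for the output of `iptables --version` can help. This is only a sketch: the function name `parse_iptables_version` is hypothetical, and the sample strings are illustrative, assuming the usual `iptables vX.Y.Z (backend)` output format (older releases print no backend suffix).

```python
import re
from typing import NamedTuple, Optional


class IptablesInfo(NamedTuple):
    version: str
    backend: Optional[str]  # e.g. "legacy" or "nf_tables"; None if not printed


# Assumed output format: "iptables v1.8.8 (legacy)" or "iptables v1.4.21"
_PATTERN = re.compile(r"iptables v(\d+(?:\.\d+)+)(?:\s+\((\w+)\))?")


def parse_iptables_version(output: str) -> Optional[IptablesInfo]:
    """Extract the version and backend from `iptables --version` output."""
    match = _PATTERN.search(output)
    if match is None:
        return None
    return IptablesInfo(version=match.group(1), backend=match.group(2))
```

Feeding it the output captured on a RHEL 7 node and on a RHEL 9 node makes the version and backend differences easy to compare side by side.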

saku3071, Jun 20 '24 08:06

We are now on the latest Calico version, 3.28, and still facing the same issue.

saku3071, Jun 24 '24 10:06

I'm not aware of any issues with RHEL or the environment you describe.

The error you're seeing is Calico being unable to communicate with the kube-apiserver (I think via a Service IP) - potentially an issue with the node's underlying network access to the API server, or with kube-proxy service rules.
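One way to check this is a plain TCP probe against the API server's Service IP, run from an affected node or pod. The following is a minimal sketch, not part of Calico: the helper name `can_connect` is hypothetical, and the address to test (172.20.0.1:443) is taken from the logs above.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and unreachable networks.
        return False


# Example call (address from the logs above):
#   can_connect("172.20.0.1", 443)
```

If the probe fails from a pod on the node but succeeds against the API server's real endpoint, that would point to the Service rules or a host firewall rather than to Calico itself.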

caseydavenport, Jul 01 '24 22:07

Facing the same issue during a Calico 3.27 to 3.28 upgrade in an AWS EKS 1.28 cluster.

uttammeena, Jul 08 '24 05:07

Same issue with Kubernetes 1.23.4, Calico v3.25.2, RHEL 9.2.

tangkelu, Jul 11 '24 14:07

Any progress on this issue? Any more information that could help?

tomastigera, Sep 10 '24 16:09

> Same issue with Kubernetes 1.23.4, Calico v3.25.2, RHEL 9.2.

It works after stopping the firewalld service or adding the pod CIDR to the trusted zone.

tangkelu, Sep 11 '24 01:09

> It works after stopping the firewalld service or adding the pod CIDR to the trusted zone.

@saku3071 can you please update if this fixes your issue as well?

coutinhop, Sep 24 '24 17:09

Sounds like this is an issue with firewalld (or similar) blocking network access for Kubernetes pods. Going to close this for now - disabling firewalld or configuring it to allow pod traffic should fix this. Please shout if still encountering issues after adjusting your firewall.
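For the "allow pod traffic" option, the workaround reported earlier in this thread (adding the pod CIDR to firewalld's trusted zone) could look roughly like the sketch below. It only builds the `firewall-cmd` invocations and does not run them. The helper name `firewalld_trust_commands` is hypothetical, and the CIDR in the usage note is a placeholder; substitute your cluster's actual pod CIDR.

```python
import ipaddress
from typing import List


def firewalld_trust_commands(pod_cidr: str, zone: str = "trusted") -> List[List[str]]:
    """Build firewall-cmd invocations that add pod_cidr as a source of a zone."""
    # Raises ValueError if pod_cidr is not a valid network address.
    network = ipaddress.ip_network(pod_cidr, strict=True)
    return [
        # Persist the rule so it survives a firewalld restart.
        ["firewall-cmd", "--permanent", f"--zone={zone}", f"--add-source={network}"],
        # Apply the permanent configuration to the running firewall.
        ["firewall-cmd", "--reload"],
    ]
```

For example, `firewalld_trust_commands("10.244.0.0/16")` (placeholder CIDR) returns the two argument lists, which can be reviewed and then passed to `subprocess.run(cmd, check=True)` as root on each affected node.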

caseydavenport, Oct 22 '24 16:10