microk8s
Connections initiated by Pods failing often due to ephemeral port mismatch
Summary
TCP and UDP connections initiated by Pods time out or fail often (~50% of the time). I tracked this network "instability" to unexpected source ports being used, which match neither the OS ephemeral port range nor the external firewall configuration.
IOW, despite the OS having:
# sysctl net.ipv4.ip_local_port_range
net.ipv4.ip_local_port_range = 32768 60999
...the containers were using source ports in a much wider range (>1024? >0?)
What I'm still unclear on is how the (S)NAT component can be configured to use a source port range matching the OS sysctl. I did not find a relevant setting in the general CNI settings or in the calico portmap docs.
I verified there are no iptables rules doing NAT.
What Should Happen Instead?
Connections should succeed all the time.
Reproduction Steps
With an external firewall only allowing SYN-ACK packets to ephemeral ports in the system (> 32768):
From a pod, run curl 142.250.184.238 (an IP address of google.com). From the outside, run tcpdump to confirm that the source ports used by the connections sometimes fall outside of the ephemeral port range:
# tcpdump -i eno1 -n host 142.250.184.238 and "tcp[tcpflags] & (tcp-syn) != 0 and tcp[tcpflags] & (tcp-ack) =0"
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on eno1, link-type EN10MB (Ethernet), snapshot length 262144 bytes
16:37:13.987895 IP 144.76.196.2.13686 > 142.250.184.238.80: Flags [S], seq 2626848918, win 64240, options [mss 1460,sackOK,TS val 493799354 ecr 0,nop,wscale 7], length 0
16:37:15.828422 IP 144.76.196.2.13686 > 142.250.184.238.80: Flags [S], seq 699943171, win 64240, options [mss 1460,sackOK,TS val 493801194 ecr 0,nop,wscale 7], length 0
16:37:16.760399 IP 144.76.196.2.2545 > 142.250.184.238.80: Flags [S], seq 2944446947, win 64240, options [mss 1460,sackOK,TS val 493802126 ecr 0,nop,wscale 7], length 0
16:37:17.563713 IP 144.76.196.2.10702 > 142.250.184.238.80: Flags [S], seq 1157060071, win 64240, options [mss 1460,sackOK,TS val 493802929 ecr 0,nop,wscale 7], length 0
16:37:18.427493 IP 144.76.196.2.61052 > 142.250.184.238.80: Flags [S], seq 1670527954, win 64240, options [mss 1460,sackOK,TS val 493803793 ecr 0,nop,wscale 7], length 0
You can see above the source ports are {13686, 2545, 10702, 61052}, i.e. not within the 32768-60999 range.
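To make this check less manual, here is a minimal sketch that filters tcpdump output for source ports outside a given range. The function name ports_outside_range is hypothetical (it is not part of tcpdump or MicroK8s), and it assumes the one-line "IP src.port > dst.port:" format shown in the capture above.

```shell
# Hypothetical helper: read tcpdump lines on stdin and print the source port
# of every IPv4 packet whose source port is outside [lo, hi].
# Assumes lines shaped like: "<time> IP <src-addr>.<port> > <dst-addr>.<port>: ..."
ports_outside_range() {
  lo="$1"
  hi="$2"
  awk -v lo="$lo" -v hi="$hi" '
    $2 == "IP" {
      port = $3
      sub(/^.*\./, "", port)   # keep only what follows the last dot
      port += 0                # force numeric comparison
      if (port < lo || port > hi) print port
    }
  '
}
```

For example, piping the tcpdump command above into ports_outside_range 32768 60999 should list only the offending ports. On Linux the two bounds could also be read from /proc/sys/net/ipv4/ip_local_port_range instead of being typed by hand.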
Introspection Report
(the tar.gz includes too much private data for me to be comfortable sharing it)
# snap list | grep microk8s
microk8s v1.26.3 4959 1.26/stable canonical** classic
# microk8s status
microk8s is running
high-availability: no
datastore master nodes: 127.0.0.1:19001
datastore standby nodes: none
addons:
enabled:
cert-manager # (core) Cloud native certificate management
community # (core) The community addons repository
dashboard # (core) The Kubernetes dashboard
dns # (core) CoreDNS
ha-cluster # (core) Configure high availability on the current node
helm # (core) Helm - the package manager for Kubernetes
helm3 # (core) Helm 3 - the package manager for Kubernetes
hostpath-storage # (core) Storage class; allocates storage from host directory
ingress # (core) Ingress controller for external access
metrics-server # (core) K8s Metrics Server for API access to service metrics
storage # (core) Alias to hostpath-storage add-on, deprecated
disabled:
argocd # (community) Argo CD is a declarative continuous deployment for Kubernetes.
cilium # (community) SDN, fast with full network policy
dashboard-ingress # (community) Ingress definition for Kubernetes dashboard
fluentd # (community) Elasticsearch-Fluentd-Kibana logging and monitoring
gopaddle-lite # (community) Cheapest, fastest and simplest way to modernize your applications
inaccel # (community) Simplifying FPGA management in Kubernetes
istio # (community) Core Istio service mesh services
jaeger # (community) Kubernetes Jaeger operator with its simple config
kata # (community) Kata Containers is a secure runtime with lightweight VMS
keda # (community) Kubernetes-based Event Driven Autoscaling
knative # (community) Knative Serverless and Event Driven Applications
kwasm # (community) WebAssembly support for WasmEdge (Docker Wasm) and Spin (Azure AKS WASI)
linkerd # (community) Linkerd is a service mesh for Kubernetes and other frameworks
multus # (community) Multus CNI enables attaching multiple network interfaces to pods
nfs # (community) NFS Server Provisioner
ondat # (community) Ondat is a software-defined, cloud native storage platform for Kubernetes.
openebs # (community) OpenEBS is the open-source storage solution for Kubernetes
openfaas # (community) OpenFaaS serverless framework
osm-edge # (community) osm-edge is a lightweight SMI compatible service mesh for the edge-computing.
portainer # (community) Portainer UI for your Kubernetes cluster
sosivio # (community) Kubernetes Predictive Troubleshooting, Observability, and Resource Optimization
traefik # (community) traefik Ingress controller
trivy # (community) Kubernetes-native security scanner
gpu # (core) Automatic enablement of Nvidia CUDA
host-access # (core) Allow Pods connecting to Host services smoothly
kube-ovn # (core) An advanced network fabric for Kubernetes
mayastor # (core) OpenEBS MayaStor
metallb # (core) Loadbalancer for your Kubernetes cluster
minio # (core) MinIO object storage
observability # (core) A lightweight observability stack for logs, traces and metrics
prometheus # (core) Prometheus operator for monitoring and logging
rbac # (core) Role-Based Access Control for authorisation
registry # (core) Private image registry exposed on localhost:32000
# cat /var/snap/microk8s/current/args/cni-network/10-calico.conflist
{
"name": "k8s-pod-network",
"cniVersion": "0.3.1",
"plugins": [
{
"type": "calico",
"log_level": "info",
"nodename_file_optional": true,
"log_file_path": "/var/log/calico/cni/cni.log",
"datastore_type": "kubernetes",
"nodename": "rblmon23",
"mtu": 0,
"ipam": {
"type": "calico-ipam"
},
"policy": {
"type": "k8s"
},
"kubernetes": {
"kubeconfig": "/var/snap/microk8s/current/args/cni-network/calico-kubeconfig"
}
},
{
"type": "portmap",
"snat": true,
"capabilities": {"portMappings": true}
},
{
"type": "bandwidth",
"capabilities": {"bandwidth": true}
}
]
}
Can you suggest a fix?
Update (external) firewall rules to allow a wider range of source ports (>1024)
Are you interested in contributing with a fix?
Yes
+1
We faced this issue in an AWS environment today, and it took a really long time to figure out that it wasn't using the default ephemeral ports.
+1
Hi @sanjeevpandey19 @b44rawat, MicroK8s is not doing anything in particular for this issue, but I'm also a bit out of my depth about the specifics here. MicroK8s is not messing with these configs in any way, so this is probably something to discuss/raise with Calico? You might be able to get more help there.
OK @neoaggelos, thanks for the reply. I will check whether I can find anything related to this that might be causing it.