kubeclipper
High latency for svc NodePort access in calico vxlan mode
Describe the Bug
Kubernetes 1.23.6 with Calico 3.22.4 in VXLAN mode. An Nginx deployment runs in the cluster and is exposed through a NodePort service on port 30321.
Master nodes: 10.0.0.111, 10.0.0.112, 10.0.0.113
From node 10.0.0.111:
- telnet 10.0.0.111 30321: high latency (120s)
- telnet 10.0.0.112 30321: normal
- telnet 10.0.0.113 30321: normal
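The telnet checks above can be repeated in a scripted form. The following is a minimal sketch, assuming bash with `/dev/tcp` support and the coreutils `timeout` command; the function name `probe_nodeport` is hypothetical and only the node IPs and port come from this report.

```shell
#!/usr/bin/env bash
# Sketch only: probe_nodeport is a hypothetical helper, not part of any tool.

# probe_nodeport HOST PORT [LIMIT_SECONDS]
# Opens a TCP connection to HOST:PORT and prints
#   "<host>:<port> <ok|fail> <elapsed>s"
# where elapsed is the whole number of seconds the connect attempt took.
probe_nodeport() {
  local host=$1 port=$2 limit=${3:-130}
  local start end status
  start=$(date +%s)
  # bash's /dev/tcp pseudo-device performs the connect; timeout bounds the wait.
  if timeout "$limit" bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
    status=ok
  else
    status=fail
  fi
  end=$(date +%s)
  echo "$host:$port $status $(( end - start ))s"
}
```

Run on node 10.0.0.111, a loop such as `for ip in 10.0.0.111 10.0.0.112 10.0.0.113; do probe_nodeport "$ip" 30321; done` would be expected to show the slow connect only for the affected address. The default limit of 130 seconds is an assumption chosen to sit above the 120s delay reported here.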
Versions Used
Kubernetes: 1.23.6
Kubeclipper: release 1.x
Environment
CentOS 7.9, kernel 5.18
This is caused by a Linux kernel bug present in some versions: when TCP packets that have passed through NAT are VXLAN-encapsulated, the kernel generates an incorrect UDP checksum. The bug is triggered by the iptables utility shipped in the kube-proxy image.
Solutions (use any one of them):
- Do not use VXLAN; switch to IPIP or BGP.
- Change the kernel version, for example to one of the tested versions 5.6.13 or 5.4.41.
- Turn off rx/tx checksum offload on the VXLAN network device:
ethtool --offload vxlan.calico rx off tx off
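To confirm that the third workaround took effect, the offload state can be read back with `ethtool -k`. The sketch below wraps the command from this issue with a before/after check; the helper names `offload_state` and `disable_vxlan_checksum` are hypothetical, and the parsing assumes the usual `feature: state` line layout of `ethtool -k` output.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, not part of Calico or kubeclipper.

# offload_state FEATURE
# Reads `ethtool -k <dev>` output on stdin and prints the state ("on"/"off")
# of a feature line such as "tx-checksumming: on".
offload_state() {
  awk -v key="$1:" '$1 == key { print $2; exit }'
}

# disable_vxlan_checksum [DEV]
# Turns off rx/tx checksum offload on DEV (default: vxlan.calico, as in this
# issue) and reports the tx state before and after. Needs root and ethtool.
disable_vxlan_checksum() {
  local dev=${1:-vxlan.calico}
  echo "before: tx-checksumming $(ethtool -k "$dev" | offload_state tx-checksumming)"
  ethtool --offload "$dev" rx off tx off
  echo "after:  tx-checksumming $(ethtool -k "$dev" | offload_state tx-checksumming)"
}
```

Calling `disable_vxlan_checksum` as root on each node applies the setting. Settings made with ethtool are not persistent, so the change would likely need to be reapplied after a reboot or whenever the interface is re-created.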
/close
@x893675: Closing this issue.