
High latency for svc NodePort access in calico vxlan mode

Open zhuzhenfan opened this issue 3 years ago • 1 comments

Describe the Bug

Kubernetes 1.23.6 with Calico 3.22.4 in VXLAN mode. An Nginx deployment runs in the cluster and is exposed through a NodePort service on port 30321.

Master nodes: 10.0.0.111, 10.0.0.112, 10.0.0.113

From node 10.0.0.111:

telnet 10.0.0.111 30321    high latency (120s)
telnet 10.0.0.112 30321    normal
telnet 10.0.0.113 30321    normal
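The telnet check above can be made repeatable by timing the TCP connect to each node. The following is a minimal sketch, not part of the original report: the function names `connect_latency` and `probe_nodes` and the 150-second default timeout are assumptions chosen so that a delay of about 120s is still recorded rather than cut off.

```python
import socket
import time


def connect_latency(host, port, timeout=150.0):
    """Return the seconds taken to open a TCP connection, or None on failure.

    Hypothetical helper: it only times the TCP handshake, which is what
    the telnet test in the report exercises.
    """
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return None


def probe_nodes(hosts, port, timeout=150.0):
    """Map each host to its connect latency (None if the connect failed)."""
    return {host: connect_latency(host, port, timeout) for host in hosts}
```

Running `probe_nodes(["10.0.0.111", "10.0.0.112", "10.0.0.113"], 30321)` on node 10.0.0.111 would be expected to show one slow entry and two fast ones if the behavior matches the report.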


Versions Used

Kubernetes: 1.23.6
Kubeclipper: release 1.x

Environment

CentOS 7.9, kernel 5.18


zhuzhenfan avatar Sep 05 '22 06:09 zhuzhenfan

The cause is a Linux kernel bug present in some kernel versions: when TCP packets that have passed through NAT are VXLAN-encapsulated, the kernel generates an incorrect UDP checksum. The bug is triggered by the iptables utility in the kube-proxy image.

Solution (any one of the following):

  1. Do not use VXLAN; switch to IPIP or BGP.
  2. Change the kernel version, for example to one of the tested versions, 5.6.13 or 5.4.41.
  3. Turn off rx/tx checksum offload on the VXLAN network device: ethtool --offload vxlan.calico rx off tx off
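For option 3, the command has to be run as root on every affected node. A minimal sketch of a wrapper is shown here, assuming Python is available on the nodes; the helper names `offload_off_command` and `disable_checksum_offload` are hypothetical, and only the ethtool invocation itself comes from the comment above.

```python
import subprocess


def offload_off_command(device="vxlan.calico"):
    """Build the ethtool command from option 3 for the given device."""
    return ["ethtool", "--offload", device, "rx", "off", "tx", "off"]


def disable_checksum_offload(device="vxlan.calico", runner=subprocess.run):
    """Run the command and raise CalledProcessError if ethtool fails.

    `runner` is injectable (an assumption of this sketch) so the call
    can be exercised without root or a real vxlan.calico device.
    """
    return runner(offload_off_command(device), check=True)
```

The current state can be inspected with `ethtool --show-offload vxlan.calico`. The setting is likely not to survive a reboot or re-creation of the interface, so it may need to be reapplied.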

zhuzhenfan avatar Sep 08 '22 14:09 zhuzhenfan

/close

x893675 avatar Jul 05 '23 08:07 x893675

@x893675: Closing this issue.

In response to this:

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

kubeclipper-bot avatar Jul 05 '23 08:07 kubeclipper-bot