terway-controlplane pod crashed
After deploying terway-controlplane using https://github.com/AliyunContainerService/terway/tree/v1.13.2/charts/terway/templates/terway-controlplane , I found that the cilium-operator pod restarts frequently, which eventually causes the terway-controlplane pod to crash:
cilium-operator logs:
root@192-168-31-110 <k8s-test>:~# docker logs k8s_cilium-operator_terway-controlplane-7bfc44db74-q5nw2_kube-system_38949b13-5be3-472a-b830-668df9114540_0
level=info msg="Using config from file" file-path=/etc/config/cilium-config.yaml subsys=config
level=warning msg="Running Cilium with \"kvstore\"=\"\" requires identity allocation via CRDs. Changing identity-allocation-mode to \"crd\"" subsys=config
level=info msg=" --bgp-announce-lb-ip='false'" subsys=cilium-operator-generic
level=info msg=" --bgp-config-path='/var/lib/cilium/bgp/config.yaml'" subsys=cilium-operator-generic
level=info msg=" --ces-max-ciliumendpoints-per-ces='100'" subsys=cilium-operator-generic
level=info msg=" --ces-slice-mode='cesSliceModeIdentity'" subsys=cilium-operator-generic
level=info msg=" --cilium-endpoint-gc-interval='5m0s'" subsys=cilium-operator-generic
level=info msg=" --cilium-pod-labels='k8s-app=cilium'" subsys=cilium-operator-generic
level=info msg=" --cilium-pod-namespace=''" subsys=cilium-operator-generic
level=info msg=" --cluster-id='0'" subsys=cilium-operator-generic
level=info msg=" --cluster-name='default'" subsys=cilium-operator-generic
level=info msg=" --cluster-pool-ipv4-cidr=''" subsys=cilium-operator-generic
level=info msg=" --cluster-pool-ipv4-mask-size='24'" subsys=cilium-operator-generic
level=info msg=" --cluster-pool-ipv6-cidr=''" subsys=cilium-operator-generic
level=info msg=" --cluster-pool-ipv6-mask-size='112'" subsys=cilium-operator-generic
level=info msg=" --cmdref=''" subsys=cilium-operator-generic
level=info msg=" --cnp-node-status-gc-interval='2m0s'" subsys=cilium-operator-generic
level=info msg=" --cnp-status-cleanup-burst='20'" subsys=cilium-operator-generic
level=info msg=" --cnp-status-cleanup-qps='10'" subsys=cilium-operator-generic
level=info msg=" --cnp-status-update-interval='1s'" subsys=cilium-operator-generic
level=info msg=" --config='/etc/config/cilium-config.yaml'" subsys=cilium-operator-generic
level=info msg=" --config-dir=''" subsys=cilium-operator-generic
level=info msg=" --debug='false'" subsys=cilium-operator-generic
level=info msg=" --disable-cnp-status-updates='false'" subsys=cilium-operator-generic
level=info msg=" --disable-endpoint-crd='false'" subsys=cilium-operator-generic
level=info msg=" --enable-cilium-endpoint-slice='false'" subsys=cilium-operator-generic
level=info msg=" --enable-ipv4='true'" subsys=cilium-operator-generic
level=info msg=" --enable-ipv4-egress-gateway='false'" subsys=cilium-operator-generic
level=info msg=" --enable-ipv6='true'" subsys=cilium-operator-generic
level=info msg=" --enable-k8s-api-discovery='false'" subsys=cilium-operator-generic
level=info msg=" --enable-k8s-endpoint-slice='true'" subsys=cilium-operator-generic
level=info msg=" --enable-k8s-event-handover='false'" subsys=cilium-operator-generic
level=info msg=" --enable-local-redirect-policy='false'" subsys=cilium-operator-generic
level=info msg=" --enable-metrics='false'" subsys=cilium-operator-generic
level=info msg=" --gops-port='0'" subsys=cilium-operator-generic
level=info msg=" --identity-allocation-mode='kvstore'" subsys=cilium-operator-generic
level=info msg=" --identity-gc-interval='10m'" subsys=cilium-operator-generic
level=info msg=" --identity-gc-rate-interval='1m0s'" subsys=cilium-operator-generic
level=info msg=" --identity-gc-rate-limit='2500'" subsys=cilium-operator-generic
level=info msg=" --identity-heartbeat-timeout='20m'" subsys=cilium-operator-generic
level=info msg=" --ingress-lb-annotation-prefixes='service.beta.kubernetes.io,service.kubernetes.io,cloud.google.com'" subsys=cilium-operator-generic
level=info msg=" --instance-tags-filter=''" subsys=cilium-operator-generic
level=info msg=" --ipam='cluster-pool'" subsys=cilium-operator-generic
level=info msg=" --k8s-api-server=''" subsys=cilium-operator-generic
level=info msg=" --k8s-heartbeat-timeout='30s'" subsys=cilium-operator-generic
level=info msg=" --k8s-kubeconfig-path=''" subsys=cilium-operator-generic
level=info msg=" --k8s-namespace='kube-system'" subsys=cilium-operator-generic
level=info msg=" --k8s-service-proxy-name=''" subsys=cilium-operator-generic
level=info msg=" --kvstore=''" subsys=cilium-operator-generic
level=info msg=" --kvstore-lease-ttl='15m0s'" subsys=cilium-operator-generic
level=info msg=" --kvstore-opt=''" subsys=cilium-operator-generic
level=info msg=" --leader-election-lease-duration='15s'" subsys=cilium-operator-generic
level=info msg=" --leader-election-renew-deadline='10s'" subsys=cilium-operator-generic
level=info msg=" --leader-election-retry-period='2s'" subsys=cilium-operator-generic
level=info msg=" --limit-ipam-api-burst='4'" subsys=cilium-operator-generic
level=info msg=" --limit-ipam-api-qps='20'" subsys=cilium-operator-generic
level=info msg=" --log-driver=''" subsys=cilium-operator-generic
level=info msg=" --log-opt=''" subsys=cilium-operator-generic
level=info msg=" --nodes-gc-interval='0s'" subsys=cilium-operator-generic
level=info msg=" --operator-api-serve-addr='localhost:9234'" subsys=cilium-operator-generic
level=info msg=" --operator-pprof='false'" subsys=cilium-operator-generic
level=info msg=" --operator-pprof-port='6061'" subsys=cilium-operator-generic
level=info msg=" --operator-prometheus-serve-addr=':9963'" subsys=cilium-operator-generic
level=info msg=" --parallel-alloc-workers='50'" subsys=cilium-operator-generic
level=info msg=" --remove-cilium-node-taints='true'" subsys=cilium-operator-generic
level=info msg=" --set-cilium-is-up-condition='true'" subsys=cilium-operator-generic
level=info msg=" --skip-cnp-status-startup-clean='false'" subsys=cilium-operator-generic
level=info msg=" --skip-crd-creation='false'" subsys=cilium-operator-generic
level=info msg=" --subnet-ids-filter=''" subsys=cilium-operator-generic
level=info msg=" --subnet-tags-filter=''" subsys=cilium-operator-generic
level=info msg=" --synchronize-k8s-nodes='true'" subsys=cilium-operator-generic
level=info msg=" --synchronize-k8s-services='true'" subsys=cilium-operator-generic
level=info msg=" --unmanaged-pod-watcher-interval='15'" subsys=cilium-operator-generic
level=info msg=" --version='false'" subsys=cilium-operator-generic
level=info msg="Cilium Operator 1.12.7 cb6b0ca 2024-11-19T19:05:45+08:00 go version go1.21.5 linux/amd64" subsys=cilium-operator-generic
level=info msg="Establishing connection to apiserver" host="https://10.96.0.1:443" subsys=k8s
level=info msg="Starting apiserver on address localhost:9234" subsys=cilium-operator-api
level=info msg="Connected to apiserver" subsys=k8s
level=info msg="CRD (CustomResourceDefinition) is installed and up-to-date" name=CiliumIdentity/v2 subsys=k8s
level=info msg="CRD (CustomResourceDefinition) is installed and up-to-date" name=CiliumExternalWorkload/v2 subsys=k8s
level=info msg="CRD (CustomResourceDefinition) is installed and up-to-date" name=CiliumEndpoint/v2 subsys=k8s
level=info msg="CRD (CustomResourceDefinition) is installed and up-to-date" name=CiliumClusterwideNetworkPolicy/v2 subsys=k8s
level=info msg="CRD (CustomResourceDefinition) is installed and up-to-date" name=CiliumNetworkPolicy/v2 subsys=k8s
level=info msg="attempting to acquire leader lease kube-system/cilium-operator-resource-lock..." subsys=klog
level=info msg="successfully acquired lease kube-system/cilium-operator-resource-lock" subsys=klog
level=info msg="Leading the operator HA deployment" subsys=cilium-operator-generic
level=info msg="Starting to garbage collect stale CiliumEndpoint custom resources" subsys=cilium-operator-generic
level=info msg="Starting CRD identity garbage collector" interval=10m0s subsys=cilium-operator-generic
level=info msg="Leader election lost" operator-id=192-168-31-110-PGjqcvCCJW subsys=cilium-operator-generic
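Since the log above is long, here is a minimal sketch for pulling out only the leader-election settings and the line that reports losing the lease. It assumes the log is in logfmt form (space-separated key=value pairs with optionally double-quoted values), as it appears above; `parse_logfmt` and `summarize` are hypothetical helper names, not part of Cilium or Terway.

```python
import shlex

# Four lines copied from the log above, used as sample input.
SAMPLE = """\
level=info msg=" --leader-election-lease-duration='15s'" subsys=cilium-operator-generic
level=info msg=" --leader-election-renew-deadline='10s'" subsys=cilium-operator-generic
level=info msg=" --leader-election-retry-period='2s'" subsys=cilium-operator-generic
level=info msg="Leader election lost" operator-id=192-168-31-110-PGjqcvCCJW subsys=cilium-operator-generic
"""


def parse_logfmt(line):
    """Split one logfmt line into a dict of key -> value (hypothetical helper)."""
    fields = {}
    # shlex handles the double-quoted msg="..." values, including escaped quotes.
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return fields


def summarize(lines):
    """Return (leader-election flags, lines reporting a lost lease)."""
    flags = {}
    lost = []
    for line in lines:
        msg = parse_logfmt(line).get("msg", "")
        stripped = msg.strip()
        if stripped.startswith("--leader-election-"):
            # Flag lines look like: --name='value'
            name, _, value = stripped.partition("=")
            flags[name] = value.strip("'")
        elif "Leader election lost" in msg:
            lost.append(line)
    return flags, lost


flags, lost = summarize(SAMPLE.splitlines())
for name, value in flags.items():
    print(name, value)
print("lost-lease lines:", len(lost))
```

To run it against the full output, the lines from `docker logs` for the container can be passed to `summarize` in place of `SAMPLE.splitlines()`.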
I would also like to ask: what is the purpose of terway-controlplane? What does each of the controllers in the configuration mean? When daemonMode is ENIMultiIP and enableTrunk = true, how should controllers be set?
Kubernetes version: 1.32.0; Docker version: 20.10.3; OS: Ubuntu 20.04, kernel 5.4.0-200-generic; terway-controlplane replicas: 1
The controller mainly serves the following two purposes:
- Centralized IPAM: https://github.com/AliyunContainerService/terway/blob/main/docs/centralized-ipam.md
- Pod-level configuration: https://github.com/AliyunContainerService/terway/blob/main/docs/terway-trunk.md