Sharing: adapting the nodepool serviceTopology traffic-topology feature to cilium-cni, problems and solution
What would you like to be added:
Share a solution for keeping Service traffic inside a nodepool (traffic closed loop) that is compatible with cilium-cni.
Why is this needed:
cilium provides NetworkPolicy and traffic-control capabilities that flannel does not have.
others /kind feature
Feature
| annotation Key | annotation Value | Description |
|---|---|---|
| openyurt.io/topologyKeys | kubernetes.io/hostname | traffic is routed to the same node |
| openyurt.io/topologyKeys | openyurt.io/nodepool | traffic is routed to the same nodepool |
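For illustration, the annotation can also be set on an existing Service with kubectl; the Service name below refers to the test Service created later in this post:
$ kubectl annotate service busy-box-svc openyurt.io/topologyKeys=openyurt.io/nodepool --overwrite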
Reference docs: https://openyurt.io/zh/docs/user-manuals/network/service-topology
https://kubeedge.io/blog/enable-cilium/#kubeedge-edgecore-setup
OpenYurt version: 1.5.0
OS: Debian 12
Kubernetes version: 1.31
Preparation
- Required: Kubernetes version > 1.18. From 1.21 onward the EndpointSlice feature gate has been removed (enabled by default), so no special handling is needed.
- Configure kube-proxy to use the in-cluster configuration so that it connects to the apiserver through yurt-hub (restart kube-proxy afterwards; see the command after the config below).
$ kubectl edit cm -n kube-system kube-proxy
apiVersion: v1
data:
  config.conf: |-
    clientConnection:
      #kubeconfig: /var/lib/kube-proxy/kubeconfig.conf # comment out this line
      qps: 0
    clusterCIDR: 10.244.0.0/16
    configSyncPeriod: 0s
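After saving the ConfigMap, restart the kube-proxy pods so they pick up the in-cluster configuration; a sketch, assuming the standard kubeadm label:
$ kubectl -n kube-system delete pod -l k8s-app=kube-proxy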
- Required: confirm that yurt-hub is running properly.
- Required: the yurthub component depends on yurt-manager to approve its CSRs.
- Required: create the nodepools:
$ cat << EOF | kubectl apply -f -
apiVersion: apps.openyurt.io/v1alpha1
kind: NodePool
metadata:
  name: fujian
spec:
  type: Cloud
---
apiVersion: apps.openyurt.io/v1alpha1
kind: NodePool
metadata:
  name: wuhan
spec:
  type: Edge
---
apiVersion: apps.openyurt.io/v1alpha1
kind: NodePool
metadata:
  name: wuqing
spec:
  type: Edge
EOF
- Required: add the nodes to the nodepools (by labeling them; see the example after the listing below)
# kubectl get nodepool
NAME     TYPE    READYNODES   NOTREADYNODES   AGE
fujian   Cloud   2            0               7d21h
wuhan    Edge    2            0               7d21h
wuqing   Edge    2            0               7d21h
# kubectl get nb
NAME            NUM-NODES   AGE
fujian-7rxsj8   2           7d21h
wuhan1          2           7d17h
wuqing-cb6rvn   2           7d21h
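A node joins a pool by carrying the nodepool label; a sketch, assuming the apps.openyurt.io/nodepool label used by recent OpenYurt releases (the node name is one of the edge nodes that appears later in this post):
$ kubectl label node tj-wq2-lzytest-0001 apps.openyurt.io/nodepool=wuqing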
Create the test workloads
- Service (svc)
$ cat << EOF | kubectl apply -f -
apiVersion: v1
kind: Service
metadata:
  annotations:
    openyurt.io/topologyKeys: openyurt.io/nodepool
  labels:
    app: busy-box
  name: busy-box-svc
spec:
  ports:
  - port: 3000
    protocol: TCP
    targetPort: 3000
  selector:
    app: busy-box
  type: ClusterIP
EOF
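To double-check that the topology annotation is actually on the Service (a sketch):
$ kubectl get svc busy-box-svc -o jsonpath='{.metadata.annotations.openyurt\.io/topologyKeys}'
openyurt.io/nodepool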
- YurtAppSet (yas)
apiVersion: apps.openyurt.io/v1beta1
kind: YurtAppSet
metadata:
  name: example
  namespace: default
  resourceVersion: "4501951"
  uid: 16c5e569-366c-4fd9-b2e2-379cf8ce8317
spec:
  nodepoolSelector:
    matchLabels:
      yurtappset.openyurt.io/type: nginx
  pools:
  - wuhan
  - wuqing
  - fujian
  workload:
    workloadTemplate:
      deploymentTemplate:
        metadata:
          labels:
            app: busy-box
        spec:
          replicas: 2
          selector:
            matchLabels:
              app: busy-box
          template:
            metadata:
              labels:
                app: busy-box
            spec:
              containers:
              - command:
                - nc
                - -lk
                - -p
                - "3000"
                - -e
                - /bin/hostname
                - -i
                image: busybox
                imagePullPolicy: Always
                name: busy-box
                ports:
                - containerPort: 3000
                resources: {}
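After applying the YurtAppSet, each of the three pools should be running two busy-box replicas on its own nodes; a quick check (a sketch):
$ kubectl get pod -l app=busy-box -o wide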
Test result (the traffic closed loop does NOT work)
- Confirm that both the yurt-hub cache and the host's iptables rules are set up correctly
cat /etc/kubernetes/cache/kube-proxy/endpointslices.v1.discovery.k8s.io/default/busy-box-svc-7cgbp
...
"endpoints":[
{"addresses":["192.168.3.45"],"conditions":{"ready":true,"serving":true,"terminating":false},"targetRef":{"kind":"Pod","namespace":"default","name":"example-wuqing-pd9fn-88589f6bf-hqdqd","uid":"2c1388df-847e-4087-ab56-a4a8351d8ab8"},"nodeName":"tj-wq2-lzytest-0002"},
{"addresses":["192.168.4.192"],"conditions":{"ready":true,"serving":true,"terminating":false},"targetRef":{"kind":"Pod","namespace":"default","name":"example-wuqing-pd9fn-88589f6bf-gr9g2","uid":"a0b07a2a-fe14-4068-b299-483c8a7a477f"},"nodeName":"tj-wq2-lzytest-0001"}]
KUBE-SVC-PZIRA6MO24RJXLWV tcp -- anywhere 10.103.185.179 /* default/busy-box-svc cluster IP */ tcp dpt:3000
Chain KUBE-SVC-PZIRA6MO24RJXLWV (1 references)
target prot opt source destination
KUBE-MARK-MASQ tcp -- !192.168.0.0/16 10.103.185.179 /* default/busy-box-svc cluster IP */ tcp dpt:3000
KUBE-SEP-IQNXXFW6DAPHIZPB all -- anywhere anywhere /* default/busy-box-svc -> 192.168.3.45:3000 */ statistic mode random probability 0.50000000000
KUBE-SEP-3AAZEJEAEB3WZLEG all -- anywhere anywhere /* default/busy-box-svc -> 192.168.4.192:3000 */
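For reference, output like the above can be listed on the node with something like (the chain name is the one shown above):
$ iptables -t nat -L KUBE-SERVICES | grep busy-box-svc
$ iptables -t nat -L KUBE-SVC-PZIRA6MO24RJXLWV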
- Telnet to the ClusterIP from the host keeps traffic inside the pool as expected: it only connects to pods within the same nodepool. From inside a container, however, it does not.
telnet 10.103.185.179 3000
Trying 10.103.185.179...
Connected to 10.103.185.179.
Escape character is '^]'.
192.168.3.45
telnet 10.103.185.179 3000
Trying 10.103.185.179...
Connected to 10.103.185.179.
Escape character is '^]'.
192.168.4.192
Connection closed by foreign host.
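The failing case can be reproduced from inside a pod; with cilium in the path, replies also come from endpoints outside the pool (pod name taken from the EndpointSlice dump above):
$ kubectl exec -it example-wuqing-pd9fn-88589f6bf-hqdqd -- telnet 10.103.185.179 3000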
- The suspicion is that cilium is the cause: it intercepts ClusterIP traffic from pods and forwards it directly via eBPF rather than through the iptables rules, so even with kube-proxy enabled the topology filtering does not take effect.
https://github.com/cilium/cilium/issues/28904#issuecomment-1804545547
Services:
- ClusterIP: Enabled
- NodePort: Disabled
- LoadBalancer: Disabled
- externalIPs: Disabled
- HostPort: Disabled
kubectl -n kube-system exec ds/cilium -- cilium-dbg service list
21 10.103.185.179:3000 ClusterIP 1 => 192.168.5.235:3000 (active)
2 => 192.168.3.45:3000 (active)
3 => 192.168.0.170:3000 (active)
4 => 192.168.2.181:3000 (active)
5 => 192.168.1.160:3000 (active)
6 => 192.168.4.192:3000 (active)
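Whether cilium itself is load-balancing ClusterIP traffic can be confirmed from the agent status; a sketch (the exact wording of the KubeProxyReplacement line varies between cilium versions):
$ kubectl -n kube-system exec ds/cilium -- cilium-dbg status | grep -i kubeproxyreplacement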
- In testing, tuning cilium with --set loadBalancer.serviceTopology=true did not help either, because underneath it still works off the full EndpointSlice data.
Solution
- Because the KubeEdge community already supports cilium, their approach was used as a reference: deploy cilium and a separate cilium-edge DaemonSet, where cilium-edge connects to yurt-hub.
https://kubeedge.io/blog/enable-cilium/#kubeedge-edgecore-setup
### Dump original Cilium DaemonSet configuration
> kubectl get ds -n kube-system cilium -o yaml > cilium-edgecore.yaml
### Edit and apply the following patch
> vi cilium-edgecore.yaml
### Deploy cilium-agent aligns with edgecore
> kubectl apply -f cilium-edgecore.yaml
diff --git a/cilium-edgecore.yaml b/cilium-edgecore.yaml
index bff0f0b..3d941d1 100644
--- a/cilium-edgecore.yaml
+++ b/cilium-edgecore.yaml
@@ -8,7 +8,7 @@ metadata:
     app.kubernetes.io/name: cilium-agent
     app.kubernetes.io/part-of: cilium
     k8s-app: cilium
-  name: cilium
+  name: cilium-kubeedge
   namespace: kube-system
 spec:
   revisionHistoryLimit: 10
@@ -29,6 +29,12 @@ spec:
         k8s-app: cilium
     spec:
       affinity:
+        nodeAffinity:
+          requiredDuringSchedulingIgnoredDuringExecution:
+            nodeSelectorTerms:
+            - matchExpressions:
+              - key: node-role.kubernetes.io/edge
+                operator: Exists
         podAntiAffinity:
           requiredDuringSchedulingIgnoredDuringExecution:
           - labelSelector:
@@ -39,6 +45,8 @@ spec:
       containers:
       - args:
         - --config-dir=/tmp/cilium/config-map
+        - --k8s-api-server=127.0.0.1:10550
+        - --auto-create-cilium-node-resource=true
         - --debug
         command:
         - cilium-agent
@@ -178,7 +186,9 @@ spec:
       dnsPolicy: ClusterFirst
       hostNetwork: true
       initContainers:
-      - command:
+      - args:
+        - --k8s-api-server=127.0.0.1:10550
+        command:
         - cilium
         - build-config
         env:
- Based on the changes above, convert a copy of the cilium DaemonSet into cilium-edge:
kubectl get ds -n kube-system cilium -o yaml > cilium-edgecore.yaml
1. Change cilium's env so that it connects to yurt-hub; port 10268 is yurt-hub's HTTPS port
# from
env:
- name: KUBERNETES_SERVICE_HOST
  value: {{APISERVER_EXTERNAL_IP}}
- name: KUBERNETES_SERVICE_PORT
  value: "6443"
# to (this change is needed in every container)
env:
- name: KUBERNETES_SERVICE_HOST
  value: 169.254.2.1
- name: KUBERNETES_SERVICE_PORT
  value: "10268"
# Laziness is the ladder of progress (my words)
sed -i '/- name: KUBERNETES_SERVICE_HOST/{n; s/value:.*/value: 169.254.2.1/;}' 3.yaml
sed -i '/- name: KUBERNETES_SERVICE_PORT/{n; s/value:.*/value: "10268"/;}' 3.yaml
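A quick sanity check that the sed edits landed in the working file:
$ grep -A1 'name: KUBERNETES_SERVICE' 3.yaml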
2. Change the DaemonSet name
- name: cilium
+ name: cilium-edge
3. (Optional) Change the startup arguments of some containers; the env vars appear to take precedence, so changing either one is enough
containers:
- args:
  - --auto-create-cilium-node-resource=true
initContainers:
- command:
  - cilium-dbg
  - build-config
  - --k8s-api-server=http://127.0.0.1:10261
4. Set affinity so that cilium-edge is scheduled only to edge nodes and cilium only to cloud nodes
# edge: cilium-edge
nodeSelector:
  kubernetes.io/os: linux
+ openyurt.io/is-edge-worker: "true"
# cloud: cilium
nodeSelector:
  kubernetes.io/os: linux
+ openyurt.io/is-edge-worker: "false"
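Once both DaemonSets are applied, the split can be verified by listing the agent pods; both DaemonSets keep the k8s-app=cilium label, so one selector covers them all (a sketch):
$ kubectl -n kube-system get pod -l k8s-app=cilium -o wide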
- Community member ranbom-ch offered another idea: use yurt-hub's data filter capability (resource access control):
  "Take a look at this document: https://openyurt.io/zh/docs/user-manuals/resource-access-control/
  After it is configured, just restart cilium and it works."
kubectl -n kube-system get cm yurt-hub-cfg -o yaml
apiVersion: v1
data:
  cache_agents: ""
  discardcloudservice: ""
  masterservice: ""
+ servicetopology: cilium,cilium-agent
kind: ConfigMap
metadata:
  annotations:
    meta.helm.sh/release-name: yurthub
    meta.helm.sh/release-namespace: kube-system
  creationTimestamp: "2024-12-27T02:20:08Z"
  labels:
    app.kubernetes.io/instance: yurthub
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: yurthub
    app.kubernetes.io/version: v1.5.0
    helm.sh/chart: yurthub-1.5.0
  name: yurt-hub-cfg
  namespace: kube-system
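After editing yurt-hub-cfg, restart the cilium agents so they re-list the (now filtered) EndpointSlices through yurt-hub; a sketch, assuming the default k8s-app=cilium label:
$ kubectl -n kube-system delete pod -l k8s-app=cilium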
- After the change and a restart of cilium, each cilium agent now only receives the endpoints of its own nodepool
17 10.100.140.155:3000 ClusterIP 1 => 192.168.3.37:3000 (active)
2 => 192.168.4.244:3000 (active)
17 10.100.140.155:3000 ClusterIP 1 => 192.168.0.189:3000 (active)
2 => 192.168.5.46:3000 (active)
# The master node has no yurt-hub installed, so it still sees all endpoints
ID Frontend Service Type Backend
1 10.100.140.155:3000 ClusterIP 1 => 192.168.3.251:3000 (active)
2 => 192.168.2.196:3000 (active)
3 => 192.168.5.85:3000 (active)
4 => 192.168.4.202:3000 (active)
5 => 192.168.0.119:3000 (active)
6 => 192.168.1.102:3000 (active)
- Telnet from inside a container now also connects only to in-pool endpoints
kubectl exec -it example-wuqing-pd9fn-88589f6bf-58x7b -- telnet 10.100.140.155 3000
Connected to 10.100.140.155
192.168.4.244
Connection closed by foreign host
command terminated with exit code 1
kubectl exec -it example-wuqing-pd9fn-88589f6bf-58x7b -- telnet 10.100.140.155 3000
Connected to 10.100.140.155
192.168.3.37
Connection closed by foreign host
command terminated with exit code 1