
After deploying EdgeMesh, Pods on the edge node cannot be accessed via NodePort

Open Chang-Tao opened this issue 6 months ago • 0 comments

What happened:
master01 cannot reach the Nginx Pod deployed on the edge node via the Pod IP
worker01 can reach the Nginx Pod deployed on the edge node via the Pod IP
worker02 cannot reach the Nginx Pod deployed on the edge node via the Pod IP
edgenode can reach the Nginx Pod deployed on the edge node via the Pod IP

What you expected to happen:
I expect Pods deployed on the edge node to be reachable via NodePort.

Environment:

  • EdgeMesh version: 1.9.0

  • Kubernetes version (use kubectl version): 1.27.16

  • KubeEdge version(e.g. cloudcore --version and edgecore --version): 1.9.4

  • Cloud nodes Environment:
    • Hardware configuration (e.g. lscpu):
      root@k8s-master01:~# lscpu
      Architecture:             x86_64
      CPU op-mode(s):           32-bit, 64-bit
      Address sizes:            45 bits physical, 48 bits virtual
      Byte Order:               Little Endian
      CPU(s):                   8
      On-line CPU(s) list:      0-7
      Vendor ID:                GenuineIntel
      Model name:               Intel(R) Xeon(R) Silver 4214R CPU @ 2.40GHz
      CPU family:               6
      Model:                    85
      Thread(s) per core:       1
      Core(s) per socket:       8
      Socket(s):                1
      Stepping:                 7
      BogoMIPS:                 4788.74
      Flags:                    fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon nopl xtopology tsc_reliable nonstop_tsc cpuid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single ssbd ibrs ibpb stibp ibrs_enhanced fsgsbase tsc_adjust bmi1 avx2 smep bmi2 invpcid avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves arat pku ospke avx512_vnni md_clear flush_l1d arch_capabilities
      Virtualization features:
        Hypervisor vendor:      VMware
        Virtualization type:    full
      Caches (sum of all):
        L1d:                    256 KiB (8 instances)
        L1i:                    256 KiB (8 instances)
        L2:                     8 MiB (8 instances)
        L3:                     16.5 MiB (1 instance)
      NUMA:
        NUMA node(s):           1
        NUMA node0 CPU(s):      0-7

    • OS (e.g. cat /etc/os-release):
      root@k8s-master01:~# cat /etc/os-release
      PRETTY_NAME="Ubuntu 22.04.4 LTS"
      NAME="Ubuntu"
      VERSION_ID="22.04"
      VERSION="22.04.4 LTS (Jammy Jellyfish)"
      VERSION_CODENAME=jammy
      ID=ubuntu
      ID_LIKE=debian
      HOME_URL="https://www.ubuntu.com/"
      SUPPORT_URL="https://help.ubuntu.com/"
      BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
      PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
      UBUNTU_CODENAME=jammy

    • Kernel (e.g. uname -a):
      root@k8s-master01:~# uname -a
      Linux k8s-master01 5.15.0-118-generic #128-Ubuntu SMP Fri Jul 5 09:28:59 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

    • Go version (e.g. go version): No

    • Others: k8s-master01 k8s-worker01 k8s-worker02 k8s-edge01

  • Edge nodes Environment:
    • edgecore version (e.g. edgecore --version): root@k8s-edge01:~# edgecore --version KubeEdge v1.9.4
    • Hardware configuration (e.g. lscpu):
      root@k8s-edge01:~# lscpu
      Architecture:             x86_64
      CPU op-mode(s):           32-bit, 64-bit
      Address sizes:            45 bits physical, 48 bits virtual
      Byte Order:               Little Endian
      CPU(s):                   8
      On-line CPU(s) list:      0-7
      Vendor ID:                GenuineIntel
      Model name:               Intel(R) Xeon(R) Silver 4214R CPU @ 2.40GHz
      CPU family:               6
      Model:                    85
      Thread(s) per core:       1
      Core(s) per socket:       8
      Socket(s):                1
      Stepping:                 7
      BogoMIPS:                 4788.74
      Flags:                    fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon nopl xtopology tsc_reliable nonstop_tsc cpuid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single ssbd ibrs ibpb stibp ibrs_enhanced fsgsbase tsc_adjust bmi1 avx2 smep bmi2 invpcid avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves arat pku ospke avx512_vnni md_clear flush_l1d arch_capabilities
      Virtualization features:
        Hypervisor vendor:      VMware
        Virtualization type:    full
      Caches (sum of all):
        L1d:                    256 KiB (8 instances)
        L1i:                    256 KiB (8 instances)
        L2:                     8 MiB (8 instances)
        L3:                     16.5 MiB (1 instance)
      NUMA:
        NUMA node(s):           1
        NUMA node0 CPU(s):      0-7
    • OS (e.g. cat /etc/os-release):
      root@k8s-edge01:~# cat /etc/lsb-release
      DISTRIB_ID=Ubuntu
      DISTRIB_RELEASE=22.04
      DISTRIB_CODENAME=jammy
      DISTRIB_DESCRIPTION="Ubuntu 22.04.4 LTS"
    • Kernel (e.g. uname -a):
      root@k8s-master01:~# uname -a
      Linux k8s-master01 5.15.0-118-generic #128-Ubuntu SMP Fri Jul 5 09:28:59 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
    • Go version (e.g. go version): No
    • Others:


Environment setup

Virtual machines

| OS | Role | CPU | Memory | IP |
| --- | --- | --- | --- | --- |
| Ubuntu 22.04 LTS | K8s-Master01 | 8 Core | 16 GB | 192.168.101.211 |
| Ubuntu 22.04 LTS | K8s-Worker01 | 8 Core | 16 GB | 192.168.101.212 |
| Ubuntu 22.04 LTS | K8s-Worker02 | 8 Core | 16 GB | 192.168.101.213 |
| Ubuntu 22.04 LTS | K8s-Edge01 | 8 Core | 16 GB | 192.168.101.214 |

Software

| Component | Version |
| --- | --- |
| KubeSphere | v3.4.1 |
| Kubernetes | v1.27.16 |
| KubeEdge | v1.9.4 |
| EdgeMesh | v1.9.0 |
| EdgeCore | v1.9.4 |
| CloudCore | v1.9.4 |

Symptoms

After setting up EdgeMesh by following this guide, [edgemesh](https://www.kubesphere.io/zh/blogs/kubesphere-integrate-kubeedge/#%E9%83%A8%E7%BD%B2-edgemesh), I found that some nodes can reach the Pod only through its Pod IP and not through the NodePort. Details:
master01  cannot reach the Nginx Pod deployed on the edge node via the Pod IP
worker01  can reach the Nginx Pod deployed on the edge node via the Pod IP
worker02  cannot reach the Nginx Pod deployed on the edge node via the Pod IP
edgenode  can reach the Nginx Pod deployed on the edge node via the Pod IP
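The matrix above (and the NodePort results reported further down) can be re-tested consistently by running one small probe loop on each node. This is a sketch, not part of the original report; the ClusterIP, node IPs, and NodePort are the values that appear in this issue.

```shell
#!/bin/sh
# Probe the nginx Service the same way from whichever node this runs on.
# Targets: the Service ClusterIP plus the NodePort on every node IP.
CLUSTER_IP=10.233.23.240
NODE_PORT=30926
results=""
for target in "$CLUSTER_IP:80" \
              192.168.101.211:$NODE_PORT 192.168.101.212:$NODE_PORT \
              192.168.101.213:$NODE_PORT 192.168.101.214:$NODE_PORT; do
  # --connect-timeout keeps a dead target from blocking the whole loop
  if curl -s --connect-timeout 2 -o /dev/null "http://$target"; then
    status=OK
  else
    status=FAIL
  fi
  results="$results $target=$status"
  echo "$target $status"
done
```

Running this once per node gives a complete reachability matrix to compare against the expectations above.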

Relevant logs

Pod status in the kubeedge namespace
    root@k8s-master01:~# kubectl get pod -n kubeedge
    NAME                                     READY   STATUS    RESTARTS      AGE
    cloud-iptables-manager-8t57k             1/1     Running   2 (21h ago)   45h
    cloud-iptables-manager-m85kn             1/1     Running   1 (21h ago)   45h
    cloudcore-545655598f-hqs2m               1/1     Running   0             7h7m
    cloudcore-545655598f-lpt4m               1/1     Running   5 (21h ago)   21h
    edgemes-26drkv-agent-2xk6x               1/1     Running   1 (21h ago)   21h
    edgemes-26drkv-agent-5zc68               1/1     Running   6 (21h ago)   21h
    edgemes-26drkv-agent-r7427               1/1     Running   2 (21h ago)   21h
    edgemes-26drkv-server-667b9f54dc-wzk8v   1/1     Running   1 (21h ago)   21h
    edgeservice-844cb67dd9-7lxfm             1/1     Running   1 (21h ago)   45h
    edgeservice-844cb67dd9-p2kjn             1/1     Running   0             20h
Pod status on the edge node
    root@k8s-master01:~# kubectl get pod -n ai-cloud
    NAME                        READY   STATUS    RESTARTS   AGE
    nginx-v1-6886cb97bf-n2njt   1/1     Running   0          20h
Pod configuration
    root@k8s-master01:~# kubectl describe pod nginx-v1-6886cb97bf-n2njt -n ai-cloud
    Name:             nginx-v1-6886cb97bf-n2njt
    Namespace:        ai-cloud
    Priority:         0
    Service Account:  default
    Node:             edgenode-qfur/192.168.101.214
    Start Time:       Fri, 09 Aug 2024 21:30:31 +0800
    Labels:           app=nginx
                      pod-template-hash=6886cb97bf
                      version=v1
    Annotations:      cni.projectcalico.org/ipv4pools: ["default-ipv4-ippool"]
                      kubesphere.io/creator: admin
                      kubesphere.io/imagepullsecrets: {}
    Status:           Running
    IP:               172.17.0.2
    IPs:
      IP:           172.17.0.2
    Controlled By:  ReplicaSet/nginx-v1-6886cb97bf
    Containers:
      container-kdnbab:
        Container ID:   docker://884a60b51263acfa90e6f0d1c85fe21e60dfeef05cd5a6c702e2ac66013f284f
        Image:          nginx:stable-perl
        Image ID:       docker-pullable://nginx@sha256:bb99ae95b8ce6a10d397d0b8998cfe12ac055baabd917be9e00cd095991b8630
        Port:           80/TCP
        Host Port:      0/TCP
        State:          Running
          Started:      Fri, 09 Aug 2024 21:30:31 +0800
        Ready:          True
        Restart Count:  0
        Environment:    <none>
        Mounts:
          /etc/localtime from host-time (ro)
          /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-kg7f5 (ro)
    Conditions:
      Type           Status
      Initialized    True 
      Ready          True 
      PodScheduled   True 
    Volumes:
      host-time:
        Type:          HostPath (bare host directory volume)
        Path:          /etc/localtime
        HostPathType:  
      kube-api-access-kg7f5:
        Type:                    Projected (a volume that contains injected data from multiple sources)
        TokenExpirationSeconds:  3607
        ConfigMapName:           kube-root-ca.crt
        ConfigMapOptional:       <nil>
        DownwardAPI:             true
    QoS Class:                   BestEffort
    Node-Selectors:              kubernetes.io/hostname=edgenode-qfur
    Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
    Events:                      <none>
    root@k8s-master01:~# kubectl get pod nginx-v1-6886cb97bf-n2njt -n ai-cloud -o=jsonpath='{.status.podIP} {.status.phase} {.status.containerStatuses[*].restartCount}'
    172.17.0.2 Running 0
    root@k8s-master01:~# 
Service configuration
    root@k8s-master01:~# kubectl get services -n ai-cloud
    NAME    TYPE       CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE
    nginx   NodePort   10.233.23.240   <none>        80:30926/TCP   20h

    root@k8s-master01:~# kubectl describe service nginx -n ai-cloud
    Name:                     nginx
    Namespace:                ai-cloud
    Labels:                   app=nginx
                              version=v1
    Annotations:              kubesphere.io/creator: admin
                              kubesphere.io/serviceType: statelessservice
    Selector:                 app=nginx
    Type:                     NodePort
    IP Family Policy:         SingleStack
    IP Families:              IPv4
    IP:                       10.233.23.240
    IPs:                      10.233.23.240
    Port:                     tcp-80  80/TCP
    TargetPort:               80/TCP
    NodePort:                 tcp-80  30926/TCP
    Endpoints:                172.17.0.2:80
    Session Affinity:         None
    External Traffic Policy:  Cluster
    Events:                   <none>
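The Service and its endpoint look healthy, so the next layer to inspect is the NodePort plumbing itself. A hedged sketch of the usual checks (service name, namespace, and port 30926 come from this issue; each step is skipped when its tool is absent on the node):

```shell
#!/bin/sh
# Sketch of NodePort plumbing checks; safe to paste on any node.
checks=0
if command -v kubectl >/dev/null 2>&1; then
  checks=$((checks + 1))
  # kube-proxy must be Running on every node expected to serve the NodePort;
  # a KubeEdge edge node normally does not run kube-proxy at all
  kubectl get pods -n kube-system -l k8s-app=kube-proxy -o wide
  # the Service must still list the Pod as a ready endpoint
  kubectl get endpoints nginx -n ai-cloud
fi
if command -v iptables-save >/dev/null 2>&1; then
  checks=$((checks + 1))
  # kube-proxy in iptables mode installs KUBE-NODEPORTS rules for the port
  iptables-save 2>/dev/null | grep 30926 || echo "no iptables rule mentions 30926"
fi
echo "$checks tool(s) available on this node"
```

A node where no iptables rule mentions 30926 cannot answer on the NodePort, regardless of what the Service object says.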
Connecting to nginx from the master node
    root@k8s-master01:~# curl 10.233.23.240:80
    curl: (7) Failed to connect to 10.233.23.240 port 80 after 3060 ms: No route to host
Connecting to nginx from the worker01 node
    root@k8s-worker01:~# curl 10.233.23.240:80
    <!DOCTYPE html>
    <html>
    <head>
    <title>Welcome to nginx!</title>
    <style>
    html { color-scheme: light dark; }
    body { width: 35em; margin: 0 auto;
    font-family: Tahoma, Verdana, Arial, sans-serif; }
    </style>
    </head>
    <body>
    <h1>Welcome to nginx!</h1>
    <p>If you see this page, the nginx web server is successfully installed and
    working. Further configuration is required.</p>
    
    <p>For online documentation and support please refer to
    <a href="http://nginx.org/">nginx.org</a>.<br/>
    Commercial support is available at
    <a href="http://nginx.com/">nginx.com</a>.</p>
    
    <p><em>Thank you for using nginx.</em></p>
    </body>
    </html>
Connecting to nginx from the worker02 node
    root@k8s-worker02:~# curl 10.233.23.240:80
    curl: (7) Failed to connect to 10.233.23.240 port 80 after 3061 ms: No route to host
Connecting to nginx from the edge01 node
    root@k8s-edge01:~# curl 10.233.23.240:80
    <!DOCTYPE html>
    <html>
    <head>
    <title>Welcome to nginx!</title>
    <style>
    html { color-scheme: light dark; }
    body { width: 35em; margin: 0 auto;
    font-family: Tahoma, Verdana, Arial, sans-serif; }
    </style>
    </head>
    <body>
    <h1>Welcome to nginx!</h1>
    <p>If you see this page, the nginx web server is successfully installed and
    working. Further configuration is required.</p>
    
    <p>For online documentation and support please refer to
    <a href="http://nginx.org/">nginx.org</a>.<br/>
    Commercial support is available at
    <a href="http://nginx.com/">nginx.com</a>.</p>
    
    <p><em>Thank you for using nginx.</em></p>
    </body>
    </html>
NodePort connections to each node, tested from k8s-master01
    root@k8s-master01:~# curl 192.168.101.211:30926
    curl: (7) Failed to connect to 192.168.101.211 port 30926 after 3069 ms: No route to host
    root@k8s-master01:~# curl 192.168.101.212:30926
    curl: (7) Failed to connect to 192.168.101.212 port 30926 after 3056 ms: No route to host
    root@k8s-master01:~# curl 192.168.101.213:30926
    curl: (7) Failed to connect to 192.168.101.213 port 30926 after 3074 ms: No route to host
    root@k8s-master01:~# curl 192.168.101.214:30926
    curl: (7) Failed to connect to 192.168.101.214 port 30926 after 0 ms: Connection refused
NodePort connections to each node, tested from k8s-worker01
    root@k8s-worker01:~# curl 192.168.101.211:30926
    curl: (7) Failed to connect to 192.168.101.211 port 30926 after 3078 ms: No route to host
    root@k8s-worker01:~# curl 192.168.101.212:30926
    curl: (7) Failed to connect to 192.168.101.212 port 30926 after 3063 ms: No route to host
    root@k8s-worker01:~# curl 192.168.101.213:30926
    curl: (7) Failed to connect to 192.168.101.213 port 30926 after 3076 ms: No route to host
    root@k8s-worker01:~# curl 192.168.101.214:30926
    curl: (7) Failed to connect to 192.168.101.214 port 30926 after 0 ms: Connection refused
NodePort connections to each node, tested from k8s-worker02
    root@k8s-worker02:~# curl 10.233.23.240:80
    curl: (7) Failed to connect to 10.233.23.240 port 80 after 3061 ms: No route to host
    root@k8s-worker02:~# curl 192.168.101.211:30926
    curl: (7) Failed to connect to 192.168.101.211 port 30926 after 3080 ms: No route to host
    root@k8s-worker02:~# curl 192.168.101.212:30926
    curl: (7) Failed to connect to 192.168.101.212 port 30926 after 3068 ms: No route to host
    root@k8s-worker02:~# curl 192.168.101.213:30926
    curl: (7) Failed to connect to 192.168.101.213 port 30926 after 3057 ms: No route to host
    root@k8s-worker02:~# curl 192.168.101.214:30926
    curl: (7) Failed to connect to 192.168.101.214 port 30926 after 0 ms: Connection refused
NodePort connections to each node, tested from k8s-edge01
    root@k8s-edge01:~# curl 192.168.101.211:30926
    curl: (7) Failed to connect to 192.168.101.211 port 30926 after 3059 ms: No route to host
    root@k8s-edge01:~# curl 192.168.101.212:30926
    curl: (7) Failed to connect to 192.168.101.212 port 30926 after 3055 ms: No route to host
    root@k8s-edge01:~# curl 192.168.101.213:30926
    curl: (7) Failed to connect to 192.168.101.213 port 30926 after 3076 ms: No route to host
    root@k8s-edge01:~# curl 192.168.101.214:30926
    curl: (7) Failed to connect to 192.168.101.214 port 30926 after 0 ms: Connection refused
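The transcripts above show two distinct failure modes: `No route to host` against the cloud nodes' NodePort, and an instant `Connection refused` against the edge node's. These usually point at different layers. A tiny classifier (a hypothetical helper encoding general networking heuristics, not a confirmed diagnosis for this cluster):

```shell
#!/bin/sh
# Map curl error text to the layer that most likely produced it.
classify() {
  case "$1" in
    *"No route to host"*)
      # typically an ICMP host-prohibited reject: a firewall (ufw/firewalld)
      # or an iptables rule dropped the packet before any service answered
      echo "rejected-in-transit" ;;
    *"Connection refused"*)
      # the node answered with a TCP RST: reachable, but nothing listens on
      # that port -- on a KubeEdge edge node no kube-proxy programs NodePorts
      echo "no-listener" ;;
    *)
      echo "other" ;;
  esac
}
classify "curl: (7) Failed to connect ... No route to host"   # rejected-in-transit
classify "curl: (7) Failed to connect ... Connection refused" # no-listener
```

Under this reading, the cloud-node failures suggest host firewall or iptables interference, while the edge node's refusal suggests nothing on the edge node serves the NodePort at all.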

I want the service backed by Pods on the edge node to be reachable through the edge node's IP and port. At this point I don't know how to continue troubleshooting; any pointers would be appreciated, thanks!

Chang-Tao avatar Aug 10 '24 09:08 Chang-Tao