[BUG] CoreDNS malfunctions when an edge node is disconnected from the K8s APIServer and rebooted
What happened: I deployed CoreDNS as a DaemonSet (hostNetwork=true) so that pods can resolve domain names normally in the cloud-edge autonomy scenario. When the edge node is disconnected from the K8s APIServer and then rebooted, CoreDNS cannot get the ServerVersion or the resources under version v1 (or v1beta1) of the discovery.k8s.io group, which it uses to determine whether to enable the k8s EndpointSliceProxying feature. As a result, CoreDNS does not work properly and edge applications cannot access services through domain names.

What you expected to happen: When the edge node is disconnected from the K8s APIServer and rebooted, CoreDNS can obtain the necessary information from YurtHub, so that it starts normally and provides domain name resolution.

How to reproduce it (as minimally and precisely as possible):
- deploy yurthub on K8s 1.22
- deploy coredns in DaemonSet mode (hostNetwork=true), with the following environment variables configured in the blueprint:
      - name: KUBERNETES_SERVICE_HOST
        value: 127.0.0.1
      - name: KUBERNETES_SERVICE_PORT
        value: "10268"
- set the kubelet's DNS service IP parameter to a dummy IP
- deploy two applications to the edge node (a client app and a server app, with an in-cluster service for the server app)
- disconnect the edge node from the k8s apiserver
- reboot edge node
- inside the client pod, the server app's service domain name can no longer be resolved to its ClusterIP
Anything else we need to know?: NA
/kind bug
@doubleblink Thank you for raising this issue. The reason is that YurtHub does not currently support caching APIGroupResources and the server version. Would you be able to take this on and add the feature to YurtHub?
Simple method:
1. When the healthcheck module observes a disconnection and reconnection, obtain the K8s version information and the supported discovery groups, and store them in the directory mounted from the container volume.
2. In the cache module, write the user agent into the context of CoreDNS's non-resource requests so that these requests can be processed in the autonomous scenario.
3. Have the local module respond with the information stored in step 1 so that CoreDNS can work normally.

However, I am not familiar with OpenYurt's cache module, so I cannot offer an elegant solution yet. I will first get familiar with the cache module code.
@doubleblink OK, I will add a label to this issue so that other members of the community may take it over.
I think I can try to handle this issue within two weeks. If I fix the bug, I will open a PR.
@Sodawyx Thank you very much for taking over this work.
/assign @Sodawyx