No route to host from all nodes to all services except the kubernetes service

Open UriZafrir opened this issue 9 months ago • 0 comments

Hi everyone, I'm running an RKE cluster. I get "no route to host" when trying to reach services from a node.

k get svc -A
argocd          argo-cd-argocd-applicationset-controller   ClusterIP      10.43.71.196    <none>                          7000/TCP                     9d
argocd          argo-cd-argocd-dex-server                  ClusterIP      10.43.60.116    <none>                          5556/TCP,5557/TCP            9d
argocd          argo-cd-argocd-redis                       ClusterIP      10.43.37.182    <none>                          6379/TCP                     9d
argocd          argo-cd-argocd-repo-server                 ClusterIP      10.43.200.3     <none>                          8081/TCP                     9d
argocd          argo-cd-argocd-server                      ClusterIP      10.43.229.66    <none>                          80/TCP,443/TCP               9d
default         kubernetes                                 ClusterIP      10.43.0.1       <none>                          443/TCP                      9d
ingress-nginx   ingress-nginx-controller                   LoadBalancer   10.43.70.189    172.20.121.173,172.20.121.174   80:30996/TCP,443:32439/TCP   9d
ingress-nginx   ingress-nginx-controller-admission         ClusterIP      10.43.137.222   <none>                          443/TCP                      9d
kube-system     kube-dns                                   ClusterIP      10.43.0.10      <none>                          53/UDP,53/TCP,9153/TCP       9d
kube-system     metrics-server                             ClusterIP      10.43.183.119   <none>                          443/TCP                      7d12h
kubeshark       kubeshark-front                            ClusterIP      10.43.200.80    <none>                          80/TCP                       7d17h
kubeshark       kubeshark-hub                              ClusterIP      10.43.162.11    <none>                          80/TCP                       7d17h
kubeshark       kubeshark-worker-metrics                   ClusterIP      10.43.64.10     <none>                          49100/TCP                    7d17h
telnet  10.43.0.10 53
Trying 10.43.0.10...
telnet: connect to address 10.43.0.10: No route to host
telnet 10.43.229.66 443
Trying 10.43.229.66...
telnet: connect to address 10.43.229.66: No route to host
telnet 10.43.70.189 80
Trying 10.43.70.189...
telnet: connect to address 10.43.70.189: No route to host
telnet 10.43.137.222 443
Trying 10.43.137.222...
telnet: connect to address 10.43.137.222: No route to host
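
The manual telnet checks above could be repeated for every service with a small script. This is only a sketch: the helper names `svc_tcp_endpoints` and `probe_tcp` are made up for illustration, the column layout is assumed to match the `k get svc -A` output shown above, and the connect test relies on bash's `/dev/tcp` plus the coreutils `timeout` command.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `kubectl get svc -A --no-headers` lines on stdin
# and print one "CLUSTER-IP PORT" pair per TCP service port.
# Assumed columns: NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
svc_tcp_endpoints() {
  awk '$4 != "None" {
    n = split($6, ports, ",")
    for (i = 1; i <= n; i++) {
      if (ports[i] !~ /\/TCP$/) continue   # a TCP connect test cannot check UDP ports
      p = ports[i]
      sub(/\/TCP$/, "", p)                 # "80:30996/TCP" -> "80:30996"
      sub(/:.*/, "", p)                    # "80:30996" -> "80" (service port, not nodePort)
      print $4, p
    }
  }'
}

# Hypothetical helper: try a TCP connect with a 3 second limit and report the result.
probe_tcp() {
  local ip=$1 port=$2
  if timeout 3 bash -c "exec 3<>/dev/tcp/$ip/$port" 2>/dev/null; then
    echo "$ip:$port reachable"
  else
    echo "$ip:$port FAILED"
  fi
}
```

Run on a node, it might be used like this: `kubectl get svc -A --no-headers | svc_tcp_endpoints | while read -r ip port; do probe_tcp "$ip" "$port"; done`. Running it on each node in turn would show whether the failure really affects every node and every service.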

This is how I debugged it. I got this error when running k get pods:

E0519 05:23:36.925419 1110186 memcache.go:287] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request

Checking the APIServices, I found a FailedDiscoveryCheck for metrics-server:

kubectl get apiservices
v1beta1.metrics.k8s.io                 kube-system/metrics-server   False (FailedDiscoveryCheck)   7d12h
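
To check whether metrics-server is the only aggregated API that is failing, the same output could be filtered for entries whose AVAILABLE column is not `True`. This is a sketch with a made-up helper name, `unavailable_apiservices`, and it assumes the column order NAME, SERVICE, AVAILABLE, AGE shown above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `kubectl get apiservices --no-headers` lines on stdin
# and print name, backing service, and reason for every APIService
# whose AVAILABLE column is not "True".
unavailable_apiservices() {
  awk '$3 != "True" { print $1, $2, $4 }'
}
```

Possible usage: `kubectl get apiservices --no-headers | unavailable_apiservices`.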

Describing the APIService gave this message (full output below):

Message: failing or missing response from https://10.43.183.119:443/apis/metrics.k8s.io/v1beta1: Get "https://10.43.183.119:443/apis/metrics.k8s.io/v1beta1": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)

kubectl describe apiservice v1beta1.metrics.k8s.io
E0519 05:30:38.505746 1113885 memcache.go:287] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
E0519 05:30:38.535446 1113885 memcache.go:121] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
E0519 05:30:38.538759 1113885 memcache.go:121] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
E0519 05:30:38.542372 1113885 memcache.go:121] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
Name:         v1beta1.metrics.k8s.io
Namespace:
Labels:       k8s-app=metrics-server
Annotations:  <none>
API Version:  apiregistration.k8s.io/v1
Kind:         APIService
Metadata:
  Creation Timestamp:  2024-05-11T13:38:43Z
  Resource Version:    1332438
  UID:                 ae69ae9d-f893-400b-b993-7be2e8af833b
Spec:
  Group:                     metrics.k8s.io
  Group Priority Minimum:    100
  Insecure Skip TLS Verify:  true
  Service:
    Name:            metrics-server
    Namespace:       kube-system
    Port:            443
  Version:           v1beta1
  Version Priority:  100
Status:
  Conditions:
    Last Transition Time:  2024-05-11T13:38:43Z
    Message:               failing or missing response from https://10.43.183.119:443/apis/metrics.k8s.io/v1beta1: Get "https://10.43.183.119:443/apis/metrics.k8s.io/v1beta1": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
    Reason:                FailedDiscoveryCheck
    Status:                False
    Type:                  Available
Events:                    <none>

After that I tried to telnet to the other services and discovered that the problem is not limited to the metrics-server service.

I would appreciate some assistance.

Your Environment

  • Calico version: v3.22.5
  • Flannel version: 0.3.1
  • Orchestrator version: kubernetes v1.24.10
  • Operating System and version: CentOS Stream release 9

UriZafrir, May 19 '24 07:05