Scope didn't show all Kubernetes nodes and all namespaces
What did you expect to happen?
Scope to show the hosts of all nodes in the Kubernetes cluster, and to show all namespaces.
What happened?
It shows only the master node's details and its system pods' details.
How to reproduce it?
$ kubectl apply -f "https://cloud.weave.works/k8s/scope.yaml?k8s-version=$(kubectl version | base64 | tr -d '\n')"
$ kubectl port-forward -n weave "$(kubectl get -n weave pod --selector=weave-scope-component=app -o jsonpath='{.items..metadata.name}')" 4040
Anything else we need to know?
Versions:
$ scope version
scope is not installed on this kubernetes
$ docker version
1.13.1
$ uname -a
Linux XXXXX 3.10.0-1062.4.3.el7.x86_64 #1 SMP Wed Nov 13 23:58:53 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
$ kubectl version
1.16
Logs:
$ docker logs weavescope
or, if using Kubernetes:
$ kubectl logs <weave-scope-pod> -n <namespace>
time="2020-01-16T08:59:02Z" level=info msg="publishing to: weave-scope-app.weave.svc.cluster.local:80"
--weave=false option.
I have the same problem; it is OK when I restart the hosts.
Thanks for opening the issue @ashoks27!
Could you please give us a bit more info about your Kubernetes cluster?
- Which cloud provider are you using?
- How many nodes does the cluster consist of?
- Is there a Weave Scope probe running on each host? (please paste the output of the kubectl get pods -n weave and kubectl get nodes commands)
- What are the logs for the Weave Scope app pod?
I have the same problem. I have a functional Kubernetes cluster built with kubeadm, running flannel and Kubernetes 1.17.2 on VMware ESXi 6.5 hosts. The VMs (masters and workers) have 2 CPUs and 8 GB RAM each, with CPU usage below 20%. We use pod security policies and have granted the weave namespace pods permission to run as root; there is no quota in the weave namespace. It is a multi-master cluster (3 masters, 3 workers), and I access weave-scope via an ingress. I had not seen this before on older versions (Kubernetes 1.16.3 on SLES 12) with weave-scope 1.12.

When I access weave-scope I see only one node (weave-scope-agent-rjqpm below), and that is the only agent that can resolve the name. The cluster agent and the weave app can resolve the name, but the remaining 5 agents cannot. I deployed a dnsutils pod in the same namespace, following https://kubernetes.io/docs/tasks/administer-cluster/dns-debugging-resolution/, and it can resolve the name. It is strange that some pods can resolve the name but others cannot.

I run SLES 15 SP1 (kernel 4.12.14-197.34-default) with Docker 19.03.5. The cluster:
NAME READY STATUS RESTARTS AGE
dnsutils 1/1 Running 1 67m
weave-scope-agent-4vzzj 1/1 Running 1 91m
weave-scope-agent-bjjlw 1/1 Running 1 91m
weave-scope-agent-gjh6w 1/1 Running 1 91m
weave-scope-agent-jt6nn 1/1 Running 1 91m
weave-scope-agent-p7nmv 1/1 Running 1 91m
weave-scope-agent-rjqpm 1/1 Running 0 91m
weave-scope-app-7f44d5786c-82pfq 1/1 Running 1 91m
weave-scope-cluster-agent-b4f45797c-8bgj6 1/1 Running 1 91m
An example log:
time="2020-03-05T15:16:33Z" level=info msg="publishing to: weave-scope-app.weave.svc.cluster.local:80"
<probe> INFO: 2020/03/05 15:16:33.669514 Basic authentication disabled
<probe> INFO: 2020/03/05 15:17:23.680591 command line args: --mode=probe --probe-only=true --probe.docker=true --probe.docker.bridge=docker0 --probe.kubernetes.role=host --probe.no-controls=true --probe.publish.interval=4.5s --probe.spy.interval=2s --weave=false weave-scope-app.weave.svc.cluster.local:80
<probe> INFO: 2020/03/05 15:17:23.680648 probe starting, version 1.12.0, ID 1e0006c7acb22bd0
<probe> ERRO: 2020/03/05 15:17:53.760928 Error checking version: Get https://checkpoint-api.weave.works/v1/check/scope-probe?arch=amd64&flag_kernel-version=4.12.14-197.34-default&flag_kubernetes_enabled=true&flag_os=linux&os=linux&signature=wNyNvwpqNYzhya33vqah9m1fGY7y3Y8jupssh5LDItU%3D&version=1.12.0: dial tcp: i/o timeout
<probe> WARN: 2020/03/05 15:18:13.687944 Cannot resolve 'weave-scope-app.weave.svc.cluster.local': lookup weave-scope-app.weave.svc.cluster.local on 10.96.0.10:53: read udp 172.20.95.115:59660->10.96.0.10:53: i/o timeout
<probe> ERRO: 2020/03/05 15:18:23.825219 Error checking version: Get https://checkpoint-api.weave.works/v1/check/scope-probe?arch=amd64&flag_kernel-version=4.12.14-197.34-default&flag_kubernetes_enabled=true&flag_os=linux&os=linux&signature=wNyNvwpqNYzhya33vqah9m1fGY7y3Y8jupssh5LDItU%3D&version=1.12.0: dial tcp: i/o timeout
DNS resolve tests
root@bvin01-k801m-01:/ # kubectl -n weave exec -ti dnsutils -- nslookup weave-scope-app.weave.svc.cluster.local
Server:    10.96.0.10
Address:   10.96.0.10#53

Name:   weave-scope-app.weave.svc.cluster.local
Address: 10.110.216.172

root@bvin01-k801m-01:/ # kubectl -n weave exec -ti weave-scope-cluster-agent-b4f45797c-8bgj6 -- nslookup weave-scope-app.weave.svc.cluster.local
nslookup: can't resolve '(null)': Name does not resolve

Name:      weave-scope-app.weave.svc.cluster.local
Address 1: 10.110.216.172 weave-scope-app.weave.svc.cluster.local

root@bvin01-k801m-01:/ # kubectl -n weave exec -ti weave-scope-app-7f44d5786c-82pfq -- nslookup weave-scope-app.weave.svc.cluster.local
nslookup: can't resolve '(null)': Name does not resolve

Name:      weave-scope-app.weave.svc.cluster.local
Address 1: 10.110.216.172 weave-scope-app.weave.svc.cluster.local

root@bvin01-k801m-01:/ # kubectl -n weave exec -ti weave-scope-agent-rjqpm -- nslookup weave-scope-app.weave.svc.cluster.local
nslookup: can't resolve '(null)': Name does not resolve

Name:      weave-scope-app.weave.svc.cluster.local
Address 1: 10.110.216.172 weave-scope-app.weave.svc.cluster.local

root@bvin01-k801m-01:/ # kubectl -n weave exec -ti weave-scope-agent-p7nmv -- nslookup weave-scope-app.weave.svc.cluster.local
nslookup: can't resolve '(null)': Name does not resolve

nslookup: can't resolve 'weave-scope-app.weave.svc.cluster.local': Try again
command terminated with exit code 1

root@bvin01-k801m-01:/ # kubectl -n weave exec -ti weave-scope-agent-jt6nn -- nslookup kubernetes.default
nslookup: can't resolve '(null)': Name does not resolve

nslookup: can't resolve 'kubernetes.default': Try again
command terminated with exit code 1

root@bvin01-k801m-01:/ # kubectl -n weave exec -ti weave-scope-agent-jt6nn -- nslookup weave-scope-app.weave.svc.cluster.local
nslookup: can't resolve '(null)': Name does not resolve

nslookup: can't resolve 'weave-scope-app.weave.svc.cluster.local': Try again
command terminated with exit code 1

root@bvin01-k801m-01:/ # kubectl -n weave exec -ti weave-scope-agent-gjh6w -- nslookup weave-scope-app.weave.svc.cluster.local
nslookup: can't resolve '(null)': Name does not resolve

nslookup: can't resolve 'weave-scope-app.weave.svc.cluster.local': Try again
command terminated with exit code 1

root@bvin01-k801m-01:/ # kubectl -n weave exec -ti weave-scope-agent-bjjlw -- nslookup weave-scope-app.weave.svc.cluster.local
nslookup: can't resolve '(null)': Name does not resolve

nslookup: can't resolve 'weave-scope-app.weave.svc.cluster.local': Try again
command terminated with exit code 1

root@bvin01-k801m-01:/ # kubectl -n weave exec -ti weave-scope-agent-4vzzj -- nslookup weave-scope-app.weave.svc.cluster.local
nslookup: can't resolve '(null)': Name does not resolve

nslookup: can't resolve 'weave-scope-app.weave.svc.cluster.local': Try again
command terminated with exit code 1
I even turned the dnsutils pod into a deployment and tested from all 3 workers, and I could do the DNS resolution, so the network and DNS seem to be working in the cluster.
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
dnsutil-dp-75d66fc4bc-dffwm 1/1 Running 0 13s 10.244.3.17 bvin01-k801w-02 <none> <none>
weave-scope-agent-4vzzj 1/1 Running 1 104m 172.20.95.115 bvin01-k801m-03 <none> <none>
weave-scope-agent-bjjlw 1/1 Running 1 104m 172.20.95.125 bvin01-k801w-02 <none> <none>
weave-scope-agent-gjh6w 1/1 Running 1 104m 172.20.95.126 bvin01-k801w-03 <none> <none>
weave-scope-agent-jt6nn 1/1 Running 1 104m 172.20.95.124 bvin01-k801w-01 <none> <none>
weave-scope-agent-p7nmv 1/1 Running 1 104m 172.20.95.94 bvin01-k801m-02 <none> <none>
weave-scope-agent-rjqpm 1/1 Running 0 104m 172.20.95.91 bvin01-k801m-01 <none> <none>
weave-scope-app-7f44d5786c-82pfq 1/1 Running 1 104m 10.244.2.15 bvin01-k801w-01 <none> <none>
weave-scope-cluster-agent-b4f45797c-8bgj6 1/1 Running 1 104m 10.244.2.22 bvin01-k801w-01 <none> <none>
[infra] root@bvin01-k801m-01:/export/home/res/weave # kubectl -n weave exec -ti dnsutil-dp-75d66fc4bc-dffwm -- nslookup weave-scope-app.weave.svc.cluster.local
Server: 10.96.0.10
Address: 10.96.0.10#53
Name: weave-scope-app.weave.svc.cluster.local
Address: 10.110.216.172
root@bvin01-k801m-01:/ # kubectl -n weave delete pod dnsutil-dp-75d66fc4bc-dffwm
pod "dnsutil-dp-75d66fc4bc-dffwm" deleted
root@bvin01-k801m-01:/ # kubectl -n weave get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
dnsutil-dp-75d66fc4bc-xhh9x 1/1 Running 0 2m3s 10.244.2.23 bvin01-k801w-01 <none> <none>
weave-scope-agent-4vzzj 1/1 Running 1 107m 172.20.95.115 bvin01-k801m-03 <none> <none>
weave-scope-agent-bjjlw 1/1 Running 1 107m 172.20.95.125 bvin01-k801w-02 <none> <none>
weave-scope-agent-gjh6w 1/1 Running 1 107m 172.20.95.126 bvin01-k801w-03 <none> <none>
weave-scope-agent-jt6nn 1/1 Running 1 107m 172.20.95.124 bvin01-k801w-01 <none> <none>
weave-scope-agent-p7nmv 1/1 Running 1 107m 172.20.95.94 bvin01-k801m-02 <none> <none>
weave-scope-agent-rjqpm 1/1 Running 0 107m 172.20.95.91 bvin01-k801m-01 <none> <none>
weave-scope-app-7f44d5786c-82pfq 1/1 Running 1 107m 10.244.2.15 bvin01-k801w-01 <none> <none>
weave-scope-cluster-agent-b4f45797c-8bgj6 1/1 Running 1 107m 10.244.2.22 bvin01-k801w-01 <none> <none>
root@bvin01-k801m-01:/ # kubectl -n weave exec -ti dnsutil-dp-75d66fc4bc-xhh9x -- nslookup weave-scope-app.weave.svc.cluster.local
Server: 10.96.0.10
Address: 10.96.0.10#53
Name: weave-scope-app.weave.svc.cluster.local
Address: 10.110.216.172
root@bvin01-k801m-01:/ # kubectl -n weave delete pod dnsutil-dp-75d66fc4bc-xhh9x
pod "dnsutil-dp-75d66fc4bc-xhh9x" deleted
root@bvin01-k801m-01:/ # kubectl -n weave get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
dnsutil-dp-75d66fc4bc-2b9xk 1/1 Running 0 6m43s 10.244.4.17 bvin01-k801w-03 <none> <none>
weave-scope-agent-4vzzj 1/1 Running 1 119m 172.20.95.115 bvin01-k801m-03 <none> <none>
weave-scope-agent-bjjlw 1/1 Running 2 119m 172.20.95.125 bvin01-k801w-02 <none> <none>
weave-scope-agent-gjh6w 1/1 Running 1 119m 172.20.95.126 bvin01-k801w-03 <none> <none>
weave-scope-agent-jt6nn 1/1 Running 2 119m 172.20.95.124 bvin01-k801w-01 <none> <none>
weave-scope-agent-p7nmv 1/1 Running 1 119m 172.20.95.94 bvin01-k801m-02 <none> <none>
weave-scope-agent-rjqpm 1/1 Running 0 119m 172.20.95.91 bvin01-k801m-01 <none> <none>
weave-scope-app-7f44d5786c-9kn2v 1/1 Running 0 117s 10.244.4.20 bvin01-k801w-03 <none> <none>
weave-scope-cluster-agent-b4f45797c-l8vl5 1/1 Running 0 117s 10.244.4.21 bvin01-k801w-03 <none> <none>
root@bvin01-k801m-01:/ # kubectl -n weave exec -ti dnsutil-dp-75d66fc4bc-2b9xk -- nslookup weave-scope-app.weave.svc.cluster.local
Server: 10.96.0.10
Address: 10.96.0.10#53
Name: weave-scope-app.weave.svc.cluster.local
Address: 10.110.216.172
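For anyone reproducing these tests: the dnsutils pod comes from the Kubernetes DNS-debugging guide linked earlier, and a manifest along these lines was deployed into the weave namespace (a sketch based on that guide; the image tag is an assumption):

```yaml
# dnsutils.yaml - throwaway pod for DNS checks, per the Kubernetes
# DNS-debugging guide; image and tag are assumptions, not from the thread.
apiVersion: v1
kind: Pod
metadata:
  name: dnsutils
  namespace: weave
spec:
  containers:
    - name: dnsutils
      image: registry.k8s.io/e2e-test-images/jessie-dnsutils:1.3
      command: ["sleep", "infinity"]
  restartPolicy: Always
```

Running kubectl -n weave exec -ti dnsutils -- nslookup weave-scope-app.weave.svc.cluster.local then exercises the same cluster resolver (10.96.0.10) that the failing agents time out against.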
@husa570 please open a new issue; managing multiple threads of conversation in a GitHub issue is impossible.
Suggest you look at the dnsPolicy on your pods - it should be as we have in the example config: https://github.com/weaveworks/scope/blob/7c838affaaa0ca12f8510c2d28f1e1853fa85d2e/examples/k8s/ds.yaml#L49
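The relevant part of that example DaemonSet pod spec looks roughly like this (a paraphrased sketch, not the exact file; the container name and image tag are illustrative):

```yaml
# Excerpt (paraphrased) of the weave-scope-agent DaemonSet pod spec.
# The probe runs on the host network, so it needs
# ClusterFirstWithHostNet in order to still use the cluster DNS.
spec:
  template:
    spec:
      hostNetwork: true
      hostPID: true
      dnsPolicy: ClusterFirstWithHostNet
      containers:
        - name: scope-agent          # illustrative name
          image: weaveworks/scope:1.12.0  # version taken from the logs above
```

With hostNetwork: true, the default dnsPolicy (ClusterFirst) would make the pod use the node's resolver instead of the cluster DNS, which is why the setting matters here.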
I have dnsPolicy ClusterFirstWithHostNet (since I use the default) in the daemonset. I even tried ClusterFirst, but with no change. I won't open an issue at the moment, since we dropped weave-scope after spending time on the troubleshooting above. Opening an issue means we would have to keep working on the problem, but since it seems like a basic problem, perhaps triggered by our combination of software versions, it will either be solved later by someone else, or it is just us who have these problems with weave-scope in the fantastic world of Kubernetes.

Even after deleting weave and its namespace and redeploying in the cluster, the error stays the same, so it is consistent. We are running several other applications in other namespaces in the cluster without any problems. So from my point of view this issue is a non-issue and weave-scope works as designed; it's just that we can't use it.
Same issue on a bare-metal k8s cluster set up with kubeadm. Tried installing it with the YAML above, as well as with Helm; no difference: DNS errors show up, and indeed nslookup fails within the agents.
Everything else works fine on this 10-node cluster, including DNS from any other pod. What's even weirder in my case is that the nodes come and go: I never have more than 2-3 show up at the same time, but they randomly pop up, then go away.
I changed hostNetwork: to false (hostNetwork: false) in the daemonset for the weave-scope-agent, and then it seems to work.
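The workaround above amounts to editing the agent DaemonSet (e.g. kubectl -n weave edit daemonset weave-scope-agent, the name being inferred from the pod names above) so the pod spec reads along these lines (a sketch, not a tested config):

```yaml
# Sketch of the changed weave-scope-agent pod spec fields.
# With hostNetwork: false the pod gets a pod IP and uses the
# cluster DNS directly, so ClusterFirst is sufficient.
spec:
  template:
    spec:
      hostNetwork: false
      dnsPolicy: ClusterFirst  # ClusterFirstWithHostNet only matters with hostNetwork: true
```

Note the trade-off reported in the next comment: off the host network, the probe can no longer report the node's hostname and shows up under the pod name instead.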
@husa570 I set: hostNetwork: false
Now I get 6 of the 9 nodes to show up, but they are listed under the agent name instead of the host name (weave-scope-agent-XXXXX), and they are still coming and going randomly
The nodes that don't show up say the following in the logs:
10.0.0.200 is the IP of my Traefik ingress, so I have no idea why the agents are trying to fetch anything from there...