coreos-kubernetes
API server isn't connectable on 10.3.0.1 after manual setup
I've set up Kubernetes on CoreOS beta (version 1353.4.0) according to the official guide, but the API server isn't working properly afterwards. At least it isn't reachable on https://10.3.0.1, which causes the DNS addon, for example, to fail. I can't connect to https://10.3.0.1 from within pods either, e.g. via wget.
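For reference, this is roughly how I tested reachability from inside a pod (a sketch: the pod name is arbitrary, and since BusyBox's wget may not speak HTTPS it talks plain HTTP to port 443 — the point is only to tell a quick protocol error, meaning TCP works, from an i/o timeout, meaning it doesn't):

```sh
# Throwaway pod; any immediate error means the TCP path to the
# Service IP is fine, while a hang followed by a timeout means it isn't.
kubectl run -i -t nettest --image=busybox --restart=Never -- \
  wget -T 5 -O- http://10.3.0.1:443
kubectl delete pod nettest  # clean up afterwards
```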
As a result, I'm seeing errors like this in the kube-dns pod:

```
E0419 22:06:50.952662 1 reflector.go:199] k8s.io/dns/vendor/k8s.io/client-go/tools/cache/reflector.go:94: Failed to list *v1.Service: Get https://10.3.0.1:443/api/v1/services?resourceVersion=0: dial tcp 10.3.0.1:443: i/o timeout
I0419 22:07:20.444655 1 dns.go:174] DNS server not ready, retry in 500 milliseconds
F0419 22:07:20.944622 1 dns.go:168] Timeout waiting for initialization
```
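(For anyone following along: the lines above came from the `kubedns` container's log, via something like the following — the pod name suffix will differ on your cluster.)

```sh
# Find the kube-dns pod, then dump the kubedns container's log.
kubectl get pods --namespace=kube-system -l k8s-app=kube-dns
kubectl logs --namespace=kube-system <kube-dns-pod-name> -c kubedns
```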
Had this issue using the single-node setup script.
We found that there were a couple of problems:
- The `kube-dns` Deployment & Pods were not running; only `kube-dns-autoscaler` was.
- `kube-dns-autoscaler` had errors in its log trying to talk to `kube-apiserver`.
- `kube-apiserver` was not contactable on the `10.3.0.1` Service IP.
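A quick way to confirm those symptoms, sketched from memory (namespaces and names as in the manifests below):

```sh
# The kube-dns Deployment/Pods were missing; only the autoscaler ran,
# and its log showed it failing to reach kube-apiserver.
kubectl get deployments,pods --namespace=kube-system
kubectl logs --namespace=kube-system <kube-dns-autoscaler-pod-name>
# The Service IP itself timed out (see the wget test above).
```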
To fix it on our cluster (YMMV), here is what we did:
- Add file `/etc/kubernetes/manifests/kube-dns.yaml`:

```yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  generation: 1
  labels:
    k8s-app: kube-dns
    kubernetes.io/cluster-service: "true"
  name: kube-dns
  namespace: kube-system
spec:
  replicas: 1
  strategy:
    rollingUpdate:
      maxSurge: 10%
      maxUnavailable: 0
    type: RollingUpdate
  template:
    metadata:
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ""
      creationTimestamp: null
      labels:
        k8s-app: kube-dns
    spec:
      containers:
      - args:
        - --domain=cluster.local.
        - --dns-port=10053
        - --config-dir=/kube-dns-config
        - --v=2
        env:
        - name: PROMETHEUS_PORT
          value: "10055"
        image: gcr.io/google_containers/k8s-dns-kube-dns-amd64:1.14.1
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 5
          httpGet:
            path: /healthcheck/kubedns
            port: 10054
            scheme: HTTP
          initialDelaySeconds: 60
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 5
        name: kubedns
        ports:
        - containerPort: 10053
          name: dns-local
          protocol: UDP
        - containerPort: 10053
          name: dns-tcp-local
          protocol: TCP
        - containerPort: 10055
          name: metrics
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          httpGet:
            path: /readiness
            port: 8081
            scheme: HTTP
          initialDelaySeconds: 3
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 5
        resources:
          limits:
            memory: 170Mi
          requests:
            cpu: 100m
            memory: 70Mi
      - args:
        - -v=2
        - -logtostderr
        - -configDir=/etc/k8s/dns/dnsmasq-nanny
        - -restartDnsmasq=true
        - --
        - -k
        - --cache-size=1000
        - --log-facility=-
        - --server=/cluster.local/127.0.0.1#10053
        - --server=/in-addr.arpa/127.0.0.1#10053
        - --server=/ip6.arpa/127.0.0.1#10053
        image: gcr.io/google_containers/k8s-dns-dnsmasq-nanny-amd64:1.14.1
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 5
          httpGet:
            path: /healthcheck/dnsmasq
            port: 10054
            scheme: HTTP
          initialDelaySeconds: 60
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 5
        name: dnsmasq
        ports:
        - containerPort: 53
          name: dns
          protocol: UDP
        - containerPort: 53
          name: dns-tcp
          protocol: TCP
        resources:
          requests:
            cpu: 150m
            memory: 20Mi
      - args:
        - --v=2
        - --logtostderr
        - --probe=kubedns,127.0.0.1:10053,kubernetes.default.svc.cluster.local,5,A
        - --probe=dnsmasq,127.0.0.1:53,kubernetes.default.svc.cluster.local,5,A
        image: gcr.io/google_containers/k8s-dns-sidecar-amd64:1.14.1
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 5
          httpGet:
            path: /metrics
            port: 10054
            scheme: HTTP
          initialDelaySeconds: 60
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 5
        name: sidecar
        ports:
        - containerPort: 10054
          name: metrics
          protocol: TCP
        resources:
          requests:
            cpu: 10m
            memory: 20Mi
      dnsPolicy: Default
      nodeSelector:
        node-role.kubernetes.io/master: ""
      restartPolicy: Always
      terminationGracePeriodSeconds: 30
```
- Run `kubectl apply -f /etc/kubernetes/manifests/kube-dns.yaml`.
- Edit file `/etc/systemd/system/kubelet.service`. Add `kubelet` options `--cloud-provider=aws` and `--node-labels node-role.kubernetes.io/master=`:

```ini
[Service]
ExecStartPre=/usr/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/usr/bin/mkdir -p /opt/cni/bin
Environment=KUBELET_IMAGE_TAG=v1.5.4_coreos.0
Environment=KUBELET_IMAGE_URL=quay.io/coreos/hyperkube
Environment="RKT_RUN_ARGS=--uuid-file-save=/var/run/kubelet-pod.uuid \
  --volume dns,kind=host,source=/etc/resolv.conf \
  --mount volume=dns,target=/etc/resolv.conf \
  --volume rkt,kind=host,source=/opt/bin/host-rkt \
  --mount volume=rkt,target=/usr/bin/rkt \
  --volume var-lib-rkt,kind=host,source=/var/lib/rkt \
  --mount volume=var-lib-rkt,target=/var/lib/rkt \
  --volume stage,kind=host,source=/tmp \
  --mount volume=stage,target=/tmp \
  --volume var-log,kind=host,source=/var/log \
  --mount volume=var-log,target=/var/log"
ExecStartPre=/usr/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/usr/bin/mkdir -p /var/log/containers
ExecStartPre=-/usr/bin/rkt rm --uuid-file=/var/run/kubelet-pod.uuid
ExecStart=/usr/lib/coreos/kubelet-wrapper \
  --api-servers=http://127.0.0.1:8080 \
  --cni-conf-dir=/etc/kubernetes/cni/net.d \
  --network-plugin=cni \
  --container-runtime=docker \
  --rkt-path=/usr/bin/rkt \
  --rkt-stage1-image=coreos.com/rkt/stage1-coreos \
  --register-node=true \
  --allow-privileged=true \
  --pod-manifest-path=/etc/kubernetes/manifests \
  --hostname-override=172.17.0.53 \
  --cluster_dns=10.3.0.10 \
  --cluster_domain=cluster.local \
  --cloud-provider=aws \
  --node-labels node-role.kubernetes.io/master=
ExecStop=-/usr/bin/rkt stop --uuid-file=/var/run/kubelet-pod.uuid
Restart=always
RestartSec=10
KillMode=process

[Install]
WantedBy=multi-user.target
```
- Edit file `/etc/kubernetes/manifests/kube-apiserver.yaml`. Add `kube-apiserver` option `--advertise-address=0.0.0.0`:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: kube-apiserver
  namespace: kube-system
spec:
  hostNetwork: true
  containers:
  - name: kube-apiserver
    image: quay.io/coreos/hyperkube:v1.5.4_coreos.0
    command:
    - /hyperkube
    - apiserver
    - --bind-address=0.0.0.0
    - --etcd-servers=http://127.0.0.1:2379
    - --allow-privileged=true
    - --service-cluster-ip-range=10.3.0.0/24
    - --secure-port=443
    - --advertise-address=0.0.0.0
    - --admission-control=NamespaceLifecycle,LimitRanger,ServiceAccount,DefaultStorageClass,ResourceQuota
    - --tls-cert-file=/etc/kubernetes/ssl/apiserver.pem
    - --tls-private-key-file=/etc/kubernetes/ssl/apiserver-key.pem
    - --client-ca-file=/etc/kubernetes/ssl/ca.pem
    - --service-account-key-file=/etc/kubernetes/ssl/apiserver-key.pem
    - --runtime-config=extensions/v1beta1/networkpolicies=true
    - --anonymous-auth=false
    livenessProbe:
      httpGet:
        host: 127.0.0.1
        port: 8080
        path: /healthz
      initialDelaySeconds: 15
      timeoutSeconds: 15
    ports:
    - containerPort: 443
      hostPort: 443
      name: https
    - containerPort: 8080
      hostPort: 8080
      name: local
    volumeMounts:
    - mountPath: /etc/kubernetes/ssl
      name: ssl-certs-kubernetes
      readOnly: true
    - mountPath: /etc/ssl/certs
      name: ssl-certs-host
      readOnly: true
  volumes:
  - hostPath:
      path: /etc/kubernetes/ssl
    name: ssl-certs-kubernetes
  - hostPath:
      path: /usr/share/ca-certificates
    name: ssl-certs-host
```
- `systemctl daemon-reload` to tell systemd to re-read the `kubelet.service` file.
- `systemctl restart kubelet` to restart it. It may not apply the node label correctly in some versions, but at least it's supposed to; if it works for you, the last step may not be necessary (the sanity checks sketched after this list will tell you).
- Fix the node label on the master node. This is so `kube-dns` pods will run on the master nodes with `nodeSelector: node-role.kubernetes.io/master: ""`. SSH to the node, or replace `$(hostname)` with your node's name:

```sh
kubectl patch node $(hostname) -p '{"metadata":{"labels":{"node-role.kubernetes.io/master":""}}}'
```
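Once all of the above is applied, a sanity check along these lines should pass (a sketch; `dnstest` is an arbitrary pod name, and BusyBox's nslookup output is quirky but good enough to tell working from broken):

```sh
# The master should now carry the label kube-dns's nodeSelector asks for.
kubectl get nodes --show-labels
# kube-dns should be Running in kube-system.
kubectl get pods --namespace=kube-system -l k8s-app=kube-dns
# And cluster DNS should resolve Service names from an ordinary pod again.
kubectl run -i -t dnstest --image=busybox --restart=Never -- \
  nslookup kubernetes.default.svc.cluster.local
kubectl delete pod dnstest
```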