coreos-kubernetes

Controller and worker automated installation scripts are missing etcd certificates location

Open kfirufk opened this issue 9 years ago • 17 comments

Hi. I have two etcd2 servers configured with TLS certificates. When I try to configure the controller or worker script to work with these servers, there is no place to set the location of the TLS certificates, so the script gets stuck in a loop, rotating connection attempts between my etcd2 servers.

I'm talking about the scripts located at https://github.com/coreos/coreos-kubernetes/tree/master/multi-node/generic

Am I right in my assumptions? If I am, I don't mind patching the script myself to add support for the required certificates, and then I'll upload a patch to this issue. I just want to know that I'm on the right track.

More information at http://stackoverflow.com/questions/39743901/installing-kubernetes-on-coreos-with-rkt-and-automated-script

Thanks!

kfirufk avatar Sep 30 '16 06:09 kfirufk

I think you may have to look in more than one place to fix it, but, for example, flannel needs these values set to work with a secure etcd cluster:

FLANNELD_ETCD_ENDPOINTS=$ETCD_ENDPOINTS
FLANNELD_ETCD_CAFILE=/path_to_ca_file
FLANNELD_ETCD_CERTFILE=/path_to_cert_file
FLANNELD_ETCD_KEYFILE=/path_to_key_file

in the controller script, at line 759.
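As a rough illustration of the idea, here is a minimal sketch of how the script could render those flannel settings, emitting the TLS entries only when all three paths are set. The function name `render_flannel_env`, the `ETCD_*` variable names for the certificate paths, and the example values are placeholders chosen for this sketch; they are not taken from the upstream script.

```shell
#!/bin/sh
# Sketch only: print flannel's etcd settings as KEY=VALUE lines.
# render_flannel_env and the ETCD_*_FILE variable names are hypothetical.
render_flannel_env() {
    echo "FLANNELD_ETCD_ENDPOINTS=${ETCD_ENDPOINTS:-}"
    # Emit the TLS entries only when all three paths were provided.
    if [ -n "${ETCD_TRUSTED_CA_FILE:-}" ] && \
       [ -n "${ETCD_CERT_FILE:-}" ] && \
       [ -n "${ETCD_KEY_FILE:-}" ]; then
        echo "FLANNELD_ETCD_CAFILE=${ETCD_TRUSTED_CA_FILE}"
        echo "FLANNELD_ETCD_CERTFILE=${ETCD_CERT_FILE}"
        echo "FLANNELD_ETCD_KEYFILE=${ETCD_KEY_FILE}"
    fi
}

# Example values (placeholders, adjust to your cluster).
ETCD_ENDPOINTS="https://10.0.0.1:2379,https://10.0.0.2:2379"
ETCD_TRUSTED_CA_FILE="/path_to_ca_file"
ETCD_CERT_FILE="/path_to_cert_file"
ETCD_KEY_FILE="/path_to_key_file"
render_flannel_env
```

The output can be redirected into whatever environment file the flannel unit reads on your system.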

camilb avatar Sep 30 '16 07:09 camilb

Hi. I updated the controller-install.sh script with the following new environment variables:

# TLS certificates
ETCD_CERT_FILE="/etc/ssl/etcd/etcd1.pem"
ETCD_KEY_FILE="/etc/ssl/etcd/etcd1-key.pem"
ETCD_TRUSTED_CA_FILE="/etc/ssl/etcd/ca.pem"

# enable/disable etcd TLS certificate support
ETCD_CLIENT_CERT_AUTH=true

# re-create all configuration files even if they exist
OVERWRITE_ALL_FILES=true

# as far as I remember, calico needs hostname-based certs
CONTROLLER_HOSTNAME="coreos-2.tux-in.com"

# used as the mount point in kubelet
ETCD_CERT_ROOT_DIR="/etc/ssl/etcd"
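To show how such a toggle and the three paths could be consumed, here is a minimal sketch that turns them into the etcd TLS flags accepted by kube-apiserver (`--etcd-cafile`, `--etcd-certfile`, `--etcd-keyfile`). The function name `etcd_tls_flags` is hypothetical and this is not necessarily how the attached script does it.

```shell
#!/bin/sh
# Sketch only: build kube-apiserver etcd TLS flags from the variables above.
# etcd_tls_flags is a hypothetical helper name.
etcd_tls_flags() {
    # TLS support switched off: emit nothing and succeed.
    if [ "${ETCD_CLIENT_CERT_AUTH:-false}" != "true" ]; then
        return 0
    fi
    # Fail early if any of the three files is missing or unreadable.
    for _f in "${ETCD_TRUSTED_CA_FILE:-}" "${ETCD_CERT_FILE:-}" "${ETCD_KEY_FILE:-}"; do
        if [ ! -r "$_f" ]; then
            echo "etcd TLS file missing or unreadable: '$_f'" >&2
            return 1
        fi
    done
    echo "--etcd-cafile=${ETCD_TRUSTED_CA_FILE} --etcd-certfile=${ETCD_CERT_FILE} --etcd-keyfile=${ETCD_KEY_FILE}"
}

# Usage (assumes the variables are already exported):
#   flags="$(etcd_tls_flags)" || exit 1
```

Checking the files before generating any manifest makes a wrong path show up as a clear error instead of a connection retry loop.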

I'm new to Kubernetes, so I'd love some feedback. I'm going to work on the worker script now.

controller-install.sh.zip

kfirufk avatar Oct 03 '16 03:10 kfirufk

I noticed that kube-policy-controller fails. I'm trying to find out why.

kfirufk avatar Oct 03 '16 04:10 kfirufk

Related issue: https://github.com/coreos/coreos-kubernetes/issues/694

robszumski avatar Oct 03 '16 06:10 robszumski

@kfirufk I fixed this for kube-aws in my case. Please check the changes in the attached script. You may also have to mount the certs for the kubelet. I keep all the certs, including the etcd ones, in "/etc/kubernetes".

controller-install-secure-etcd.sh.zip

camilb avatar Oct 03 '16 15:10 camilb

@camilb - Hi. I checked your script. In the kube-policy-controller manifest creation, you specified the location of the etcd SSL certificate files, but you did not create a mount point for that directory (/etc/etcd/ssl) to make it available inside the container. That's where I'm stuck. How did it work without it?

kfirufk avatar Oct 03 '16 15:10 kfirufk

@kfirufk I forgot to update the path there. Here is my working config:

- path: /etc/kubernetes/manifests/kube-controller-manager.yaml
  content: |
    apiVersion: v1
    kind: Pod
    metadata:
      name: kube-controller-manager
      namespace: kube-system
    spec:
      containers:
      - name: kube-controller-manager
        image: {{.HyperkubeImageRepo}}:{{.K8sVer}}
        command:
        - /hyperkube
        - controller-manager
        - --master=http://127.0.0.1:8080
        - --leader-elect=true
        - --service-account-private-key-file=/etc/kubernetes/ssl/apiserver-key.pem
        - --root-ca-file=/etc/kubernetes/ssl/ca.pem
        - --cloud-provider=aws
        resources:
          requests:
            cpu: 200m
        livenessProbe:
          httpGet:
            host: 127.0.0.1
            path: /healthz
            port: 10252
          initialDelaySeconds: 15
          timeoutSeconds: 15
        volumeMounts:
        - mountPath: /etc/kubernetes/ssl
          name: ssl-certs-kubernetes
          readOnly: true
        - mountPath: /etc/ssl/certs
          name: ssl-certs-host
          readOnly: true
      hostNetwork: true
      volumes:
      - hostPath:
          path: /etc/kubernetes/ssl
        name: ssl-certs-kubernetes
      - hostPath:
          path: /usr/share/ca-certificates
        name: ssl-certs-host

You can check the controller configuration here: https://github.com/camilb/coreos-kubernetes/blob/1.4.0-ha/multi-node/aws/pkg/config/templates/cloud-config-controller

Some paths are different, and note that this is for the "kube-aws" tool.

camilb avatar Oct 03 '16 15:10 camilb

@camilb - That's more or less what I did: I created a volume and a mount point, but after that the kube-controller-manager won't even try to start. Maybe I got some indentation wrong or made a typo. I'll diff the two and see what's different. Thanks.

kfirufk avatar Oct 03 '16 15:10 kfirufk

@kfirufk Also check the hyperkube version. I think "v1.4.0_coreos.0" is missing the CNI binaries. Try using "v1.4.0_coreos.2".

camilb avatar Oct 03 '16 15:10 camilb

@camilb - Thanks a lot. I updated the versions and fixed the typos, and now it works. Please just confirm that these are all the rkt containers that need to be up and running: kube-apiserver, kube-proxy, hyperkube, kube-policy-controller, leader-elector, kube-controller-manager, kube-scheduler.

Anyhow, I'm attaching the latest version.

controller-install.sh.gz

kfirufk avatar Oct 03 '16 17:10 kfirufk

Hi! :) First, I had some hard-coded certificate paths in the controller installer, so I fixed that. I also did the worker installer script. I didn't fully test it, but it seems OK for now.

controller-install.sh.gz

worker-install.sh.gz

kfirufk avatar Oct 03 '16 18:10 kfirufk

OK, the worker-install script starts the kube-proxy and hyperkube containers, but 'kubectl get node' shows only the master node, not the worker. Working on it.

kfirufk avatar Oct 04 '16 08:10 kfirufk

OK, I was wrong. The worker script is fine. I had set CONTROLLER_ENDPOINT to a domain without the 'https://' prefix.
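A small guard near the top of the worker script could catch this mistake early. The following is a minimal sketch; the function name `require_https_endpoint` is hypothetical and not part of the upstream script.

```shell
#!/bin/sh
# Sketch only: reject an endpoint that lacks the https:// scheme.
# require_https_endpoint is a hypothetical helper name.
require_https_endpoint() {
    case "$1" in
        # Require the scheme followed by at least one character.
        https://?*) return 0 ;;
        *)
            echo "CONTROLLER_ENDPOINT must start with https:// (got: '$1')" >&2
            return 1
            ;;
    esac
}

# Usage in the worker script (assumes CONTROLLER_ENDPOINT is set):
#   require_https_endpoint "$CONTROLLER_ENDPOINT" || exit 1
```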

kfirufk avatar Oct 04 '16 08:10 kfirufk

Welp. I already had calico-node configured, so I didn't notice that it's in the script as well. I fixed calico-node to work with the etcd TLS certificates. I still haven't verified that all the pods work, but this is what I have so far:

installer-v0.1.tar.gz

kfirufk avatar Oct 08 '16 10:10 kfirufk

I decided to create a GitHub project instead of posting here whenever I modify the script: https://github.com/kfirufk/coreos-kubernetes-multi-node-generic-install-script

kfirufk avatar Oct 09 '16 05:10 kfirufk

If you fork this repo, and create a pull request, the differences can easily be seen/reviewed and possibly merged into this repo (a lot easier than sharing zip files, or in an entirely separate project): https://help.github.com/articles/about-pull-requests

aaronlevy avatar Nov 17 '16 02:11 aaronlevy

Thanks for the info :) I'll be sure to do so next time. Since each etcd2 server has its own certificate, and I don't really want to force Kubernetes to schedule pods on specific servers, things get complicated: I would need to store the certificates of all the etcd2 servers in the same directory on each server. So instead, I just opened another etcd2 port, http://127.0.0.1:4001, and each service on the same server can use etcd2 through this address without needing a certificate.
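For anyone wanting to reproduce this, here is a minimal sketch of the listen setting it implies, using etcd's `ETCD_LISTEN_CLIENT_URLS` variable. The helper name `etcd_listen_client_urls`, the node IP, and the TLS port 2379 are assumptions for illustration; only the loopback URL comes from my setup.

```shell
#!/bin/sh
# Sketch only: TLS listener on the node IP plus plain HTTP on loopback.
# etcd_listen_client_urls, the IP and port 2379 are assumptions.
etcd_listen_client_urls() {
    node_ip="$1"
    echo "https://${node_ip}:2379,http://127.0.0.1:4001"
}

# Example: print the line for an etcd2 environment file or drop-in.
echo "ETCD_LISTEN_CLIENT_URLS=$(etcd_listen_client_urls 10.0.0.1)"
```

Keep in mind that any process on the host can then reach etcd on the loopback port without a certificate.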

kfirufk avatar Jan 26 '17 09:01 kfirufk