
Kubelet dependencies on host CLI tools

Open robszumski opened this issue 8 years ago • 21 comments

The kubelet calls out to various CLI tools that aren't shipped in CoreOS, things like rbd.

Open questions:

  • Do we need to ship these CLI tools in the kubelet container?
  • If not, do we define a narrower scope of storage that is supported?
  • Do we ship the CLI tools individually in containers as needed and then somehow connect them to the kubelet?

From discussion on IRC, the conformance tests don't really touch this topic. Do we need to do some work upstream on that as well?

cc: @pbx0 @mikedanese

robszumski avatar Feb 23 '16 02:02 robszumski

rbd, iscsiadm seem like the big ones. cc @saad-ali

mikedanese avatar Feb 23 '16 02:02 mikedanese

It seems like we'll need to ship these with our kubelet containers since we don't want OS auto-updates to upgrade these packages underneath the kubelet. How does kubernetes currently test these packages? Also for a given kubernetes release, how would one find out which versions of these CLI tools were tested?

We try to keep our hyperkube builds as close to upstream as possible. Right now we just add a /kubelet symlink to /hyperkube, so invoking /kubelet runs hyperkube's kubelet and the image works like a plain kubelet container.

Mike, you mentioned it would be hacky to ship the kubelet in a container but not use --containerized. We are currently doing this. What might be the consequences of this?

Also, in IRC it was mentioned that https://github.com/kubernetes/kubernetes/pull/10176 is related.

peebs avatar Feb 23 '16 02:02 peebs
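The /kubelet symlink trick mentioned above can be sketched as follows. The scratch directory is purely illustrative; inside a real hyperkube image build the alias would be a single `ln -s /hyperkube /kubelet`, relying on hyperkube dispatching on the name it is invoked as:

```shell
# Demonstrate the symlink alias against a scratch directory
# (stand-ins for the real /hyperkube and /kubelet paths):
tmp=$(mktemp -d)
touch "$tmp/hyperkube"                 # stand-in for the hyperkube binary
ln -s "$tmp/hyperkube" "$tmp/kubelet"  # /kubelet becomes an alias for /hyperkube
readlink "$tmp/kubelet"                # resolves to the hyperkube path
```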

I'm not sure about non-volume plugin CLI tools, but regarding CLI tool dependencies for volume plugins:

I don't think it makes sense for the underlying OS to ship with all possible binaries required for all volume plugins that k8s supports. At the moment, any volume plugin specific dependencies are the responsibility of the cluster admin to install on nodes. If a cluster admin wants to enable rbd volume plugin support, for example, the instructions are to first install ceph on the nodes. We'd like to improve the UX around this, but I don't think the right answer is for the underlying OS to ship with all possible volume plugin CLI tools.

saad-ali avatar Feb 23 '16 02:02 saad-ali

FWIW when we talk about running the kubelet in a container, we might be doing this slightly differently than others. Namely using "rkt-fly". More or less running the kubelet in an unconstrained chroot, but shipped via an ACI / Docker image: See: https://github.com/coreos/coreos-overlay/blob/master/app-admin/kubelet-wrapper/files/kubelet-wrapper (we don't use --containerized nor mount / --> /rootfs)

So we should be able to include any CLI tools in the hyperkube image if it seems reasonable. But sounds like maybe (in the case of volume plugins) it should still be left up to cluster admins? If that's the case, we may want to demonstrate how this could be accomplished on CoreOS.

aaronlevy avatar Feb 23 '16 02:02 aaronlevy
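For reference, an invocation of the kubelet-wrapper linked above looks roughly like this. The version tag and flags are illustrative examples, not taken from any particular deployment:

```shell
# Hypothetical kubelet-wrapper invocation (e.g. from a systemd ExecStart);
# KUBELET_VERSION selects the hyperkube ACI the wrapper fetches via rkt fly:
KUBELET_VERSION=v1.2.3_coreos.0 \
  /usr/lib/coreos/kubelet-wrapper \
  --api-servers=http://127.0.0.1:8080 \
  --config=/etc/kubernetes/manifests
```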

I don't really have any context on rkt, but the kubelet swaps out a mounter and writer if it's told it's running in a container. I haven't done a full audit of their usage, but the main issue is mounting volumes. Mounts need to propagate from the kubelet's mount namespace to the host, then back to other containers. If the secrets e2e is working then you should be fine. Does the kubelet run in host pid and net? Does the rkt API support mount propagation/shared subtrees?

https://github.com/kubernetes/kubernetes/blob/master/cmd/kubelet/app/server.go#L134-L138

mikedanese avatar Feb 23 '16 02:02 mikedanese

Or running in host mnt

mikedanese avatar Feb 23 '16 02:02 mikedanese

cc @alban, any idea about mount propagation?

mischief avatar Feb 23 '16 02:02 mischief

Also, the kubelet needs the host's view of /dev (and not a snapshot like docker run --device=[], as block devices will be hotplugged).

mikedanese avatar Feb 23 '16 03:02 mikedanese

The kubelet started by rkt-fly (kubelet-wrapper) runs in the host namespaces (host mnt, host pid, host net, etc.) but in a chroot. If kubelet-wrapper was using a --volume for / -> /rootfs, rkt-fly would set up the /rootfs volume as shared-recursive (but still in the host mnt namespace). So rbd, iscsiadm and others would be able to mount things in /rootfs and the mounts would be propagated to the real / outside of the chroot.

The kubelet started by rkt-fly also has the host's view of /dev because, despite being in a chroot, /dev is bind mounted in the chroot's /dev (see main.go#L256).

/cc @steveeJ

alban avatar Feb 23 '16 10:02 alban
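If kubelet-wrapper were extended with the / -> /rootfs mapping alban describes, the extra rkt flags might look like the sketch below. This is an assumption based on rkt's host-volume syntax (including its `recursive` option), not something taken from the actual wrapper:

```shell
# Hypothetical rkt fly invocation exposing the host root as a
# recursively-shared /rootfs inside the kubelet's chroot, so mounts made
# by rbd/iscsiadm propagate back to the real /:
rkt run \
  --stage1-from-dir=stage1-fly.aci \
  --volume rootfs,kind=host,source=/,recursive=true \
  --mount volume=rootfs,target=/rootfs \
  docker://gcr.io/google_containers/hyperkube:v1.2.3 \
  --exec /hyperkube -- kubelet
  # (remaining kubelet flags elided)
```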

rbdnamer for ceph storage (https://github.com/ceph/ceph-docker/tree/master/examples/coreos/rbdmap)

thereallukl avatar Feb 24 '16 17:02 thereallukl

is there any additional dependency to run rkt (with different stage1 images) instead of docker as container-runtime for kubelet?

thereallukl avatar Feb 24 '16 17:02 thereallukl

@lleszczu Here is the getting started guide, but keep in mind there is still some work to do before reaching feature parity

robszumski avatar Feb 24 '16 17:02 robszumski

@robszumski I get the general idea, but as there is an open discussion to put kubelet inside rkt container (https://github.com/coreos/bugs/issues/1051), there might be some additional dependencies to be considered.

thereallukl avatar Feb 24 '16 18:02 thereallukl

@alban sounds like it should work.

Just noticed there are similar requirements for network plugins. If you want to support the various cni plugins, it would be good to plop https://storage.googleapis.com/kubernetes-release/network-plugins/cni-09214926.tar.gz into /opt/bin/cni inside the kubelet image.

mikedanese avatar Feb 24 '16 18:02 mikedanese
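The CNI plugin install mikedanese describes would amount to fetching that tarball into /opt/bin/cni during the image build. The sketch below shows the shape of the commands, but simulates the download with a locally built tarball and a scratch directory so the steps are verifiable; the real build would use curl against the URL above:

```shell
# Real image build would be roughly:
#   mkdir -p /opt/bin/cni
#   curl -L https://storage.googleapis.com/kubernetes-release/network-plugins/cni-09214926.tar.gz \
#     | tar -xz -C /opt/bin/cni
# Simulated here with a fabricated tarball and scratch paths:
dest=$(mktemp -d)
mkdir -p "$dest/cni"
work=$(mktemp -d)
echo '#!/bin/sh' > "$work/bridge"            # stand-in for a CNI plugin binary
tar -cz -C "$work" bridge | tar -xz -C "$dest/cni"
ls "$dest/cni"                               # the plugin lands where the kubelet expects it
```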

Kubernetes slack user @alvin ran into this trying to mount a Gluster volume.

robszumski avatar Apr 19 '16 20:04 robszumski

/proc seems like another path required by kubelet that is not currently accounted for in kubelet-wrapper. Here's an excerpt of kubelet's logs when I tried to run it with rkt fly:

[  382.749400] hyperkube[4]: I0510 01:19:49.767214       4 server.go:683] Watching apiserver
[  382.761948] hyperkube[4]: W0510 01:19:49.779739       4 plugins.go:156] can't set sysctl net/bridge/bridge-nf-call-iptables: open /proc/sys/net/bridge/bridge-nf-call-iptables: no such file or directory
...
[  383.034925] hyperkube[4]: E0510 01:19:50.029177       4 kubelet.go:1016] Failed to start ContainerManager [open /proc/sys/vm/overcommit_memory: read-only file system, open /proc/sys/kernel/panic: read-only file system]
[  383.035156] hyperkube[4]: I0510 01:19:50.029190       4 manager.go:123] Starting to sync pod status with apiserver
[  383.035328] hyperkube[4]: I0510 01:19:50.029206       4 kubelet.go:2356] Starting kubelet main sync loop.
[  383.035485] hyperkube[4]: I0510 01:19:50.029216       4 kubelet.go:2365] skipping pod synchronization - [Failed to start ContainerManager [open /proc/sys/vm/overcommit_memory: read-only file system, open /proc/sys/kernel/panic: read-only file system] container runtime is down]

Here's the exact command I'm running with:

rkt --insecure-options image run --volume etc-kubernetes,kind=host,source=/etc/kubernetes --volume etc-ssl-certs,kind=host,source=/usr/share/ca-certificates --volume var-lib-docker,kind=host,source=/var/lib/docker --volume var-lib-kubelet,kind=host,source=/var/lib/kubelet --volume os-release,kind=host,source=/usr/lib/os-release --volume run,kind=host,source=/run --mount volume=etc-kubernetes,target=/etc/kubernetes --mount volume=etc-ssl-certs,target=/etc/ssl/certs --mount volume=var-lib-docker,target=/var/lib/docker --mount volume=var-lib-kubelet,target=/var/lib/kubelet --mount volume=os-release,target=/etc/os-release --mount volume=run,target=/run docker://gcr.io/google_containers/hyperkube:v1.2.3 --exec /hyperkube -- kubelet --allow-privileged=true --api-servers=http://127.0.0.1:8080 --cadvisor-port=0 --cluster-dns=10.3.0.10 --cluster-domain=cluster.local --config=/etc/kubernetes/manifests --hostname-override=10.0.1.122 --logtostderr=true --register-schedulable=false --v=2

Note that I'm running this directly, not using kubelet-wrapper, because I want to use the official gcr.io hyperkube, not the CoreOS-specific one on quay.io. All the options to rkt in the above command are taken from kubelet-wrapper, though.

jimmycuadra avatar May 10 '16 02:05 jimmycuadra
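Based on the read-only /proc/sys errors in the log above, one candidate workaround would be adding a /proc volume to the rkt command, mirroring the syntax of the other volumes already in it. This is an untested sketch; whether the fly stage1 honors it over its own /proc handling is an open question:

```shell
# Hypothetical additional flags for the rkt fly invocation above,
# giving the kubelet a writable view of the host's /proc:
--volume proc,kind=host,source=/proc \
--mount volume=proc,target=/proc \
```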

Related upstream issue about using the kernel-level RBD features instead of shelling out to the CLI tools: https://github.com/kubernetes/kubernetes/issues/23518

robszumski avatar May 19 '16 22:05 robszumski

the image still needs modprobe though? https://github.com/kubernetes/kubernetes/issues/23924

untoreh avatar Jun 16 '16 10:06 untoreh

I'm using the coreos/hyperkube:v1.8.0_coreos.0 image and have run into the same problem while running the proxy:

time="2017-10-02T14:49:56Z" level=warning msg="Running modprobe ip_vs failed with message: ``, error: exec: \"modprobe\": executable file not found in $PATH"
time="2017-10-02T14:49:56Z" level=error msg="Could not get ipvs family information from the kernel. It is possible that ipvs is not enabled in your kernel. Native loadbalancing will not work until this is fixed."

Are you guys planning to add "modprobe" to the image, or should I just bind mount the host's /sbin/modprobe?

edevil avatar Oct 02 '17 14:10 edevil
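A bind mount along the lines edevil asks about could be passed through kubelet-wrapper's RKT_RUN_ARGS environment variable (which the wrapper forwards to rkt). The paths are illustrative assumptions; modprobe also typically needs the host's /lib/modules to find anything to load:

```shell
# Hypothetical environment for kubelet.service when using kubelet-wrapper:
export RKT_RUN_ARGS="\
  --volume modprobe,kind=host,source=/usr/sbin/modprobe \
  --mount volume=modprobe,target=/usr/sbin/modprobe \
  --volume lib-modules,kind=host,source=/lib/modules \
  --mount volume=lib-modules,target=/lib/modules"
/usr/lib/coreos/kubelet-wrapper --v=2   # plus the usual kubelet flags
```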

I have the same issue: modprobe isn't found in the hyperkube container. I tried to bind mount it by adding it to kubelet.service as a mount, but it didn't help.

zxpower avatar Oct 09 '17 07:10 zxpower

I just ran into another problem related to not having "modprobe" in the image: https://github.com/kubernetes/kubernetes/issues/53396.

I've also tried using kubelet-wrapper with official hyperkube images from gcr.io but I still have that problem. Around v1.8.0-alpha.2 -> v1.8.0-alpha.3 the modprobe binary disappeared...

edevil avatar Oct 09 '17 16:10 edevil