kube-apiserver fails to start when no parent IP is discovered
We use hostname -I to get an IP for the host and then stash it in a parent_ip file. This file is then catted out to provide the --advertise-address for the kube-apiserver when we run it in the rootlesskit NS. If the host has no non-local addresses, this file will be empty, which causes an error when we attempt to start the kube-apiserver service.
https://github.com/rootless-containers/usernetes/blob/7546d83bba882f40d22ccc2d784231b4e140654f/boot/rootlesskit.sh#L19
https://github.com/rootless-containers/usernetes/blob/7546d83bba882f40d22ccc2d784231b4e140654f/boot/rootlesskit.sh#L46
https://github.com/rootless-containers/usernetes/blob/7546d83bba882f40d22ccc2d784231b4e140654f/boot/kube-apiserver.sh#L20
Given that we are allowing the cluster members to speak on 127.0.0.1 in the rootlesskit NS (e.g. kubelet uses a config generated by install.sh which uses that for the master and node IPs), it seems like we should be able to set the --advertise-address to be the slirp tap address (slirp CIDR from install.sh + 100 per the slirp man page) rather than a host IP external to the NS. We could also add --bind-address=0.0.0.0 to ensure the apiserver can accept connections if addressed to the slirp tap address (10.0.42.100) or the cni address (10.88.0.1) (from pods with a service account?).
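As a rough sketch of the idea above, the tap address can be derived from the slirp CIDR and passed to the two flags. The helper names `slirp_tap_ip` and `advertise_flags` are made up for illustration and are not part of usernetes, and the arithmetic assumes an IPv4 network of the form a.b.c.0/24 such as 10.0.42.0/24.

```sh
# Hypothetical helper: derive the slirp tap address from the slirp CIDR by
# replacing the last octet of the network address with 100.
# Assumption: the CIDR looks like a.b.c.0/24 (e.g. 10.0.42.0/24).
slirp_tap_ip() {
    base=${1%/*}                     # strip the prefix length: 10.0.42.0
    printf '%s.100\n' "${base%.*}"   # swap the last octet:     10.0.42.100
}

# Hypothetical helper: print only the two flags discussed above. All other
# kube-apiserver flags are left out of this sketch.
advertise_flags() {
    printf '%s\n' "--advertise-address=$(slirp_tap_ip "$1") --bind-address=0.0.0.0"
}

advertise_flags 10.0.42.0/24
```

This only makes sense when every cluster member lives in the same rootlesskit NS, since the tap address is not reachable from other hosts.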
The parent_ip file is also used to set the --public-ip for flanneld but I'm not familiar enough with how it should work (in general or for u7s) to comment on if that could go away as well. I suspect a similar change to make it use the slirp address might be adequate.
I'm using a hand modified kube-apiserver.sh to advertise 10.0.42.100 and bind 0.0.0.0 and things seem to be working fine for me so far. I'll report back if I notice any odd behaviours. I'm running on top of crio at the moment and the 201808 release bundle.
The parent IP needs to be the host IP for a multi-node cluster.
Would you be able to elaborate on how multi-node works with u7s at the moment, @AkihiroSuda ? I'm using it standalone and probably don't understand enough about the use cases for multi-node to be able to talk about if/how we could fix my issue. Would it be acceptable to add a simple fallback for when hostname -I spits out no IPs to use the slirp IP instead (and print a warning perhaps)?
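To make the fallback idea concrete, here is a minimal sketch for a standalone setup. `pick_parent_ip` and `FALLBACK_IP` are hypothetical names, and 10.0.42.100 is assumed to be the slirp tap address; none of this is taken from the actual boot scripts.

```sh
# Assumed fallback: the slirp tap address used in a standalone setup.
FALLBACK_IP=${FALLBACK_IP:-10.0.42.100}

# Hypothetical helper: take a `hostname -I`-style string, print its first
# address, or print the fallback (with a warning on stderr) when the string
# contains no addresses at all.
pick_parent_ip() {
    # Unquoted on purpose: word splitting discards surrounding whitespace,
    # so an empty or blank string leaves no positional parameters.
    set -- $1
    if [ -n "${1:-}" ]; then
        printf '%s\n' "$1"
    else
        echo "warning: no host IP found, falling back to $FALLBACK_IP" >&2
        printf '%s\n' "$FALLBACK_IP"
    fi
}

# Example use (writes the chosen address to the parent_ip file):
#   pick_parent_ip "$(hostname -I)" > parent_ip
```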
On a multi-node setup, kubelet and kube-proxy need the value of hostname -I of the kube-apiserver. Flannel also needs that for multi-node VXLAN.
> I'm using it standalone
Does this mean your host doesn't have any NIC? (I assume not)
If hostname -I doesn't work on your distro, maybe we should use ip or ifconfig instead to obtain the host IP.
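One possible shape for the `ip`-based variant is sketched below. `first_global_ipv4` is a hypothetical helper that parses the one-line output of `ip -o -4 addr show scope global`; it assumes the usual iproute2 layout where the third field is `inet` and the fourth is `address/prefix`.

```sh
# Hypothetical helper: read `ip -o -4 addr show scope global` output on stdin
# and print the first IPv4 address without its prefix length.
# Prints nothing when there is no global-scope address.
first_global_ipv4() {
    awk '$3 == "inet" { split($4, a, "/"); print a[1]; exit }'
}

# Example use:
#   ip -o -4 addr show scope global | first_global_ipv4
```

Like hostname -I, this prints nothing on a host with no global-scope address, so the caller still has to handle an empty result.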
hostname -I as a utility is fine, but if I'm running without an active network connection it can't get any non-local IPs and spits out an empty line. This happens if I'm playing with u7s while I have my wifi off, for example.
My question about the multi-node stuff was more about how you currently see it being used by people. The only info I could spot for multi-node stuff was in the docker-compose example - presumably those containers need to advertise their public IPs, but is there a sensible use case for using u7s in a multi-node setup on physical hosts? Having only done single-node stuff with the install.sh script, I'm just not familiar enough with what the goals of this project are to comment sensibly. :)
The final goal is to secure prod clusters, which obviously need a multi-node setup.
Okay, good to know. Obviously using the slirpns IP is right out then, that makes much more sense to me now.
I think we could do a few things to improve this then, let me know which ones you think are worthwhile and I can make some PRs for you.
- Quote the subshell expansion where we `cat parent_ip` - this will avoid any issues with argument injection if that file is malformed/modified
- Check the content of `parent_ip` file before we interpolate it for command line arguments. `[ -n "${PARENT_IP}" ] && [[ "${PARENT_IP}" =~ ' ' ]]` should do the trick I think
- Add `U7S_PARENT_IP` to `usernetes/env` when it is created by `install.sh` and document it if necessary - this would mean changing the unset assignment in `boot/rootlesskit.sh` to be a default value expansion like `${U7S_PARENT_IP:-$(hostname -I | ...)}`
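The three suggestions could fit together roughly as follows. This is only a sketch: `valid_parent_ip`, `discover_parent_ip` and `resolve_parent_ip` are hypothetical names, `discover_parent_ip` merely stands in for the existing `hostname -I | ...` pipeline, and the check here rejects empty values and values containing whitespace, which is what the validation is presumably meant to catch.

```sh
# Hypothetical check: succeed only for a non-empty value without whitespace,
# so it cannot expand into extra command line arguments.
valid_parent_ip() {
    case $1 in
        '' | *[[:space:]]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Stand-in for the existing discovery step: first word of `hostname -I`,
# or an empty line when nothing is reported.
discover_parent_ip() {
    set -- $(hostname -I 2>/dev/null)
    printf '%s\n' "${1:-}"
}

# Prefer an explicit U7S_PARENT_IP, otherwise discover one, then validate
# before the value is used anywhere.
resolve_parent_ip() {
    ip=${U7S_PARENT_IP:-$(discover_parent_ip)}
    if valid_parent_ip "$ip"; then
        printf '%s\n' "$ip"
    else
        echo "error: could not determine a usable parent IP" >&2
        return 1
    fi
}

# Example use, with the expansion quoted:
#   kube-apiserver --advertise-address="$(resolve_parent_ip)" ...
```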
Anything else?
Let me close this, as the architecture was changed in "Generation 2": https://github.com/rootless-containers/usernetes/releases/tag/gen2-v20230906.0