kind
[WIP] Make /proc/sys read-only with carve-outs for some sysctls
As mentioned on #3511 this could be a more complete way to ensure systemd or other components don't change sysctls unexpectedly. This also makes sysfs mountable per #3436 (but that is just the mount of sysfs on /kind/private/sys, so can easily be split, aside from any naming preferences).
WIP as I'm not sure it's the best option, but possibly better than fragile breakage due to unexpected sysctl changes.
The downside is it needs an allow list of sysctls which is probably going to need additions for other use cases, but it does mean kind can be explicit about what is supported.
The workaround to add a sysctl as writable would be:
docker exec a-node mount --rbind /kind/private/proc/sys/some-sysctl /proc/sys/some-sysctl
(This doesn't yet support running in some userns configurations, but fixing that should just be a matter of ignoring the error from mount when it fails; whether the mount works depends on the exact userns environment. In a user namespace the host's sysctls can't be modified anyway. I can test userns cases if this option is worth taking further.)
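As a rough sketch of how the workaround above could be applied to several sysctls at once, the helper below only prints the mount commands rather than running them. The function name carve_out_cmds and the example sysctl net.ipv4.ip_forward are illustrative assumptions, not part of this PR; it also assumes the dotted sysctl name maps to a path by replacing dots with slashes, which does not hold for names containing literal dots (such as some interface names).

```shell
# Hypothetical helper (not part of kind): print one "docker exec ... mount --rbind"
# command per sysctl, following the workaround shown above.
# Usage: carve_out_cmds NODE SYSCTL...
carve_out_cmds() {
  node="$1"
  shift
  for sysctl in "$@"; do
    # Assumption: dotted sysctl names map to /proc/sys paths by swapping "." for "/".
    path=$(printf '%s' "$sysctl" | tr . /)
    printf 'docker exec %s mount --rbind /kind/private/proc/sys/%s /proc/sys/%s\n' \
      "$node" "$path" "$path"
  done
}

# Dry run: prints the command for review instead of executing it.
carve_out_cmds a-node net.ipv4.ip_forward
```

Printing the commands keeps the sketch safe to run anywhere; piping the output to sh would actually apply the mounts on a node built from this branch.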
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: dgl
Once this PR has been reviewed and has the lgtm label, please assign aojea for approval. For more information see the Kubernetes Code Review Process.
The full list of commands accepted by this bot can be found here.
Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment
Hi @dgl. Thanks for your PR.
I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.
Once the patch is verified, the new status will be reflected by the ok-to-test label.
I understand the commands that are listed here.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
ensure systemd or other components don't change sysctls unexpectedly
Rootless mode ( https://kind.sigs.k8s.io/docs/user/rootless/ ) almost solves this issue.
As mentioned on https://github.com/kubernetes-sigs/kind/pull/3511 this could be a more complete way to ensure systemd or other components don't change sysctls unexpectedly. This also makes sysfs mountable per https://github.com/kubernetes-sigs/kind/issues/3436 (but that is just the mount of sysfs on /kind/private/sys, so can easily be split, aside from any naming preferences).
I'm really hesitant to ship a change like this because it's hard to say how we'll break users who have come to rely on this over the years. Disabling something like udev/binfmt_misc, on the other hand, is cheap and reasonable, at the risk of missing some future systemd behavior.