Remove explicit sysctl fs.inotify.max_user_watches setting
Since Linux 5.11-rc1, fs.inotify.max_user_watches is computed dynamically from the addressable physical memory, up to a maximum of 1048576: https://github.com/torvalds/linux/commit/92890123749bafc317bbfacbe0a62ce08d78efb7
I suggest removing the current explicit setting, which pins a lower maximum value, in favor of the kernel's dynamically computed default. That default can also save memory on smaller nodes that don't need a high value.
Back-of-the-envelope math based on the commit linked above suggests that on a 64-bit host with 64 GB of memory, fs.inotify.max_user_watches would end up at roughly the currently hard-coded 524288.
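For readers who want to redo that estimate, here is a rough sketch of the computation as I understand it from the linked commit: about 1% of addressable memory divided by a per-watch cost, clamped to the range [8192, 1048576]. The function name and the default per-watch cost of 1280 bytes are assumptions made for illustration; the real cost depends on kernel structure sizes and varies between builds, and the kernel works in pages rather than bytes.

```python
# Rough estimate of the kernel's dynamic default for
# fs.inotify.max_user_watches (Linux >= 5.11, per the linked commit).
# ASSUMPTION: watch_cost_bytes=1280 is an illustrative value for a
# 64-bit kernel, not a number taken from this PR.

MIN_WATCHES = 8192       # lower clamp
MAX_WATCHES = 1048576    # upper clamp


def estimate_max_user_watches(addressable_bytes: int,
                              watch_cost_bytes: int = 1280) -> int:
    """Return an approximate default for a host with the given memory."""
    budget = addressable_bytes // 100          # about 1% of memory
    watches = budget // watch_cost_bytes       # watches that fit in it
    return max(MIN_WATCHES, min(watches, MAX_WATCHES))


if __name__ == "__main__":
    gib = 1024 ** 3
    for size in (1, 4, 16, 64, 128):
        print(f"{size:>4} GiB -> {estimate_max_user_watches(size * gib)}")
```

Under these assumptions a 64 GiB host lands in the same neighborhood as 524288, small nodes get a much lower value, and very large nodes are capped at 1048576.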
The committers listed above are authorized under a signed CLA.
- :white_check_mark: login: ajoga / name: Aurélien Joga (e6aa1e746fda0d05fc8c9db8bfd73af8f7ffd19b)
[APPROVALNOTIFIER] This PR is NOT APPROVED
Once this PR has been reviewed and has the lgtm label, please assign olemarkus for approval.
Welcome @ajoga!
It looks like this is your first PR to kubernetes/kops 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.
Hi @ajoga. Thanks for your PR.
I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.
Once the patch is verified, the new status will be reflected by the ok-to-test label.
I'll sign the CLA later; I do not have access to the device I need right now.
/ok-to-test
I understand there's a change to be made in upstream fsnotify, not here. I've opened a PR there (https://github.com/fsnotify/fsnotify/pull/708); I'll see the outcome and update this PR accordingly.
Well, in fact, no: I can just do both at the same time. I took out the changes I committed here that belong to fsnotify, let's see.
/retest
/test pull-kops-aws-distro-al2023 /test pull-kops-aws-distro-rhel9
/test pull-kops-aws-distro-rhel9
/test pull-kops-aws-distro-rhel9
/retest I think we can skip the failing rhel9 test.
kOps has support for distros with pretty old kernels. Any idea how far back this change was backported? For example, is it part of RHEL 8, 9 and Amazon Linux 2, 2023? I don't think I am that worried about Ubuntu, Debian, Flatcar, COS.
Mh, this isn't a concern I anticipated, good point.
I'm not sure where to look for reliable information on RHEL, as I do not have access to their subscription-walled resources. However, it seems the change was backported into the CentOS Stream 8 and 9 kernels prior to their deprecation:
- https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-8/-/blob/c8s/fs/notify/inotify/inotify_user.c?ref_type=heads#L819
- https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/blob/main/fs/notify/inotify/inotify_user.c?ref_type=heads#L839
- Amazon Linux 2 version 2.0.20211223.0, 2.0.20250201 -> kernel-4.14 -> NO
- Amazon Linux 2 version 2.0.20250808 -> kernel-5.10.240-238.959 -> yes
- Amazon Linux 1 is EOL so didn't check https://docs.aws.amazon.com/linux/al2/ug/compare-with-al1.html
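The findings above show that a plain kernel version comparison is not enough, because some distro kernels older than 5.11 carry the backport. A minimal sketch of how such a check could be expressed follows. The function names and the idea of passing an explicit set of known backported releases are hypothetical, made up for illustration; this is not how kOps detects anything today.

```python
import re

# Mainline release that introduced the dynamic default.
MAINLINE_THRESHOLD = (5, 11)


def parse_kernel_release(release: str) -> tuple:
    """Extract (major, minor) from a string such as '5.10.240-238.959'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def has_dynamic_default(release: str, backported=frozenset()) -> bool:
    """Guess whether a kernel computes max_user_watches dynamically.

    `backported` is a HYPOTHETICAL, manually maintained set of exact
    release strings known to carry the backport.
    """
    if release in backported:
        return True
    return parse_kernel_release(release) >= MAINLINE_THRESHOLD


if __name__ == "__main__":
    known = frozenset({"5.10.240-238.959"})  # observed on Amazon Linux 2 above
    for rel in ("4.14.0", "5.10.240-238.959", "6.1.0"):
        print(rel, has_dynamic_default(rel, known))
```

A distro-specific exception would then keep the explicit sysctl only where this kind of check returns False.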
I don't think I am that worried about Ubuntu, Debian, Flatcar, COS.
Do you want me to look into this too or are you saying we don't care?
I'd hate to be the source of a backward breakage, and it may be sensible to not do this change at this time, so feel free to close the PR if you see it that way too.
FWIW, you may be able to find the kernel versions for RHEL releases here: https://access.redhat.com/articles/3078.
@ajoga I think the change is good, we just need to add an exception for the older distros. Should be pretty easy, but would require a little research. Would that be ok for you?
I have zero Go skills and no time to dig into this, so I have to decline, I'm sorry.
(FYI changes in the documentation strings at https://github.com/fsnotify/fsnotify/pull/708 were adjusted by a maintainer & merged)
Thanks for all the effort @ajoga, I will take it from here.
The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.
This bot triages PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the PR is closed
You can:
- Mark this PR as fresh with /remove-lifecycle stale
- Close this PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale /assign