node-feature-discovery
node-feature-discovery copied to clipboard
Re-organize security related features
What would you like to be added:
We start to have a lot of somewhat cluttered features/labels related to security and trusted-execution-environments (e.g. #647, #790, #830) and expecting more to come. I feel it would be good to group all of these more closely together, e.g. under a separate domain (say security) or under a single feature (say cpu.security).
From the users point of view this would mean renaming some of the existing labels but I think we could manage that by deprecation over a few releases.
With a separate security feature source (or domain) we could have e.g.
| Old label | New label |
|---|---|
| cpu-sgx.enabled | security-cpu.sgx.enabled |
| cpu-se.enabled | security-cpu.se.enabled |
| kernel-selinux.enabled | security-kernel.selinux.enabled |
With just a security feature under the cpu source we could have something like:
| Old label | New label |
|---|---|
| cpu-sgx.enabled | cpu-security.sgx.enabled |
| cpu-se.enabled | cpu-security.se.enabled |
Why is this needed:
Better organization in terms of code and (hopefully) user interface and documentation.
@marquiz, on the kernel side what would be part of the new security feature? SELinux and AppArmor?
Maybe I'm not actually seeing the bigger picture here, but it seems that with a security feature under cpu things would be cleaner. But then I also see this growing to having a security feature under each component that it may apply, which could only be cpu and kernel?
on the kernel side what would be part of the new
securityfeature? SELinux and AppArmor?
Currently we only have selinux but I think apparmor would be obvious possible future addition.
Maybe I'm not actually seeing the bigger picture here
Buaha, I think I'm a bit lost here too 😸
but it seems that with a security feature under cpu things would be cleaner. But then I also see this growing to having a security feature under each component that it may apply, which could only be cpu and kernel?
Yup, now after sleeping over it I'm inclined to agree with you. It would seem cleaner and more natural to have security feature under each "domain". I think I'll draft a PR for comments.
I dropped a quick PR for comments #833 re-organizing the CPU security features. This version does not introduce a new feature source but consolidates all cpu security features under cpu-security.*.
ping @zvonkok @Jakob-Naucke @fidencio @mythi PTAL
I dropped a quick PR for comments #833 re-organizing the CPU security features. This version does not introduce a new feature source but consolidates all cpu security features under
cpu-security.*.ping @zvonkok @Jakob-Naucke @fidencio @mythi PTAL
LGTM. Since we are renaming things, one thing to maybe pay attention. Some the TEEs are virtual machines so we need to make sure there's no ambiguity between "the bare metal worker can run TEE VMs" (the kata use case) and "the worker node is a TEE VM".
Why "cpu-security" instead of "cpu.security"? If it's a subdomain, I think it should be separated by a dot...
Why "cpu-security" instead of "cpu.security"? If it's a subdomain, I think it should be separated by a dot...
Domain might be a bit misleading here as it's the name of the "feature source" (which are arranged somewhat one per "hw or system domain"). Using dash here dates back to the early days of nfd. It might be a bit confusing that in NodeFeatureRules we use dots (like feature: cpu.cpuid) but I don't see changing this would be needed. Or if it is it's a separate issue 😄
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity,
lifecycle/staleis applied - After 30d of inactivity since
lifecycle/stalewas applied,lifecycle/rottenis applied - After 30d of inactivity since
lifecycle/rottenwas applied, the issue is closed
You can:
- Mark this issue or PR as fresh with
/remove-lifecycle stale - Mark this issue or PR as rotten with
/lifecycle rotten - Close this issue or PR with
/close - Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale
Moving milestone to v0.14.0. We could possibly drop the deprecated cpu-sgx.enabled and cpu-se.enabled there.
I haven't really thought should we do smth about the kernel-selinux.enabled label 🤔 One option is just to leave it as is and start using kernel-security.* if new kernel security related features are added. OTOH creating kernel-security.selinux.enabled could be something for @VillePihlava to work on 😇 It's pretty small change in terms of diffstat anyway
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity,
lifecycle/staleis applied - After 30d of inactivity since
lifecycle/stalewas applied,lifecycle/rottenis applied - After 30d of inactivity since
lifecycle/rottenwas applied, the issue is closed
You can:
- Mark this issue as fresh with
/remove-lifecycle stale - Close this issue with
/close - Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
What do you think about dropping the deprecated cpu-sgx.enabled and cpu-se.enabled in the upcoming v.014 release @mythi @fidencio @Jakob-Naucke? No problem of moving this over to v0.15
/remove-lifecycle stale
I've already moved to the new one for SGX so OK for me:
https://github.com/intel/intel-device-plugins-for-kubernetes/blob/5e597d2f68120724b062da83be1f8a5e4dbf1748/deployments/nfd/overlays/node-feature-rules/node-feature-rules.yaml#L96-L102
Moving to v0.15 /milestone v0.15
We probably should spell out a deprecation policy in addition to "supported versions policy" (#1131)
Ref #1249