node-feature-discovery Re-organize security related features

What would you like to be added:

We start to have a lot of somewhat cluttered features/labels related to security and trusted-execution-environments (e.g. #647, #790, #830) and expecting more to come. I feel it would be good to group all of these more closely together, e.g. under a separate domain (say security) or under a single feature (say cpu.security).

From the users point of view this would mean renaming some of the existing labels but I think we could manage that by deprecation over a few releases.

With a separate security feature source (or domain) we could have e.g.

Old label	New label
cpu-sgx.enabled	security-cpu.sgx.enabled
cpu-se.enabled	security-cpu.se.enabled
kernel-selinux.enabled	security-kernel.selinux.enabled

With just a security feature under the cpu source we could have something like:

Old label	New label
cpu-sgx.enabled	cpu-security.sgx.enabled
cpu-se.enabled	cpu-security.se.enabled

Why is this needed:

Better organization in terms of code and (hopefully) user interface and documentation.

Jun 27 '22 13:06 marquiz

@marquiz, on the kernel side what would be part of the new security feature? SELinux and AppArmor?

Maybe I'm not actually seeing the bigger picture here, but it seems that with a security feature under cpu things would be cleaner. But then I also see this growing to having a security feature under each component that it may apply, which could only be cpu and kernel?

Jun 27 '22 17:06 fidencio

on the kernel side what would be part of the new security feature? SELinux and AppArmor?

Currently we only have selinux but I think apparmor would be obvious possible future addition.

Maybe I'm not actually seeing the bigger picture here

Buaha, I think I'm a bit lost here too 😸

but it seems that with a security feature under cpu things would be cleaner. But then I also see this growing to having a security feature under each component that it may apply, which could only be cpu and kernel?

Yup, now after sleeping over it I'm inclined to agree with you. It would seem cleaner and more natural to have security feature under each "domain". I think I'll draft a PR for comments.

Jun 28 '22 06:06 marquiz

I dropped a quick PR for comments #833 re-organizing the CPU security features. This version does not introduce a new feature source but consolidates all cpu security features under cpu-security.*.

ping @zvonkok @Jakob-Naucke @fidencio @mythi PTAL

Jun 28 '22 10:06 marquiz

I dropped a quick PR for comments #833 re-organizing the CPU security features. This version does not introduce a new feature source but consolidates all cpu security features under cpu-security.*.

ping @zvonkok @Jakob-Naucke @fidencio @mythi PTAL

LGTM. Since we are renaming things, one thing to maybe pay attention. Some the TEEs are virtual machines so we need to make sure there's no ambiguity between "the bare metal worker can run TEE VMs" (the kata use case) and "the worker node is a TEE VM".

Jul 21 '22 08:07 mythi

Why "cpu-security" instead of "cpu.security"? If it's a subdomain, I think it should be separated by a dot...

Aug 24 '22 10:08 eero-t

Why "cpu-security" instead of "cpu.security"? If it's a subdomain, I think it should be separated by a dot...

Domain might be a bit misleading here as it's the name of the "feature source" (which are arranged somewhat one per "hw or system domain"). Using dash here dates back to the early days of nfd. It might be a bit confusing that in NodeFeatureRules we use dots (like feature: cpu.cpuid) but I don't see changing this would be needed. Or if it is it's a separate issue 😄

Aug 25 '22 06:08 marquiz

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Mark this issue or PR as fresh with /remove-lifecycle stale
Mark this issue or PR as rotten with /lifecycle rotten
Close this issue or PR with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

Nov 23 '22 06:11 k8s-triage-robot

/remove-lifecycle stale

Nov 23 '22 07:11 marquiz

Moving milestone to v0.14.0. We could possibly drop the deprecated cpu-sgx.enabled and cpu-se.enabled there.

I haven't really thought should we do smth about the kernel-selinux.enabled label 🤔 One option is just to leave it as is and start using kernel-security.* if new kernel security related features are added. OTOH creating kernel-security.selinux.enabled could be something for @VillePihlava to work on 😇 It's pretty small change in terms of diffstat anyway

Feb 10 '23 08:02 marquiz

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Mark this issue as fresh with /remove-lifecycle stale
Close this issue with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

May 11 '23 08:05 k8s-triage-robot

What do you think about dropping the deprecated cpu-sgx.enabled and cpu-se.enabled in the upcoming v.014 release @mythi @fidencio @Jakob-Naucke? No problem of moving this over to v0.15

/remove-lifecycle stale

Jun 02 '23 13:06 marquiz

I've already moved to the new one for SGX so OK for me:

https://github.com/intel/intel-device-plugins-for-kubernetes/blob/5e597d2f68120724b062da83be1f8a5e4dbf1748/deployments/nfd/overlays/node-feature-rules/node-feature-rules.yaml#L96-L102

Jun 02 '23 13:06 mythi

Moving to v0.15 /milestone v0.15

We probably should spell out a deprecation policy in addition to "supported versions policy" (#1131)

Jun 22 '23 12:06 marquiz

Ref #1249

Jun 22 '23 12:06 marquiz

node-feature-discovery node-feature-discovery copied to clipboard

Re-organize security related features

node-feature-discovery
node-feature-discovery copied to clipboard