node-feature-discovery icon indicating copy to clipboard operation
node-feature-discovery copied to clipboard

Re-organize security related features

Open marquiz opened this issue 3 years ago • 6 comments

What would you like to be added:

We start to have a lot of somewhat cluttered features/labels related to security and trusted-execution-environments (e.g. #647, #790, #830) and expecting more to come. I feel it would be good to group all of these more closely together, e.g. under a separate domain (say security) or under a single feature (say cpu.security).

From the users point of view this would mean renaming some of the existing labels but I think we could manage that by deprecation over a few releases.

With a separate security feature source (or domain) we could have e.g.

Old label New label
cpu-sgx.enabled security-cpu.sgx.enabled
cpu-se.enabled security-cpu.se.enabled
kernel-selinux.enabled security-kernel.selinux.enabled

With just a security feature under the cpu source we could have something like:

Old label New label
cpu-sgx.enabled cpu-security.sgx.enabled
cpu-se.enabled cpu-security.se.enabled

Why is this needed:

Better organization in terms of code and (hopefully) user interface and documentation.

marquiz avatar Jun 27 '22 13:06 marquiz

@marquiz, on the kernel side what would be part of the new security feature? SELinux and AppArmor?

Maybe I'm not actually seeing the bigger picture here, but it seems that with a security feature under cpu things would be cleaner. But then I also see this growing to having a security feature under each component that it may apply, which could only be cpu and kernel?

fidencio avatar Jun 27 '22 17:06 fidencio

on the kernel side what would be part of the new security feature? SELinux and AppArmor?

Currently we only have selinux but I think apparmor would be obvious possible future addition.

Maybe I'm not actually seeing the bigger picture here

Buaha, I think I'm a bit lost here too 😸

but it seems that with a security feature under cpu things would be cleaner. But then I also see this growing to having a security feature under each component that it may apply, which could only be cpu and kernel?

Yup, now after sleeping over it I'm inclined to agree with you. It would seem cleaner and more natural to have security feature under each "domain". I think I'll draft a PR for comments.

marquiz avatar Jun 28 '22 06:06 marquiz

I dropped a quick PR for comments #833 re-organizing the CPU security features. This version does not introduce a new feature source but consolidates all cpu security features under cpu-security.*.

ping @zvonkok @Jakob-Naucke @fidencio @mythi PTAL

marquiz avatar Jun 28 '22 10:06 marquiz

I dropped a quick PR for comments #833 re-organizing the CPU security features. This version does not introduce a new feature source but consolidates all cpu security features under cpu-security.*.

ping @zvonkok @Jakob-Naucke @fidencio @mythi PTAL

LGTM. Since we are renaming things, one thing to maybe pay attention. Some the TEEs are virtual machines so we need to make sure there's no ambiguity between "the bare metal worker can run TEE VMs" (the kata use case) and "the worker node is a TEE VM".

mythi avatar Jul 21 '22 08:07 mythi

Why "cpu-security" instead of "cpu.security"? If it's a subdomain, I think it should be separated by a dot...

eero-t avatar Aug 24 '22 10:08 eero-t

Why "cpu-security" instead of "cpu.security"? If it's a subdomain, I think it should be separated by a dot...

Domain might be a bit misleading here as it's the name of the "feature source" (which are arranged somewhat one per "hw or system domain"). Using dash here dates back to the early days of nfd. It might be a bit confusing that in NodeFeatureRules we use dots (like feature: cpu.cpuid) but I don't see changing this would be needed. Or if it is it's a separate issue 😄

marquiz avatar Aug 25 '22 06:08 marquiz

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Nov 23 '22 06:11 k8s-triage-robot

/remove-lifecycle stale

marquiz avatar Nov 23 '22 07:11 marquiz

Moving milestone to v0.14.0. We could possibly drop the deprecated cpu-sgx.enabled and cpu-se.enabled there.

I haven't really thought should we do smth about the kernel-selinux.enabled label 🤔 One option is just to leave it as is and start using kernel-security.* if new kernel security related features are added. OTOH creating kernel-security.selinux.enabled could be something for @VillePihlava to work on 😇 It's pretty small change in terms of diffstat anyway

marquiz avatar Feb 10 '23 08:02 marquiz

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar May 11 '23 08:05 k8s-triage-robot

What do you think about dropping the deprecated cpu-sgx.enabled and cpu-se.enabled in the upcoming v.014 release @mythi @fidencio @Jakob-Naucke? No problem of moving this over to v0.15

/remove-lifecycle stale

marquiz avatar Jun 02 '23 13:06 marquiz

I've already moved to the new one for SGX so OK for me:

https://github.com/intel/intel-device-plugins-for-kubernetes/blob/5e597d2f68120724b062da83be1f8a5e4dbf1748/deployments/nfd/overlays/node-feature-rules/node-feature-rules.yaml#L96-L102

mythi avatar Jun 02 '23 13:06 mythi

Moving to v0.15 /milestone v0.15

We probably should spell out a deprecation policy in addition to "supported versions policy" (#1131)

marquiz avatar Jun 22 '23 12:06 marquiz

Ref #1249

marquiz avatar Jun 22 '23 12:06 marquiz