Default ownerReference for NodeFeature should be Node

Open ahmetb opened this issue 10 months ago • 3 comments

What would you like to be added:

I think the default, out-of-the-box behavior when NFD is installed should be that NodeFeature CRs have their owner reference set to the v1.Node object.

Why is this needed:

The rationale is summarized at https://ahmet.im/blog/nfd-incident/. In short, every other alternative is worse:

  • Owner is DaemonSet Pod (current default): Your node labels get cleared during a rolling update of the DaemonSet. That's a no-go for many installations that want to guarantee the labels are always there.
  • No owner (configured via a CLI flag): You'll leak NodeFeature CRs (though nfd-gc could clean these up during periodic resyncs if it has that logic).

Parenting to v1.Node has the following advantages:

  • The NodeFeature resource doesn't get unexpectedly deleted by the controller (causing incidents like the one linked above); its lifespan is tied to the Node itself.
  • It eliminates the need for nfd-gc, since the Kubernetes garbage collector in kube-controller-manager would handle the removal.

I can't think of any downsides to having a single ownerReference set to the Node object.

ahmetb, Feb 03 '25 22:02

/assign @ozhuraki

ozhuraki, Feb 04 '25 18:02

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot, May 05 '25 19:05

/lifecycle frozen

ahmetb, May 05 '25 20:05