Default ownerReference for NodeFeature should be Node
What would you like to be added:
I think the default/out-of-box behavior when NFD is installed should be that the NodeFeature CRs should have their owner reference set to v1.Node object.
Why is this needed:
The rationale is basically summarized at https://ahmet.im/blog/nfd-incident/. Basically, any other alternative is worse:
- Owner is DaemonSet Pod (current default): Means your node labels are gonna get cleared during a rolling update of daemonset. No-go for a lot of installations that want to guarantee labels will always be there.
- No owner (configured via a CLI flag): Means you'll leak NodeFeature CRs (though the controller can totally clean these up during periodic resyncs if it has that logic in nfd-gc).
Parenting to v1.Node has the following advantages:
- NodeFeature resource doesn't get randomly deleted by the controller (and cause incidents like the one linked above) and its lifespan is now tied to Node itself.
- Eliminates the need for nfd-gc as Kubernetes garbage collector in kube-controller-manager would now handle the removal.
I can't think of any downsides to having a single ownerReference set to the Node object.
/assign @ozhuraki
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity,
lifecycle/staleis applied - After 30d of inactivity since
lifecycle/stalewas applied,lifecycle/rottenis applied - After 30d of inactivity since
lifecycle/rottenwas applied, the issue is closed
You can:
- Mark this issue as fresh with
/remove-lifecycle stale - Close this issue with
/close - Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/lifecycle frozen