api icon indicating copy to clipboard operation
api copied to clipboard

OCPNODE-2400: Add Conditions field to the config.node object's status

Open sairameshv opened this issue 1 year ago • 13 comments

Based on the discussion here it is necessary to add a note about the deprecation of the Cgroupsv1 usage via the nodes.config object's status field.

Hence added the "Conditions" field to the config.node object's status.

/cc @harche @rphillips

sairameshv avatar Jul 01 '24 16:07 sairameshv

@sairameshv: This pull request references OCPNODE-2400 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.17.0" version, but no target version was set.

In response to this:

Based on the discussion here it is necessary to add a note about the deprecation of the Cgroupsv1 functionality via the nodes.config object's status field.

Hence added the "Conditions" field to the config.node object's status.

/cc @harche @rphillips

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

openshift-ci-robot avatar Jul 01 '24 16:07 openshift-ci-robot

Hello @sairameshv! Some important instructions when contributing to openshift/api: API design plays an important part in the user experience of OpenShift and as such API PRs are subject to a high level of scrutiny to ensure they follow our best practices. If you haven't already done so, please review the OpenShift API Conventions and ensure that your proposed changes are compliant. Following these conventions will help expedite the api review process for your PR.

openshift-ci[bot] avatar Jul 01 '24 16:07 openshift-ci[bot]

@sairameshv: This pull request references OCPNODE-2400 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.17.0" version, but no target version was set.

In response to this:

Based on the discussion here it is necessary to add a note about the deprecation of the Cgroupsv1 usage via the nodes.config object's status field.

Hence added the "Conditions" field to the config.node object's status.

/cc @harche @rphillips

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

openshift-ci-robot avatar Jul 01 '24 16:07 openshift-ci-robot

/retest

sairameshv avatar Jul 02 '24 01:07 sairameshv

/cc @deads2k

harche avatar Jul 02 '24 13:07 harche

I was wondering instead of using Conditions directly, should have something like NodeConditions (like we tried this once with workerLatencyProfiles)?

/cc @deads2k @rphillips

harche avatar Jul 02 '24 13:07 harche

I was wondering instead of using Conditions directly, should have something like NodeConditions (like we tried this once with workerLatencyProfiles)?

/cc @deads2k @rphillips

We have the field NodeStatus already and the Conditions is a generic term. Do we have to still rename it to NodeConditions ?

sairameshv avatar Jul 02 '24 14:07 sairameshv

In case of worker latency profiles, we thought of adding a latency profile specific status in NodeStatus. Since nodes.config object deals with various node related features, I thought maybe if we have a status dedicated for each feature it might a good place to keep everything organized.

In this case you are adding NodeStatus to show cgroup v1 deprecation notice, so many be a simple condition is good enough (although you can still have CgroupDeprecationStatus) but for more complicated features (e.g. WorkerLatencyProfiles) having a dedicated status for each feature makes sense to me.

harche avatar Jul 02 '24 14:07 harche

I think if we find utility in node conditions generally, having a generic one inside the node status makes sense. if we have a cgroup specific one, then we'd probably deprecate the field when cgroupv1 was finally dropped. We wouldn't need to if there are uses for the generic field

haircommander avatar Jul 24 '24 13:07 haircommander

if we have a cgroup specific one, then we'd probably deprecate the field when cgroupv1 was finally dropped

This will unlikely to get deprecated. Not sure why would we want to deprecate this node status field that warns the user of a deprecated feature (cgroup v1). We would always want to warn user if they ever find themselves using cgroup v1, more so if the cgroup v1 is deprecated as a feature.

harche avatar Jul 24 '24 13:07 harche

fair enough. I still think generic conditions could be useful

haircommander avatar Jul 24 '24 14:07 haircommander

/retest

sairameshv avatar Aug 01 '24 18:08 sairameshv

The config's node object is a singleton for the entire cluster. Does it make sense to add a condition here for N nodes? A better spot would be on the K8S Node object.

rphillips avatar Aug 01 '24 21:08 rphillips

@rphillips OCP is deprecating cgroup v1 support, upstream k8s is just going to keep it in maintenance mode. This PR is to allow us display deprecation notice if the OCP cluster is set to use cgroup v1. Also in case of OCP, our cgroup version settings are cluster-wide (managed via config.node object), so it makes sense to update the status of the object driving the cgroup version change.

harche avatar Aug 02 '24 13:08 harche

/lgtm

rphillips avatar Aug 08 '24 22:08 rphillips

@sairameshv Changes look ok, we are generally asking people to use feature gates for new features though, are you able to gate this addition? What's the implementation state on the operator side look like right now?

JoelSpeed avatar Aug 14 '24 12:08 JoelSpeed

@sairameshv Changes look ok, we are generally asking people to use feature gates for new features though, are you able to gate this addition? What's the implementation state on the operator side look like right now?

@JoelSpeed As mentioned in the description, we intend to add a note/message in the config node object's status field about the deprecation of the usage of cgroupsv1 mode if the cluster is using cgroupsv1. There is no functionality change and hence I don't think a featuregate is needed here. Do you think otherwise?

cc: @harche

sairameshv avatar Aug 14 '24 13:08 sairameshv

/lgtm

I'm ok with that explanation

JoelSpeed avatar Aug 14 '24 13:08 JoelSpeed

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: JoelSpeed, rphillips, sairameshv

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment Approvers can cancel approval by writing /approve cancel in a comment

openshift-ci[bot] avatar Aug 14 '24 13:08 openshift-ci[bot]

/retest-required

Remaining retests: 0 against base HEAD b3ee44d52653ae79180d516f5f59a90d29c52786 and 2 for PR HEAD 6f0975a2eb0298370c9b483b7588fbd564e86f53 in total

openshift-ci-robot avatar Aug 14 '24 13:08 openshift-ci-robot

/retest-required

Remaining retests: 0 against base HEAD 3f284e60eb8ca3e3aeb61ec5e388665876c15f3a and 1 for PR HEAD 6f0975a2eb0298370c9b483b7588fbd564e86f53 in total

openshift-ci-robot avatar Aug 14 '24 15:08 openshift-ci-robot

/retest

sairameshv avatar Aug 15 '24 01:08 sairameshv

/retest

sairameshv avatar Aug 15 '24 04:08 sairameshv

/override ci/prow/e2e-upgrade-minor

Perma broken right now, am working on a fix

JoelSpeed avatar Aug 15 '24 07:08 JoelSpeed

@JoelSpeed: Overrode contexts on behalf of JoelSpeed: ci/prow/e2e-upgrade-minor

In response to this:

/override ci/prow/e2e-upgrade-minor

Perma broken right now, am working on a fix

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

openshift-ci[bot] avatar Aug 15 '24 07:08 openshift-ci[bot]

/retest-required

Remaining retests: 0 against base HEAD b1f8d00b581844cb041faccccb5172cac3e299b4 and 0 for PR HEAD 6f0975a2eb0298370c9b483b7588fbd564e86f53 in total

openshift-ci-robot avatar Aug 15 '24 17:08 openshift-ci-robot

/hold

Revision 6f0975a2eb0298370c9b483b7588fbd564e86f53 was retested 3 times: holding

openshift-ci-robot avatar Aug 15 '24 21:08 openshift-ci-robot

/retest-required

JoelSpeed avatar Aug 16 '24 09:08 JoelSpeed

I don't know if this hold label gets removed by the bot, @JoelSpeed could you help in getting this merged?

sairameshv avatar Aug 19 '24 12:08 sairameshv

/hold cancel

JoelSpeed avatar Aug 21 '24 16:08 JoelSpeed