
OpenShift: Support Cluster Operator Info Gathering

stratus-ss opened this issue 10 months ago • 3 comments

Good day,

I opened an issue with kubernetes.core requesting a feature that processes cluster operators in order to determine the health of a cluster.

They suggested the possibility of introducing that feature here. However, I think it would be almost a reimplementation of k8s_info. I have a POC against kubernetes.core in my fork, without any integration tests.
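For context, the health check such a feature performs reduces to inspecting `status.conditions` on each ClusterOperator resource. A minimal sketch of that logic in Python (the function name is mine, not part of the POC; the `Available`/`Progressing`/`Degraded` condition types are the standard ones on the OpenShift ClusterOperator API):

```python
def clusteroperator_healthy(operator):
    """Return True when a ClusterOperator resource dict reports a healthy state.

    By convention an operator is healthy when it is Available, not
    Progressing, and not Degraded.
    """
    # Map condition type -> status string ("True"/"False"/"Unknown").
    conditions = {
        c.get("type"): c.get("status")
        for c in operator.get("status", {}).get("conditions", [])
    }
    return (
        conditions.get("Available") == "True"
        and conditions.get("Progressing") != "True"
        and conditions.get("Degraded") != "True"
    )
```

A missing `Available` condition counts as unhealthy here, which errs on the side of flagging incomplete status reports.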

I am willing to take on the work to do the PR, but I will need some guidance as to how best to implement that here.

Thanks

stratus-ss avatar Feb 03 '25 21:02 stratus-ss

We had a similar issue with the raw k8s module, our solution was to put some of the meatier functionality into functions in kubernetes.k8s and import/use them in openshift.k8s (https://github.com/openshift/community.okd/blob/main/plugins/module_utils/k8s.py#L16). Unfortunately there's still plenty of duplication. You could definitely just pull in the file, or you could make a more specific clusteroperator module. With the limited logic in the k8s_info module it may make more sense to just copy it rather than trying to find a cleaner way around that.

fabianvf avatar Feb 04 '25 16:02 fabianvf

I want to revisit this.

The k8s module can wait on a general "Cluster Operators healthy" statement, but it doesn't tell you the status of the individual operators. If the wait is exceeded, you still need to know which ones are failing.

I am happy to work through this, but it comes back to the same question: do we re-implement k8s_info here?
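To illustrate the reporting side: once a list of ClusterOperator resources comes back (e.g. from k8s_info with `kind: ClusterOperator`), a module or playbook could distill which operators are unhealthy and why. A hedged sketch under that assumption (the function and its shape are mine, not an existing API):

```python
def failing_clusteroperators(operators):
    """Given a list of ClusterOperator resource dicts, return a dict mapping
    each unhealthy operator's name to the condition messages explaining why.

    An operator is flagged when it is not Available or is Degraded.
    """
    failing = {}
    for op in operators:
        name = op.get("metadata", {}).get("name", "<unknown>")
        reasons = []
        for cond in op.get("status", {}).get("conditions", []):
            ctype, status = cond.get("type"), cond.get("status")
            # Not Available, or actively Degraded, is worth surfacing.
            if (ctype == "Available" and status != "True") or (
                ctype == "Degraded" and status == "True"
            ):
                reasons.append(f"{ctype}={status}: {cond.get('message', '')}")
        if reasons:
            failing[name] = reasons
    return failing
```

Returning the condition messages alongside the names is the part a blanket wait can't give you: it tells the caller not just that the timeout fired, but which operators to go look at.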

stratus-ss avatar Feb 13 '25 01:02 stratus-ss

@fabianvf still interested in this

stratus-ss avatar Mar 25 '25 13:03 stratus-ss

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close. Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

openshift-bot avatar Jun 24 '25 01:06 openshift-bot

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity. Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten /remove-lifecycle stale

openshift-bot avatar Jul 24 '25 08:07 openshift-bot

I'm still interested in pushing this forward

stratus-ss avatar Jul 28 '25 14:07 stratus-ss

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen. Mark the issue as fresh by commenting /remove-lifecycle rotten. Exclude this issue from closing again by commenting /lifecycle frozen.

/close

openshift-bot avatar Aug 28 '25 00:08 openshift-bot

@openshift-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen. Mark the issue as fresh by commenting /remove-lifecycle rotten. Exclude this issue from closing again by commenting /lifecycle frozen.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

openshift-ci[bot] avatar Aug 28 '25 00:08 openshift-ci[bot]