MGMT-17826: Disable LocalClusterImportController for non ACM/MCE
When installed outside of an ACM environment, the Infrastructure Operator has a panic. This is because the ManagedCluster CRD, which is required by the LocalClusterImportController feature is not present.
This PR checks for the presence of that CRD, if the CRD is not found then a condition is set to indicate the reason and reconciliation ends.
List all the issues related to this PR
- [ ] New Feature
- [ ] Enhancement
- [x] Bug fix
- [x] Tests
- [ ] Documentation
- [ ] CI/CD
What environments does this code impact?
- [x] Automation (CI, tools, etc)
- [x] Cloud
- [x] Operator Managed Deployments
- [x] None
How was this code tested?
- [ ] assisted-test-infra environment
- [ ] dev-scripts environment
- [ ] Reviewer's test appreciated
- [x] Waiting for CI to do a full test run
- [ ] Manual (Elaborate on how it was tested)
- [x] No tests needed
Checklist
- [x] Title and description added to both, commit and PR.
- [x] Relevant issues have been associated (see CONTRIBUTING guide)
- [x] This change does not require a documentation update (docstring,
docs, README, etc) - [x] Does this change include unit-tests (note that code changes require unit-tests)
Reviewers Checklist
- Are the title and description (in both PR and commit) meaningful and clear?
- Is there a bug required (and linked) for this change?
- Should this PR be backported?
@paul-maidment: This pull request references MGMT-17826 which is a valid jira issue.
Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target the "4.16.0" version, but no target version was set.
In response to this:
When installed outside of an ACM environment, the Infrastructure Operator has a panic. This is because the ManagedCluster CRD, which is required by the LocalClusterImportController feature is not present.
This PR checks for the presence of that CRD, if the CRD is not found then a condition is set to indicate the reason and reconciliation ends.
List all the issues related to this PR
- [ ] New Feature
- [ ] Enhancement
- [x] Bug fix
- [x] Tests
- [ ] Documentation
- [ ] CI/CD
What environments does this code impact?
- [x] Automation (CI, tools, etc)
- [x] Cloud
- [x] Operator Managed Deployments
- [x] None
How was this code tested?
- [ ] assisted-test-infra environment
- [ ] dev-scripts environment
- [ ] Reviewer's test appreciated
- [x] Waiting for CI to do a full test run
- [ ] Manual (Elaborate on how it was tested)
- [x] No tests needed
Checklist
- [x] Title and description added to both, commit and PR.
- [x] Relevant issues have been associated (see CONTRIBUTING guide)
- [x] This change does not require a documentation update (docstring,
docs, README, etc)- [x] Does this change include unit-tests (note that code changes require unit-tests)
Reviewers Checklist
- Are the title and description (in both PR and commit) meaningful and clear?
- Is there a bug required (and linked) for this change?
- Should this PR be backported?
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: paul-maidment
The full list of commands accepted by this bot can be found here.
The pull request process is described here
- ~~OWNERS~~ [paul-maidment]
Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment
Codecov Report
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 68.41%. Comparing base (
6e2795d) to head (b95c0ee).
:exclamation: Current head b95c0ee differs from pull request most recent head c04616a
Please upload reports for the commit c04616a to get more accurate results.
Additional details and impacted files
@@ Coverage Diff @@
## master #6325 +/- ##
==========================================
- Coverage 68.43% 68.41% -0.02%
==========================================
Files 247 247
Lines 36169 36169
==========================================
- Hits 24751 24746 -5
- Misses 9231 9233 +2
- Partials 2187 2190 +3
/retest
/retest
/cc @avishayt @ori-amizur please review
@gamli75: GitHub didn't allow me to request PR reviews from the following users: please, review.
Note that only openshift members and repo collaborators can review this PR, and authors cannot review their own PRs.
In response to this:
/cc @avishayt @ori-amizur please review
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
/cc @avishayt
/cc @ori-amizur
As per our discussion and some further testing. It's more beneficial to us to leave things the way they are. I'm going to close this PR as per that discussion.
Re-opening as I have discovered that the infrastructure operator will not function correctly unless we handle this issue.
@paul-maidment: This pull request references MGMT-17826 which is a valid jira issue.
Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target the "4.17.0" version, but no target version was set.
In response to this:
When installed outside of an ACM environment, the Infrastructure Operator has a panic. This is because the ManagedCluster CRD, which is required by the LocalClusterImportController feature is not present.
This PR checks for the presence of that CRD, if the CRD is not found then a condition is set to indicate the reason and reconciliation ends.
List all the issues related to this PR
- [ ] New Feature
- [ ] Enhancement
- [x] Bug fix
- [x] Tests
- [ ] Documentation
- [ ] CI/CD
What environments does this code impact?
- [x] Automation (CI, tools, etc)
- [x] Cloud
- [x] Operator Managed Deployments
- [x] None
How was this code tested?
- [ ] assisted-test-infra environment
- [ ] dev-scripts environment
- [ ] Reviewer's test appreciated
- [x] Waiting for CI to do a full test run
- [ ] Manual (Elaborate on how it was tested)
- [x] No tests needed
Checklist
- [x] Title and description added to both, commit and PR.
- [x] Relevant issues have been associated (see CONTRIBUTING guide)
- [x] This change does not require a documentation update (docstring,
docs, README, etc)- [x] Does this change include unit-tests (note that code changes require unit-tests)
Reviewers Checklist
- Are the title and description (in both PR and commit) meaningful and clear?
- Is there a bug required (and linked) for this change?
- Should this PR be backported?
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
@paul-maidment: This pull request references MGMT-17826 which is a valid jira issue.
Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target the "4.17.0" version, but no target version was set.
In response to this:
When installed outside of an ACM environment, the Infrastructure Operator has a panic. This is because the ManagedCluster CRD, which is required by the LocalClusterImportController feature is not present.
This PR prevents the operator from being started if the ManagedCluster CRD is not present. With a message that makes it clear to the user why this occurred.
List all the issues related to this PR
- [ ] New Feature
- [ ] Enhancement
- [x] Bug fix
- [x] Tests
- [ ] Documentation
- [ ] CI/CD
What environments does this code impact?
- [x] Automation (CI, tools, etc)
- [x] Cloud
- [x] Operator Managed Deployments
- [x] None
How was this code tested?
- [ ] assisted-test-infra environment
- [ ] dev-scripts environment
- [ ] Reviewer's test appreciated
- [x] Waiting for CI to do a full test run
- [ ] Manual (Elaborate on how it was tested)
- [x] No tests needed
Checklist
- [x] Title and description added to both, commit and PR.
- [x] Relevant issues have been associated (see CONTRIBUTING guide)
- [x] This change does not require a documentation update (docstring,
docs, README, etc)- [x] Does this change include unit-tests (note that code changes require unit-tests)
Reviewers Checklist
- Are the title and description (in both PR and commit) meaningful and clear?
- Is there a bug required (and linked) for this change?
- Should this PR be backported?
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
as I have discovered that the infrastructure operator will not function correctly unless we handle this issue
Can you elaborate a bit more? Ideally we'd have the failing behavior explained in the bug and also here in more detail.
While I think we should consider this, I'm still split.
All customers would install the operator through MCE and thus would definitely have ManagedCluster, but we also publish our operator in the community operators catalog to install standalone. This change would break that workflow.
That said, we do the same thing for BMH and treating the two resources the same would make sense. But on the other hand booting using a BMH is a much more common use case than managing the local cluster with assisted.
@avishayt what do you think?
as I have discovered that the infrastructure operator will not function correctly unless we handle this issue
Can you elaborate a bit more? Ideally we'd have the failing behavior explained in the bug and also here in more detail.
It is effectively a crash as soon as an attempt is made to list Managed cluster resources when the definition is not present.
So in a way it is very similar to the behaviour I have placed in this PR, with the exception that the code in this PR makes it very clear why the error occurs.
/retest
While I think we should consider this, I'm still split.
All customers would install the operator through MCE and thus would definitely have ManagedCluster, but we also publish our operator in the community operators catalog to install standalone. This change would break that workflow.
That said, we do the same thing for BMH and treating the two resources the same would make sense. But on the other hand booting using a BMH is a much more common use case than managing the local cluster with assisted.
@avishayt what do you think?
I think my original proposal to terminate the LocalClusterImportController reconcile loop early in the case of Managed cluster CRD not present is reasonable.
It prevents a hard crash while allowing the rest of the operator to be functional.
After creation of the Managed cluster CRD. The operator is able to start (though the user will need to force a reconcile)
@paul-maidment: This pull request references MGMT-17826 which is a valid jira issue.
Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target the "4.17.0" version, but no target version was set.
In response to this:
When the ManagedCluster CRD is not present during installation of the infrastructure operator, a failure is encountered
"failed to get API group resources: unable to retrieve the complete list of server APIs: cluster.open-cluster-management.io/v1: the server could not find the requested resource"This results in a non functional operator.
This PR aims to address that by ensuring that we do not complete the reconcile loop (and therefore perfom the check for the presence of a ManagedCluster CR) if the CRD is not present.
List all the issues related to this PR
- [ ] New Feature
- [ ] Enhancement
- [x] Bug fix
- [x] Tests
- [ ] Documentation
- [ ] CI/CD
What environments does this code impact?
- [x] Automation (CI, tools, etc)
- [x] Cloud
- [x] Operator Managed Deployments
- [x] None
How was this code tested?
- [ ] assisted-test-infra environment
- [ ] dev-scripts environment
- [ ] Reviewer's test appreciated
- [x] Waiting for CI to do a full test run
- [ ] Manual (Elaborate on how it was tested)
- [x] No tests needed
Checklist
- [x] Title and description added to both, commit and PR.
- [x] Relevant issues have been associated (see CONTRIBUTING guide)
- [x] This change does not require a documentation update (docstring,
docs, README, etc)- [x] Does this change include unit-tests (note that code changes require unit-tests)
Reviewers Checklist
- Are the title and description (in both PR and commit) meaningful and clear?
- Is there a bug required (and linked) for this change?
- Should this PR be backported?
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.