operator-controller
:bug: add PDB to make sure at least 1 pod is always available during upgrade
Description
This PR addresses OCPBUGS-62517. Currently, the operator-controller lacks a PodDisruptionBudget configuration. During node drain operations or cluster upgrades, all controller pods can be evicted simultaneously, causing the operator to report Available=False, which violates the OpenShift API contract:
"A component must not report Available=False during the course of a normal upgrade." — OpenShift API Contract
Add PodDisruptionBudget resources with minAvailable: 1 for both controllers to ensure at least one pod remains available during:
- Rolling updates
- Node drain operations
- Cluster upgrades
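As a rough illustration of the kind of resource described above, here is a minimal sketch of a `policy/v1` PodDisruptionBudget with `minAvailable: 1`, rendered by a small shell helper. The helper name `render_pdb`, its arguments, and the `control-plane` label key are assumptions made for illustration and are not taken from this PR's manifests. Note that a PDB limits voluntary evictions such as node drains, and that with a single replica `minAvailable: 1` would block a drain entirely, so it is usually paired with two or more replicas.

```shell
# Hypothetical helper (not part of this PR): prints a PodDisruptionBudget
# manifest to stdout.
# Arguments (all illustrative):
#   $1 = PDB name
#   $2 = namespace
#   $3 = value of the pods' "control-plane" label (the label key is an assumption)
render_pdb() {
  local name="$1" namespace="$2" label="$3"
  cat <<EOF
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: ${name}
  namespace: ${namespace}
spec:
  minAvailable: 1
  selector:
    matchLabels:
      control-plane: ${label}
EOF
}

# Example usage (placeholder names; requires kubectl and cluster access):
#   render_pdb my-controller-pdb my-namespace my-controller | kubectl apply -f -
```

Taking the names as parameters keeps the sketch independent of the actual resource names, which should come from the manifests changed in this PR.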
Reviewer Checklist
- [ ] API Go Documentation
- [ ] Tests: Unit Tests (and E2E Tests, if appropriate)
- [x] Comprehensive Commit Messages
- [ ] Links to related GitHub Issue(s)
Assisted-by: Claude code
Deploy Preview for olmv1 ready!
| Name | Link |
|---|---|
| Latest commit | c217a17418679753874760abe4379e7913185346 |
| Latest deploy log | https://app.netlify.com/projects/olmv1/deploys/6927a06932c198000845676b |
| Deploy Preview | https://deploy-preview-2362--olmv1.netlify.app |
Codecov Report
:white_check_mark: All modified and coverable lines are covered by tests.
:white_check_mark: Project coverage is 74.39%. Comparing base (0fecf3f) to head (c217a17).
:warning: Report is 2 commits behind head on main.
Additional details and impacted files
@@ Coverage Diff @@
## main #2362 +/- ##
==========================================
+ Coverage 70.50% 74.39% +3.89%
==========================================
Files 93 93
Lines 7300 7300
==========================================
+ Hits 5147 5431 +284
+ Misses 1719 1435 -284
Partials 434 434
| Flag | Coverage Δ | |
|---|---|---|
| e2e | 44.51% <ø> (ø) | |
| experimental-e2e | 48.72% <ø> (?) | |
| unit | 58.47% <ø> (ø) | |
/approve It's needed, just may need a few tweaks.
/retest e2e / experimental-e2e (pull_request)
@jianzhangbjz: No presubmit jobs available for operator-framework/operator-controller@main
In response to this:
/retest e2e / experimental-e2e (pull_request)
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
/retest experimental-e2e
@jianzhangbjz: No presubmit jobs available for operator-framework/operator-controller@main
In response to this:
/retest experimental-e2e
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: tmshort
- ~~hack/OWNERS~~ [tmshort]
- ~~helm/OWNERS~~ [tmshort]
- ~~manifests/OWNERS~~ [tmshort]
Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment
Updated hack/test/install-prometheus.sh to add a timeout, which addresses the error below:
Waiting for Prometheus Operator pod to become ready...
error: no matching resources found
Cleaning up /tmp/tmp.Gv71SsiUrG
make: *** [Makefile:295: prometheus] Error 1
Error: Process completed with exit code 2.
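For context, `kubectl wait` fails immediately with "no matching resources found" when the target pods have not been created yet. One common way to handle this is to poll until the resource exists, bounded by a timeout, before waiting for readiness. The sketch below shows that pattern; it is not the actual change made to `hack/test/install-prometheus.sh`, and the `wait_until` and `pods_exist` helpers, the namespace, and the label selector are all hypothetical.

```shell
# Hypothetical helper: retry a command until it succeeds or a timeout expires.
# Usage: wait_until TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
# Returns 0 as soon as COMMAND succeeds, 1 if the deadline passes first.
wait_until() {
  local timeout="$1" interval="$2"
  shift 2
  local deadline=$(( $(date +%s) + timeout ))
  until "$@" >/dev/null 2>&1; do
    if [ "$(date +%s)" -ge "$deadline" ]; then
      echo "timed out after ${timeout}s waiting for: $*" >&2
      return 1
    fi
    sleep "$interval"
  done
}

# Example usage (placeholder namespace and selector; requires kubectl):
#   pods_exist() { [ -n "$(kubectl get pods -n "$1" -l "$2" -o name 2>/dev/null)" ]; }
#   wait_until 120 2 pods_exist my-namespace app=my-operator
#   kubectl wait --for=condition=Ready pods -n my-namespace -l app=my-operator --timeout=120s
```

Checking for a non-empty `-o name` listing first means the readiness wait only starts once there is at least one pod to wait on.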
/lgtm