DAOS-16895 control: Show pool in degraded state only when rebuild busy
The pool state “Degraded” is easily misinterpreted as meaning “data protection is imperfect/rebuild has not completed”, which is its typical meaning in storage environments. What it really means here (at least in the most typical scenario) is that some targets are excluded. Rebuild state is tracked in a separate property.
As an immediate step, fix this by setting/displaying the pool state as “Degraded” only when rebuild is active.
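A minimal sketch of the intended display rule, for illustration only. The names `PoolState`, `PoolInfo`, `DisabledTargets`, `RebuildBusy` and `DisplayState` are hypothetical stand-ins and are not taken from the DAOS control-plane code; the sketch assumes the state is derived from just these two inputs.

```go
package main

import "fmt"

// PoolState is a hypothetical stand-in for the pool state shown to users.
type PoolState int

const (
	PoolStateReady PoolState = iota
	PoolStateDegraded
)

// String returns the label displayed for the state.
func (s PoolState) String() string {
	if s == PoolStateDegraded {
		return "Degraded"
	}
	return "Ready"
}

// PoolInfo carries only the two inputs relevant to this sketch (assumed fields).
type PoolInfo struct {
	DisabledTargets uint32 // number of excluded targets
	RebuildBusy     bool   // true while a rebuild is in progress
}

// DisplayState reports Degraded only while a rebuild is active.
// Excluded targets alone no longer cause the Degraded label.
func DisplayState(pi PoolInfo) PoolState {
	if pi.RebuildBusy {
		return PoolStateDegraded
	}
	return PoolStateReady
}

func main() {
	// Targets excluded, rebuild idle: not shown as Degraded.
	fmt.Println(DisplayState(PoolInfo{DisabledTargets: 2, RebuildBusy: false})) // Ready
	// Targets excluded, rebuild running: shown as Degraded.
	fmt.Println(DisplayState(PoolInfo{DisabledTargets: 2, RebuildBusy: true})) // Degraded
}
```

Under this rule the number of excluded targets is still available to the user, but it no longer drives the state label on its own.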
Features: control
Steps for the author:
- [ ] Commit message follows the guidelines.
- [ ] Appropriate Features or Test-tag pragmas were used.
- [ ] Appropriate Functional Test Stages were run.
- [ ] At least two positive code reviews including at least one code owner from each category referenced in the PR.
- [ ] Testing is complete. If necessary, forced-landing label added and a reason added in a comment.
After all prior steps are complete:
- [ ] Gatekeeper requested (daos-gatekeeper added as a reviewer).
Ticket title is 'Rename pool state "Degraded" to "TargetsExcluded"' Status is 'In Progress' Labels: 'LRZ' https://daosio.atlassian.net/browse/DAOS-16895
Test stage Unit Test on EL 8.8 completed with status UNSTABLE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos//view/change-requests/job/PR-16497/2/testReport/
Updated the PR to rename the state from Degraded to TargetsExcluded so that it no longer conveys misleading information.
@daltonbohning @phender Regarding test references to "Degraded": I think I've updated the relevant string references, but do you think we need to change any of the test names as part of this change, e.g. DAOS_Degraded_Mode or daos_degraded.c?
I think that depends on whether "Degraded" still makes sense in general to the devs writing those tests.
Test stage Functional on EL 8.8 completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-16497/3/execution/node/1123/log
Test stage Functional Hardware Medium Verbs Provider MD on SSD completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-16497/4/execution/node/754/log
Test stage Functional Hardware Medium Verbs Provider MD on SSD completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-16497/5/execution/node/441/log
Test stage Functional Hardware Medium Verbs Provider MD on SSD completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-16497/6/execution/node/441/log
CI runs passed apart from 5 unrelated EC failures, which are presumably intermittent issues.
CI run no. 8 passed, so requesting landing.