DAOS-13205 control: Detect stale interactive check reports
Due to limitations of the checker, the user can't act on unresolved interactive findings left behind by an older checker instance.
When a new checker instance starts:
- Remove unresolved interactive findings that will be re-discovered during the checker run (whole system or requested pool).
- For unresolved findings that won't be re-discovered (e.g. checker starts on a different pool), change the action to STALE, but continue displaying the findings in the interface.
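The two rules above can be sketched roughly as follows. This is a minimal illustration only, not the code in this PR: the `Finding` and `Action` types, the `reconcileOnStart` function, and the convention that an empty pool list means a whole-system check are all hypothetical names and assumptions made for the example.

```go
package main

// Action is a hypothetical stand-in for the action attached to a finding.
type Action int

const (
	ActionInteract Action = iota // unresolved, waiting for a user decision
	ActionStale                  // left by an older checker instance, no longer actionable
	ActionResolved               // already handled (assumed to be left untouched)
)

// Finding is a hypothetical, simplified checker report.
type Finding struct {
	ID       int
	PoolUUID string
	Action   Action
}

// reconcileOnStart applies the two rules to findings from older checker
// instances when a new instance starts. An empty pools slice is assumed to
// mean a whole-system check.
func reconcileOnStart(old []Finding, pools []string) []Finding {
	wholeSystem := len(pools) == 0
	requested := make(map[string]bool, len(pools))
	for _, p := range pools {
		requested[p] = true
	}

	kept := make([]Finding, 0, len(old))
	for _, f := range old {
		// Only unresolved interactive findings are affected.
		if f.Action != ActionInteract {
			kept = append(kept, f)
			continue
		}
		// Rule 1: the new run covers this pool, so the finding will be
		// re-discovered. Drop the old copy.
		if wholeSystem || requested[f.PoolUUID] {
			continue
		}
		// Rule 2: the new run will not revisit this pool. Keep the finding
		// visible, but mark it STALE. f is a copy, so the input is unchanged.
		f.Action = ActionStale
		kept = append(kept, f)
	}
	return kept
}
```

For example, starting the checker on one pool would drop that pool's pending interactive findings and mark pending interactive findings for every other pool as stale, so they remain visible in query output.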
Features: control recovery
Steps for the author:
- [x] Commit message follows the guidelines.
- [x] Appropriate Features or Test-tag pragmas were used.
- [x] Appropriate Functional Test Stages were run.
- [ ] At least two positive code reviews including at least one code owner from each category referenced in the PR.
- [x] Testing is complete. If necessary, forced-landing label added and a reason added in a comment.
After all prior steps are complete:
- [ ] Gatekeeper requested (daos-gatekeeper added as a reviewer).
Ticket title is '"dmg check query" output stale interaction request'. Status is 'In Review'. Labels: 'scrubbed_2.8'. https://daosio.atlassian.net/browse/DAOS-13205
Test stage Functional on EL 8.8 completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-16988/1/execution/node/1073/log
Test stage Functional Hardware Medium MD on SSD completed with status UNSTABLE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos//view/change-requests/job/PR-16988/3/testReport/
Test stage Functional Hardware Medium Verbs Provider MD on SSD completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-16988/3/execution/node/1347/log
Test stage NLT on EL 8.8 completed with status UNSTABLE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos//view/change-requests/job/PR-16988/4/testReport/
Test stage Functional Hardware Medium Verbs Provider MD on SSD completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-16988/4/execution/node/1373/log
Test stage Functional Hardware Medium MD on SSD completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-16988/4/execution/node/1359/log
Test stage Functional Hardware Medium MD on SSD completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-16988/8/execution/node/1234/log
Test stage Functional Hardware Medium Verbs Provider MD on SSD completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-16988/8/execution/node/1415/log
Test stage Functional Hardware Medium Verbs Provider MD on SSD completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-16988/9/execution/node/1316/log
Test stage Functional Hardware Medium MD on SSD completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-16988/9/execution/node/1335/log
Test stage NLT on EL 8.8 completed with status UNSTABLE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos//view/change-requests/job/PR-16988/11/testReport/
Test stage Functional Hardware Medium Verbs Provider MD on SSD completed with status UNSTABLE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos//view/change-requests/job/PR-16988/11/testReport/
Test stage Functional Hardware Medium MD on SSD completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-16988/14/execution/node/1351/log
Test stage Functional Hardware Medium Verbs Provider MD on SSD completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-16988/14/execution/node/1392/log
Test stage NLT on EL 8.8 completed with status UNSTABLE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos//view/change-requests/job/PR-16988/15/testReport/
Test stage Functional Hardware Medium MD on SSD completed with status UNSTABLE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos//view/change-requests/job/PR-16988/15/testReport/
Test stage Functional Hardware Medium Verbs Provider MD on SSD completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-16988/15/execution/node/1398/log
Unrelated test failures:
- DAOS-17951 - test_snapshot_aggregation failure during pool create due to invalid rank
- DAOS-16759 - test_extend_simple SIGTERM during rebuild
- DAOS-18343 - NLT valgrind issue is a false positive with Go runtime
Test stage NLT on EL 8.8 completed with status UNSTABLE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos//view/change-requests/job/PR-16988/17/testReport/
Test stage Functional Hardware Medium MD on SSD completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-16988/17/execution/node/1324/log
Test failures:
- NLT: Known issues with the Go runtime. These are showing up more frequently now.
- test_dangling_rank_entry: Existing issue in master: https://daosio.atlassian.net/browse/DAOS-18018
- test_lost_majority_pool_replicas: Existing issue seen in daily tests: https://daosio.atlassian.net/browse/DAOS-17788
@shimizukko I would still like your input. Please let me know if any additional coverage is needed based on the ftest changes I made. I can address test improvements in a follow-on PR after the holiday break.
There was a minor merge conflict in src/tests/ftest/recovery/check_start_options.yaml, so I resolved it with a merge commit.
Test stage Functional Hardware Medium MD on SSD completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-16988/18/execution/node/1351/log