DAOS-17547 rebuild: error on stopped ds_pool_child
When a faulty SSD is replaced, reintegration is triggered automatically once local setup completes (that is, once ds_pool_child has started).
However, an admin could manually run "dmg pool reintegrate" before local setup is done. In that case we need to return a retryable error so that reintegration keeps retrying until the local ds_pool_child has started.
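For reviewers, the intended behavior can be illustrated with a minimal standalone sketch. This is not the code in this patch: every name below (`sk_pool_child`, `sk_rebuild_prepare`, `sk_reintegrate`, `SK_RETRY`) is hypothetical, and the real engine uses its own DER_* error codes and rebuild scheduling. The sketch only shows the idea of reporting a retryable error while the local pool child is stopped, so that the caller keeps retrying instead of aborting.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical return codes for this sketch only (not DAOS DER_* values). */
enum { SK_OK = 0, SK_RETRY = -1 };

/* Hypothetical stand-in for the per-target pool state. */
struct sk_pool_child {
    bool started;             /* true once local setup has completed */
    int  polls_until_started; /* simulated remaining setup time */
};

/* Rebuild entry point: report a retryable error while the child is stopped. */
static int sk_rebuild_prepare(const struct sk_pool_child *child)
{
    if (!child->started)
        return SK_RETRY;
    return SK_OK;
}

/* Simulate local setup making progress between retries. */
static void sk_local_setup_tick(struct sk_pool_child *child)
{
    if (child->polls_until_started > 0 && --child->polls_until_started == 0)
        child->started = true;
}

/* Caller side: keep retrying on SK_RETRY, up to max_attempts. */
static int sk_reintegrate(struct sk_pool_child *child, int max_attempts,
                          int *attempts)
{
    int rc = SK_RETRY;

    *attempts = 0;
    while (*attempts < max_attempts) {
        (*attempts)++;
        rc = sk_rebuild_prepare(child);
        if (rc != SK_RETRY)
            break;
        sk_local_setup_tick(child);
    }
    return rc;
}
```

In this sketch a manual reintegrate issued too early is not fatal: the caller sees `SK_RETRY` and tries again until the child has started or the attempt budget runs out.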
Steps for the author:
- [ ] Commit message follows the guidelines.
- [ ] Appropriate Features or Test-tag pragmas were used.
- [ ] Appropriate Functional Test Stages were run.
- [ ] At least two positive code reviews including at least one code owner from each category referenced in the PR.
- [ ] Testing is complete. If necessary, forced-landing label added and a reason added in a comment.
After all prior steps are complete:
- [ ] Gatekeeper requested (daos-gatekeeper added as a reviewer).
Ticket title is 'Engine aborts while reintegrating an SSD that is replaced online'
Status is 'In Review'
Labels: 'md_on_ssd'
https://daosio.atlassian.net/browse/DAOS-17547