DAOS-17628 bio: flush WAL header before unmap
In bio_wal_checkpoint(), the checkpointed region must not be unmapped before the WAL header is flushed (the flush is what makes the last checkpointed ID persistent). Otherwise, if the engine is interrupted between the unmap and the flush, the next WAL replay on engine start will begin from the stale checkpointed ID, whose WAL tx data has already been cleared by the unmap.
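The ordering problem described above can be illustrated with a minimal, self-contained sketch. This is a toy in-memory model, not the actual bio code: `struct toy_wal`, `toy_flush_header()`, `toy_unmap()`, `toy_checkpoint()` and `toy_replay_ok()` are all hypothetical names introduced for illustration, and the real WAL layout and I/O paths are assumed to be considerably more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_WAL_IDS 8 /* toy WAL holds tx IDs 1..7; slot 0 is unused */

/* Hypothetical stand-in for a WAL device; not the real bio structures. */
struct toy_wal {
	uint64_t durable_ckp_id;         /* checkpointed ID persisted in the header */
	uint64_t pending_ckp_id;         /* in-memory header value, not yet flushed */
	bool     tx_valid[TOY_WAL_IDS];  /* true while tx data for that ID is on media */
};

static void toy_wal_init(struct toy_wal *w)
{
	w->durable_ckp_id = 0;
	w->pending_ckp_id = 0;
	for (int i = 0; i < TOY_WAL_IDS; i++)
		w->tx_valid[i] = true;
}

/* Persist the in-memory checkpointed ID (models the WAL header flush). */
static void toy_flush_header(struct toy_wal *w)
{
	w->durable_ckp_id = w->pending_ckp_id;
}

/* Clear tx data for IDs in (from, to] (models unmapping the checkpointed region). */
static void toy_unmap(struct toy_wal *w, uint64_t from, uint64_t to)
{
	for (uint64_t id = from + 1; id <= to; id++)
		w->tx_valid[id] = false;
}

/*
 * Checkpoint up to ckp_id. flush_first selects the ordering; crash_between
 * simulates an engine interruption between the two steps.
 */
static void toy_checkpoint(struct toy_wal *w, uint64_t ckp_id,
			   bool flush_first, bool crash_between)
{
	uint64_t old = w->durable_ckp_id;

	assert(ckp_id >= old && ckp_id < TOY_WAL_IDS);
	w->pending_ckp_id = ckp_id;

	if (flush_first) {
		toy_flush_header(w);        /* new checkpointed ID is durable */
		if (crash_between)
			return;             /* unmapped later; only space is wasted */
		toy_unmap(w, old, ckp_id);
	} else {
		toy_unmap(w, old, ckp_id);  /* tx data is gone ... */
		if (crash_between)
			return;             /* ... but the header still has the old ID */
		toy_flush_header(w);
	}
}

/* Replay starts right after the durable checkpointed ID and needs that data intact. */
static bool toy_replay_ok(const struct toy_wal *w)
{
	for (uint64_t id = w->durable_ckp_id + 1; id < TOY_WAL_IDS; id++)
		if (!w->tx_valid[id])
			return false;
	return true;
}
```

In this model, calling `toy_checkpoint(&w, 4, true, true)` (flush first, then interrupted) leaves a WAL that `toy_replay_ok()` accepts, while `toy_checkpoint(&w, 4, false, true)` (unmap first, then interrupted) leaves the durable ID pointing at cleared tx data, which is the failure this patch is meant to avoid.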
Steps for the author:
- [ ] Commit message follows the guidelines.
- [ ] Appropriate Features or Test-tag pragmas were used.
- [ ] Appropriate Functional Test Stages were run.
- [ ] At least two positive code reviews including at least one code owner from each category referenced in the PR.
- [ ] Testing is complete. If necessary, forced-landing label added and a reason added in a comment.
After all prior steps are complete:
- [ ] Gatekeeper requested (daos-gatekeeper added as a reviewer).
Ticket title is 'CP: N0001 engines up, but pool service lost' Status is 'In Progress' Labels: 'request_for_2.6.5' https://daosio.atlassian.net/browse/DAOS-17628
Test stage Functional Hardware Large MD on SSD completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-16478/1/execution/node/1401/log
Test stage Functional Hardware Medium Verbs Provider MD on SSD completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-16478/1/execution/node/1345/log
@daos-stack/daos-gatekeeper, the hardware tests failed in the prior round because of a CI infrastructure issue. I re-triggered only the hardware tests, and they passed this time, so I think we need a force landing.