
DAOS-17931 engine: Terminate engine process upon receipt of SIGBUS signal.

Open jgmoore-or opened this issue 2 months ago • 2 comments

Signal handling in the engine's main function now includes SIGBUS in the wait set. When SIGBUS is received, an error is logged, engine shutdown is invoked (server_fini), and the process exits with a failure status.
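For reviewers unfamiliar with the pattern, the flow described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `block_wait_set`, `wait_and_shutdown`, and `engine_shutdown_stub` are hypothetical names (the stub stands in for `server_fini`), the presence of SIGINT and SIGTERM in the wait set is an assumption, and `sigprocmask` is used only to keep the sketch single-threaded.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the engine's shutdown routine (server_fini in the PR). */
static void engine_shutdown_stub(void)
{
	fprintf(stderr, "engine shutdown invoked\n");
}

/*
 * Build the wait set and block its signals so they stay pending until
 * sigwait() collects them. SIGINT and SIGTERM are assumed members;
 * SIGBUS is the addition this PR describes.
 */
static int block_wait_set(sigset_t *set)
{
	sigemptyset(set);
	sigaddset(set, SIGINT);
	sigaddset(set, SIGTERM);
	sigaddset(set, SIGBUS);
	/* A multithreaded program would call pthread_sigmask() here instead. */
	return sigprocmask(SIG_BLOCK, set, NULL);
}

/*
 * Wait for one signal from the set, run the shutdown path, and return the
 * exit status the process should use: failure for SIGBUS, success otherwise.
 */
static int wait_and_shutdown(const sigset_t *set)
{
	int sig = 0;
	int rc = sigwait(set, &sig); /* returns an error number on failure */

	if (rc != 0) {
		fprintf(stderr, "sigwait failed: %s\n", strerror(rc));
		return EXIT_FAILURE;
	}
	if (sig == SIGBUS) {
		fprintf(stderr, "received SIGBUS, terminating\n");
		engine_shutdown_stub();
		return EXIT_FAILURE;
	}
	engine_shutdown_stub();
	return EXIT_SUCCESS;
}
```

A `main` function would call `block_wait_set` once during startup and return the value of `wait_and_shutdown`. Note that POSIX only defines the result of a blocked SIGBUS when the signal was sent with `kill()`, `sigqueue()`, or `raise()`; how a fault-generated SIGBUS behaves while blocked is platform-specific, so this sketch should not be read as covering that case.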

Testing confirmed that, after the engine terminates, its rank is marked as dead and a rebuild is started for it.

Steps for the author:

  • [ ] Commit message follows the guidelines.
  • [ ] Appropriate Features or Test-tag pragmas were used.
  • [ ] Appropriate Functional Test Stages were run.
  • [ ] At least two positive code reviews including at least one code owner from each category referenced in the PR.
  • [ ] Testing is complete. If necessary, forced-landing label added and a reason added in a comment.

After all prior steps are complete:

  • [ ] Gatekeeper requested (daos-gatekeeper added as a reviewer).

jgmoore-or, Dec 12 '25 22:12

Errors are Unable to load ticket data https://daosio.atlassian.net/browse/DAOS-17931

github-actions[bot], Dec 12 '25 22:12

Test stage NLT on EL 8.8 completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos/job/PR-17268/2/display/redirect

daosbuild3, Dec 12 '25 22:12