DAOS-16553 client: intercept ucs_init and memset ucs global variable after fork

Open wiliamhuang opened this issue 4 months ago • 11 comments

Intercept ucs_init(). Inside the intercepted ucs_init(), query the address of the global variable "ucs_async_thread_global_context" in libucs and register a fork handler that zeroes "ucs_async_thread_global_context" in the child process after fork().
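
The fork-handler part of this approach can be sketched as follows. This is a minimal illustration, not the code in this PR: `fake_async_ctx`, `fake_global_ctx`, `register_ctx_reset()` and `child_sees_zeroed_ctx()` are hypothetical names, and the struct is only a stand-in for "ucs_async_thread_global_context", whose real layout and size are internal to libucs. The sketch assumes the address and size of the region are already known; in the real client the address would presumably be resolved at run time (for example with dlsym(), which is an assumption here).

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for libucs's "ucs_async_thread_global_context".
 * The real layout and size are internal to UCX and are NOT reproduced here. */
struct fake_async_ctx {
	void *thread;
	int   refcount;
};
static struct fake_async_ctx fake_global_ctx;

/* Address and size of the region to clear in the child (assumed known). */
static void  *ctx_addr;
static size_t ctx_size;

/* Child-side fork handler: runs in the child right after fork(). */
static void
reset_ctx_in_child(void)
{
	if (ctx_addr != NULL)
		memset(ctx_addr, 0, ctx_size);
}

/* Register the handler once; returns 0 on success, an error number otherwise. */
static int
register_ctx_reset(void *addr, size_t size)
{
	assert(addr != NULL && size > 0);
	ctx_addr = addr;
	ctx_size = size;
	/* No prepare or parent handler is needed; only the child is reset. */
	return pthread_atfork(NULL, NULL, reset_ctx_in_child);
}

/* Demo helper: dirty the state, fork, and report whether the child saw
 * a zeroed context (1 = yes, 0 = no, -1 = fork/wait error). */
static int
child_sees_zeroed_ctx(void)
{
	pid_t pid;
	int   status;

	fake_global_ctx.thread   = &fake_global_ctx;
	fake_global_ctx.refcount = 3;

	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0) {
		int zeroed = fake_global_ctx.thread == NULL &&
			     fake_global_ctx.refcount == 0;
		_exit(zeroed ? 0 : 1);
	}
	if (waitpid(pid, &status, 0) != pid)
		return -1;
	return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

Calling `register_ctx_reset(&fake_global_ctx, sizeof(fake_global_ctx))` once and then `child_sees_zeroed_ctx()` shows the intended effect: the child starts with cleared state while the parent's copy is untouched.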

Steps for the author:

  • [ ] Commit message follows the guidelines.
  • [ ] Appropriate Features or Test-tag pragmas were used.
  • [ ] Appropriate Functional Test Stages were run.
  • [ ] At least two positive code reviews including at least one code owner from each category referenced in the PR.
  • [ ] Testing is complete. If necessary, forced-landing label added and a reason added in a comment.

After all prior steps are complete:

  • [ ] Gatekeeper requested (daos-gatekeeper added as a reviewer).

wiliamhuang avatar Oct 22 '25 04:10 wiliamhuang

Ticket title is 'dfuse/pil4dfs_fio.py:Pil4dfsFio.test_pil4dfs_vs_dfs - fio timeout in release build'
Status is 'In Review'
Labels: '2.6.1rc1,2.6.1rc2,2.6.1rc3,2.6.2tb2,2.6.3rc1,2.6.3rc2,2.6.3rc3,2.6.3rc4,2.6.4rc1,2.6.4rc2,2.6.4rc3,2.6.4rc4,2.7.101tb,daily_test'
https://daosio.atlassian.net/browse/DAOS-16553

github-actions[bot] avatar Oct 22 '25 04:10 github-actions[bot]

Test stage Functional Hardware Large MD on SSD completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-17010/13/execution/node/1335/log

daosbuild3 avatar Oct 29 '25 16:10 daosbuild3

Please add a description to the PR.

mchaarawi avatar Nov 04 '25 21:11 mchaarawi

We should also run the tests with the UCX provider; otherwise the testing in this PR is useless.

mchaarawi avatar Nov 04 '25 21:11 mchaarawi

> We should also run the tests with the UCX provider; otherwise the testing in this PR is useless.

Just restarted the PR with:
Test-tag: test_pil4dfs_vs_dfs
Test-tag-hw-medium-ucx-provider: test_pil4dfs_vs_dfs

wiliamhuang avatar Nov 04 '25 22:11 wiliamhuang

Test stage NLT on EL 8.8 completed with status UNSTABLE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos//view/change-requests/job/PR-17010/17/testReport/

daosbuild3 avatar Nov 05 '25 08:11 daosbuild3

Test stage Functional Hardware Medium MD on SSD completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-17010/24/execution/node/1151/log

daosbuild3 avatar Nov 10 '25 21:11 daosbuild3

Test stage NLT on EL 8.8 completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos/job/PR-17010/26/display/redirect

daosbuild3 avatar Nov 15 '25 16:11 daosbuild3

Test stage Functional on EL 8.8 completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-17010/28/execution/node/1026/log

daosbuild3 avatar Nov 16 '25 05:11 daosbuild3

Test stage Functional Hardware Medium MD on SSD completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-17010/28/execution/node/1296/log

daosbuild3 avatar Nov 16 '25 16:11 daosbuild3

Test stage Functional Hardware Large MD on SSD completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-17010/28/execution/node/1341/log

daosbuild3 avatar Nov 16 '25 16:11 daosbuild3