DAOS-16005 object: check resent coll_punch on leader and relay engine

Open Nasf-Fan opened this issue 1 year ago • 12 comments

For the collective punch RPC handler on the leader or a relay engine: if the related DTX has already been prepared when a resent RPC is handled, we should avoid re-executing the punch locally. Otherwise, the handler may mistakenly clean up the previously prepared DTX entry.
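
The check described above can be sketched roughly as follows. This is a simplified illustration, not the actual DAOS code: `struct coll_punch_req`, `enum dtx_state`, `coll_punch_decide()`, and `coll_punch_handle()` are hypothetical names invented for this example, and the real handler tracks far more state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; these are NOT the real DAOS types. */
enum dtx_state { DTX_NONE, DTX_PREPARED };

struct coll_punch_req {
    bool           resent;           /* the RPC is a resend of an earlier one */
    enum dtx_state dtx;              /* state of the related local DTX entry */
    int            local_exec_count; /* how many times the local punch ran */
};

enum punch_action { PUNCH_EXEC_LOCAL, PUNCH_SKIP_LOCAL };

/* Decide whether the leader or relay engine should run the punch locally. */
static enum punch_action coll_punch_decide(const struct coll_punch_req *req)
{
    assert(req != NULL);

    /* Resent RPC whose DTX is already prepared: do not punch again,
     * so the prepared DTX entry is left untouched. */
    if (req->resent && req->dtx == DTX_PREPARED)
        return PUNCH_SKIP_LOCAL;

    return PUNCH_EXEC_LOCAL;
}

/* Apply the decision; a local execution leaves the DTX in prepared state. */
static enum punch_action coll_punch_handle(struct coll_punch_req *req)
{
    enum punch_action act = coll_punch_decide(req);

    if (act == PUNCH_EXEC_LOCAL) {
        req->local_exec_count++;
        req->dtx = DTX_PREPARED;
    }
    return act;
}
```

In this sketch, handling the original request executes the punch once and marks the DTX as prepared; handling the same request again with `resent` set returns `PUNCH_SKIP_LOCAL` and leaves the execution count unchanged.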

Before requesting gatekeeper:

  • [ ] Two review approvals and any prior change requests have been resolved.
  • [ ] Testing is complete and all tests passed or there is a reason documented in the PR why it should be force landed and forced-landing tag is set.
  • [ ] Features: (or Test-tag*) commit pragma was used or there is a reason documented that there are no appropriate tags for this PR.
  • [ ] Commit messages follow the guidelines outlined here.
  • [ ] Any tests skipped by the ticket being addressed have been run and passed in the PR.

Gatekeeper:

  • [ ] You are the appropriate gatekeeper to be landing the patch.
  • [ ] The PR has 2 reviews by people familiar with the code, including appropriate owners.
  • [ ] Githooks were used. If not, request that user install them and check copyright dates.
  • [ ] Checkpatch issues are resolved. Pay particular attention to ones that will show up on future PRs.
  • [ ] All builds have passed. Check non-required builds for any new compiler warnings.
  • [ ] Sufficient testing is done. Check feature pragmas and test tags and that tests skipped for the ticket are run and now pass with the changes.
  • [ ] If applicable, the PR has addressed any potential version compatibility issues.
  • [ ] Check the target branch. If it is master branch, should the PR go to a feature branch? If it is a release branch, does it have merge approval in the JIRA ticket?
  • [ ] Extra checks if forced landing is requested
    • [ ] Review comments are sufficiently resolved, particularly by prior reviewers that requested changes.
    • [ ] No new NLT or valgrind warnings. Check the classic view.
    • [ ] Quick-build or Quick-functional is not used.
  • [ ] Fix the commit message upon landing. Check the standard here. Edit it to create a single commit. If necessary, ask submitter for a new summary.

Nasf-Fan • Jun 26 '24 15:06

Ticket title is 'aurora soak stress: Pool connect issues/pool query appear to be hanging'
Status is 'In Progress'
Labels: 'daos_ecb_issue,daos_ecb_scale,scrubbed_2.8,soak,triaged'
https://daosio.atlassian.net/browse/DAOS-16005

github-actions[bot] • Jun 26 '24 15:06

Test stage Functional on EL 8.8 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14652/1/execution/node/1208/log

daosbuild1 • Jun 26 '24 20:06

test_ms_failover failed because of DAOS-16103; it will be retested.

Nasf-Fan • Jun 27 '24 02:06

Test stage NLT on EL 8.8 completed with status UNSTABLE. https://build.hpdd.intel.com/job/daos-stack/job/daos//view/change-requests/job/PR-14652/2/testReport/

daosbuild1 • Jun 27 '24 03:06

Test stage Build on EL 8 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14652/3/execution/node/381/log

daosbuild1 • Jun 27 '24 16:06

Test stage Build on Leap 15.5 with Intel-C and TARGET_PREFIX completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14652/3/execution/node/383/log

daosbuild1 • Jun 27 '24 16:06

Test stage Build RPM on EL 9 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14652/3/execution/node/355/log

daosbuild1 • Jun 27 '24 16:06

Test stage Build RPM on EL 8 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14652/3/execution/node/352/log

daosbuild1 • Jun 27 '24 16:06

Test stage Build RPM on Leap 15.5 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14652/3/execution/node/293/log

daosbuild1 • Jun 27 '24 16:06

Test stage Build DEB on Ubuntu 20.04 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14652/3/execution/node/289/log

daosbuild1 • Jun 27 '24 16:06

Test stage NLT on EL 8.8 completed with status UNSTABLE. https://build.hpdd.intel.com/job/daos-stack/job/daos//view/change-requests/job/PR-14652/4/testReport/

daosbuild1 • Jun 28 '24 03:06

Test stage NLT on EL 8.8 completed with status UNSTABLE. https://build.hpdd.intel.com/job/daos-stack/job/daos//view/change-requests/job/PR-14652/4/testReport/

These NLT-related failures have already been fixed on the latest master.

Nasf-Fan • Jun 28 '24 14:06

Replaced by https://github.com/daos-stack/daos/pull/14659, which has already landed.

Nasf-Fan • Jul 23 '24 03:07