
DAOS-10037 mgmt: Add incarnation to GroupUpdateReq

Open liw opened this issue 3 years ago • 4 comments

This is a prerequisite for implementing the following behavior as part of the DAOS-10037 design.

  • When modifying the primary group, if the incarnation of a rank increases, mark the rank and its epi as being alive.

This is necessary because of the following scenario.

  • When a rank X rejoins a system, the rank Y that is responsible for broadcasting the latest system membership to every member, including rank X, may think that rank X is dead and cancel the broadcast RPC to rank X. Hence, rank X won’t be able to initialize its local PG.
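The rule and scenario above can be sketched as follows. This is a minimal illustration, not the DAOS implementation: `Engine`, `Group`, `Apply`, `MarkDead`, and `Alive` are hypothetical names. It assumes each entry in a group update carries a rank and its incarnation, and that a strictly larger incarnation means the rank has restarted.

```go
package main

// Engine is a hypothetical per-rank entry in a group update,
// carrying the incarnation alongside the rank.
type Engine struct {
	Rank        uint32
	Incarnation uint64
}

// member is the local bookkeeping for one rank (assumed layout).
type member struct {
	incarnation uint64
	alive       bool
}

// Group is a hypothetical stand-in for a rank's local view of the primary group.
type Group struct {
	members map[uint32]*member
}

// NewGroup returns an empty group view.
func NewGroup() *Group {
	return &Group{members: make(map[uint32]*member)}
}

// MarkDead records that a rank is believed dead (e.g. from failure detection).
func (g *Group) MarkDead(rank uint32) {
	if m, ok := g.members[rank]; ok {
		m.alive = false
	}
}

// Alive reports whether a known rank is currently considered alive.
func (g *Group) Alive(rank uint32) bool {
	m, ok := g.members[rank]
	return ok && m.alive
}

// Apply merges a group update into the local view. A rank whose incarnation
// increased is marked alive again, so RPCs to it are no longer canceled.
// It returns the ranks that went from dead to alive.
func (g *Group) Apply(engines []Engine) []uint32 {
	var revived []uint32
	for _, e := range engines {
		m, ok := g.members[e.Rank]
		if !ok {
			// New rank: record it as alive with the reported incarnation.
			g.members[e.Rank] = &member{incarnation: e.Incarnation, alive: true}
			continue
		}
		if e.Incarnation > m.incarnation {
			if !m.alive {
				revived = append(revived, e.Rank)
			}
			m.incarnation = e.Incarnation
			m.alive = true
		}
		// Equal or older incarnation: leave the liveness state untouched.
	}
	return revived
}
```

In this sketch, rank Y would call `Apply` when it modifies its primary group. A rejoined rank X then shows up with a higher incarnation and is marked alive before the membership broadcast is sent, whereas an update repeating the old incarnation does not revive it.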

Signed-off-by: Li Wei [email protected]
Required-githooks: true

liw avatar Jun 15 '22 09:06 liw

Test stage checkpatch completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-9371/1/execution/node/132/log

daosbuild1 avatar Jun 15 '22 09:06 daosbuild1

Test stage Unit Test completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-9371/1/execution/node/703/log

daosbuild1 avatar Jun 15 '22 10:06 daosbuild1

Bug-tracker data:
Ticket title is 'Use SWIM info to cancel RPCs among engines'
Status is 'In Progress'
https://daosio.atlassian.net/browse/DAOS-10037

github-actions[bot] avatar Aug 16 '22 02:08 github-actions[bot]

Test stage checkpatch completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-9371/4/execution/node/146/log

daosbuild1 avatar Aug 16 '22 02:08 daosbuild1

> LGTM, this PR is adding the incarnation information into the database and a subsequent PR will use the added information.

Thanks, Tom. Just a clarification: The incarnation info is already in the database (if I'm not mistaken); this PR ships the info with GroupUpdateReq, and yes, a follow-on PR will make use of the info.

liw avatar Aug 17 '22 08:08 liw

> > LGTM, this PR is adding the incarnation information into the database and a subsequent PR will use the added information.

> Thanks, Tom. Just a clarification: The incarnation info is already in the database (if I'm not mistaken); this PR ships the info with GroupUpdateReq, and yes, a follow-on PR will make use of the info.

Right sorry I meant that it was added to the GroupMap in the database, thanks for the clarification.

tanabarr avatar Aug 17 '22 08:08 tanabarr

> > > LGTM, this PR is adding the incarnation information into the database and a subsequent PR will use the added information.

> > Thanks, Tom. Just a clarification: The incarnation info is already in the database (if I'm not mistaken); this PR ships the info with GroupUpdateReq, and yes, a follow-on PR will make use of the info.

> Right sorry I meant that it was added to the GroupMap in the database, thanks for the clarification.

Cool. If it were a database addition, we'd have an upgrade/downgrade problem. Thanks.

liw avatar Aug 17 '22 12:08 liw

Thanks for the quick responses, @tanabarr, @mjmac, and @kjacque.

liw avatar Aug 19 '22 00:08 liw