DAOS-17637 vos: support to mark corruption on object/{d,a}key

Open Nasf-Fan opened this issue 7 months ago • 4 comments

Sometimes, during rebuild, we may not have enough information to rebuild some object or {d,a}key. In such a case, we need to set a flag on the target so that subsequent operations, such as reading data from the bad target, properly return an errno indicating that some data was lost. Otherwise it will cause silent data corruption.

For this purpose, we introduce a new VOS API: vos_obj_mark_corruption(). It allows a server-side caller to mark the specified object or {d,a}key as corrupted. Any subsequent online operation against such a bad target, except discard, will fail with -DER_DATA_LOSS.
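
The intended semantics can be illustrated with a small standalone model. This is only a sketch, not the VOS implementation: every `toy_*` name, the `TOY_DER_DATA_LOSS` placeholder value, and the set of operations are hypothetical and chosen for illustration. The real signature of vos_obj_mark_corruption() is not shown here.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder for this sketch only; the real DER_DATA_LOSS value is
 * defined by DAOS and is not reproduced here. */
#define TOY_DER_DATA_LOSS (-1)

/* Hypothetical operation kinds; only "discard" is exempt from the check. */
enum toy_op { TOY_OP_FETCH, TOY_OP_UPDATE, TOY_OP_DISCARD };

/* Hypothetical stand-in for an object, dkey, or akey. */
struct toy_entry {
	bool corrupted;
};

/* Models marking the entry as corrupted (assumed behavior). */
static void toy_mark_corruption(struct toy_entry *e)
{
	e->corrupted = true;
}

/* Models an online operation: fails on a corrupted entry unless discarding. */
static int toy_access(const struct toy_entry *e, enum toy_op op)
{
	if (e->corrupted && op != TOY_OP_DISCARD)
		return TOY_DER_DATA_LOSS;
	return 0;
}

/* Walks through the behavior described above. */
static void toy_demo(void)
{
	struct toy_entry dkey = { .corrupted = false };

	assert(toy_access(&dkey, TOY_OP_FETCH) == 0);
	toy_mark_corruption(&dkey);
	assert(toy_access(&dkey, TOY_OP_FETCH) == TOY_DER_DATA_LOSS);
	assert(toy_access(&dkey, TOY_OP_UPDATE) == TOY_DER_DATA_LOSS);
	assert(toy_access(&dkey, TOY_OP_DISCARD) == 0);
}
```

In this model the flag is sticky: once set, only discard succeeds, which matches the description that the corrupted target must be removed to release its space.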

The user/admin can use ddb (rm) to remove the corrupted target and release the related space.

Steps for the author:

  • [ ] Commit message follows the guidelines.
  • [ ] Appropriate Features or Test-tag pragmas were used.
  • [ ] Appropriate Functional Test Stages were run.
  • [ ] At least two positive code reviews including at least one code owner from each category referenced in the PR.
  • [ ] Testing is complete. If necessary, forced-landing label added and a reason added in a comment.

After all prior steps are complete:

  • [ ] Gatekeeper requested (daos-gatekeeper added as a reviewer).

Nasf-Fan avatar Jun 20 '25 02:06 Nasf-Fan

Ticket title is 'Support to mark corrupted object or key'. Status is 'In Review'. https://daosio.atlassian.net/browse/DAOS-17637

github-actions[bot] avatar Jun 20 '25 02:06 github-actions[bot]

Test stage Unit Test with memcheck on EL 8.8 completed with status UNSTABLE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos//view/change-requests/job/PR-16529/2/testReport/

daosbuild3 avatar Jun 20 '25 03:06 daosbuild3

Test stage Functional Hardware Medium Verbs Provider MD on SSD completed with status UNSTABLE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos//view/change-requests/job/PR-16529/3/testReport/

daosbuild3 avatar Jun 21 '25 01:06 daosbuild3

EC_17/18 failed because of DAOS-17656, which is not related to this patch.

Nasf-Fan avatar Jun 24 '25 06:06 Nasf-Fan

Test stage NLT on EL 8.8 completed with status UNSTABLE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos//view/change-requests/job/PR-16529/4/testReport/

daosbuild3 avatar Jul 15 '25 06:07 daosbuild3

Test stage Test RPMs on EL 8.6 completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos/job/PR-16529/6/display/redirect

daosbuild3 avatar Jul 21 '25 03:07 daosbuild3

Test stage Test RPMs on EL 8.6 completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos/job/PR-16529/7/display/redirect

daosbuild3 avatar Jul 21 '25 07:07 daosbuild3

Test stage Test RPMs on EL 8.6 completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos/job/PR-16529/8/display/redirect

daosbuild3 avatar Jul 21 '25 11:07 daosbuild3

All required CI tests passed.

Nasf-Fan avatar Jul 28 '25 03:07 Nasf-Fan

Ping reviewers, thanks!

Nasf-Fan avatar Jul 31 '25 03:07 Nasf-Fan