DAOS-17338 object: optimize sgl handling to reduce iov buffer count

Open wangshilong opened this issue 10 months ago • 3 comments

This update modifies obj_sgls_dup and obj_dup_sgls_free to merge small or fragmented IOV buffers, reducing the scatter-gather list (SGL) IOV count and mitigating resource exhaustion during network bulk transfers.

  1. IOV Buffer Merging Logic: Buffers smaller than 64 bytes are now merged into a larger contiguous buffer. Runs of more than 64 sequential buffers, each ≤4KB, are consolidated into a single buffer to minimize the IOV count. The original data is copied into the merged buffer, so its content is preserved.

  2. Memory Optimization: If no merging is required, no additional memory is allocated. For write operations, a merge costs one memory allocation and one copy. For fetch operations, two copies are performed so that unread regions remain unmodified (critical for CI/test validation).
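The merge decision and the write-side copy described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `struct iov`, the macro names, `sgl_needs_merge` and `sgl_merge_for_write` are hypothetical, and the rule is simplified to apply to the whole list rather than to individual runs of buffers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for an IOV entry (hypothetical, not the DAOS type). */
struct iov {
	void   *buf;
	size_t  len;
};

/* Thresholds from the PR description; the macro names are hypothetical. */
#define SMALL_IOV_BYTES 64   /* an IOV below this size is "too small" */
#define MANY_IOV_COUNT  64   /* more entries than this is "too fragmented" */
#define MERGE_IOV_MAX   4096 /* only IOVs up to 4KB count as fragments */

/* Assumed rule: merge when any IOV is tiny, or when there are many IOVs
 * and every one of them is at most 4KB. A single IOV is never merged. */
static bool
sgl_needs_merge(const struct iov *iovs, size_t nr)
{
	bool all_small = true;

	if (nr <= 1)
		return false;
	for (size_t i = 0; i < nr; i++) {
		if (iovs[i].len < SMALL_IOV_BYTES)
			return true;
		if (iovs[i].len > MERGE_IOV_MAX)
			all_small = false;
	}
	return nr > MANY_IOV_COUNT && all_small;
}

/* Write path: one allocation and one copy when a merge is needed.
 * Returns 0 on success and -1 on allocation failure. When no merge is
 * needed, out->buf stays NULL and the caller keeps using the originals. */
static int
sgl_merge_for_write(const struct iov *iovs, size_t nr, struct iov *out)
{
	size_t total = 0;
	size_t off = 0;

	assert(out != NULL);
	out->buf = NULL;
	out->len = 0;
	if (!sgl_needs_merge(iovs, nr))
		return 0;

	for (size_t i = 0; i < nr; i++)
		total += iovs[i].len;
	if (total == 0)
		return 0;

	out->buf = malloc(total);
	if (out->buf == NULL)
		return -1;

	for (size_t i = 0; i < nr; i++) {
		if (iovs[i].len == 0)
			continue;
		memcpy((char *)out->buf + off, iovs[i].buf, iovs[i].len);
		off += iovs[i].len;
	}
	out->len = total;
	return 0;
}
```

In this sketch a caller would hand the single merged IOV to the bulk transfer in place of the original list and free `out->buf` afterwards. When `out->buf` is NULL the original list is used unchanged, which corresponds to the "no additional memory allocation" case.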

This change reduces SGL memory fragmentation and the IOV buffer count, improving bulk transfer efficiency. It also addresses edge cases where excessive IOV entries could exhaust network-layer resources.
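One plausible reading of the two-copy fetch path is sketched below: the caller's buffers are first gathered into a bounce buffer, the fetch fills part or all of it, and the whole bounce buffer is then scattered back. Because the bounce buffer starts as a copy of the caller's data, any region the fetch does not write returns unchanged. The names `struct iov`, `fetch_prepare` and `fetch_complete` are hypothetical, and this is an assumption about the mechanism, not the code in this PR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for an IOV entry (hypothetical, not the DAOS type). */
struct iov {
	void   *buf;
	size_t  len;
};

/* Copy 1: gather the caller's current buffer contents into one bounce
 * buffer before the fetch. Returns NULL if there is nothing to copy or
 * if the allocation fails; *total_out receives the combined length. */
static void *
fetch_prepare(const struct iov *iovs, size_t nr, size_t *total_out)
{
	size_t  total = 0;
	size_t  off = 0;
	char   *bounce;

	assert(total_out != NULL);
	for (size_t i = 0; i < nr; i++)
		total += iovs[i].len;
	*total_out = total;
	if (total == 0)
		return NULL;

	bounce = malloc(total);
	if (bounce == NULL)
		return NULL;

	for (size_t i = 0; i < nr; i++) {
		if (iovs[i].len == 0)
			continue;
		memcpy(bounce + off, iovs[i].buf, iovs[i].len);
		off += iovs[i].len;
	}
	return bounce;
}

/* Copy 2: scatter the bounce buffer back into the caller's buffers after
 * the fetch, then release it. Bytes the fetch never touched still hold
 * the caller's original data, so they are written back unchanged. */
static void
fetch_complete(const struct iov *iovs, size_t nr, void *bounce)
{
	size_t off = 0;

	for (size_t i = 0; i < nr; i++) {
		if (iovs[i].len == 0)
			continue;
		memcpy(iovs[i].buf, (char *)bounce + off, iovs[i].len);
		off += iovs[i].len;
	}
	free(bounce);
}
```

The extra gather costs a second copy compared with the write path, but it lets the scatter step copy the full length without tracking which bytes the fetch actually returned.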

Steps for the author:

  • [ ] Commit message follows the guidelines.
  • [ ] Appropriate Features or Test-tag pragmas were used.
  • [ ] Appropriate Functional Test Stages were run.
  • [ ] At least two positive code reviews including at least one code owner from each category referenced in the PR.
  • [ ] Testing is complete. If necessary, forced-landing label added and a reason added in a comment.

After all prior steps are complete:

  • [ ] Gatekeeper requested (daos-gatekeeper added as a reviewer).

wangshilong • Apr 16 '25 14:04

Ticket title is 'Copy data buffer for unfriendly I/O (too fragmented, too small fragment)'. Status is 'In Review'. Job should run at elevated priority (1). https://daosio.atlassian.net/browse/DAOS-17338

github-actions[bot] • Apr 16 '25 14:04

Test stage Functional on EL 8.8 completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos/job/PR-16268/2/display/redirect

daosbuild3 • Jun 03 '25 01:06

Test stage Functional Hardware Large completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-16268/6/execution/node/1335/log

daosbuild3 • Jun 25 '25 20:06

Test stage Functional Hardware Large completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-16268/7/execution/node/1342/log

daosbuild3 • Jul 01 '25 14:07