DAOS-3916 container,md: query metadata access, modify times
With this change, daos_cont_open() and daos_cont_query() return the most recent metadata access and modify times in the daos_cont_info_t output argument. The new information is returned as hybrid logical clock (HLC) values. Container operations that only read the metadata state will update the access time only, while other operations will update both access and modify times. This is envisioned as a basic mechanism for a user to identify containers not used recently (from a metadata standpoint, not an IO standpoint).
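The update rule described above can be sketched in a few lines of C. This is only an illustration under stated assumptions: `struct md_times`, `toy_hlc_now()`, and `md_times_update()` are hypothetical names that do not come from the patch, and the counter-based clock merely stands in for a real HLC source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two metadata times (not the real daos_cont_info_t layout). */
struct md_times {
	uint64_t access_hlc; /* most recent metadata access */
	uint64_t modify_hlc; /* most recent metadata modification */
};

/* Toy clock: a strictly increasing counter standing in for a real HLC source. */
uint64_t toy_hlc_now(void)
{
	static uint64_t counter;

	return ++counter;
}

/* Read-only operations refresh the access time only; all others refresh both. */
void md_times_update(struct md_times *t, bool read_only)
{
	uint64_t now = toy_hlc_now();

	assert(t != NULL);
	t->access_hlc = now;
	if (!read_only)
		t->modify_hlc = now;
}
```

Under this rule, a container whose access time is far in the past has seen no metadata operation of any kind since then, which is the property the "not used recently" check relies on.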
Additionally, this change fixes container upgrade: the original code supported only the global version 0->1 upgrade and asserted on upgrades from layout version 1->2 and beyond.
A future patch is envisioned to provide a pool list containers interface with some filtering criteria, to find containers that may fit some user-determined criteria for migration or removal.
Changes summary
- container properties KVS includes new key/val for metadata times (using same DAOS_POOL_GLOBAL_VERSION=2 for master/release 2.4 dev).
- pool/container upgrade code changed to initialize metadata times.
- libdaos API minor version incremented (v2.3.0 -> v2.4.0)
- daos_cont_info_t.ci_pad, .ci_redun_lvl repurposed to make space for the new time fields while keeping the same structure size. (ci_redun_lvl is otherwise available through container properties).
- daos_test container tests modified to check redundancy level via property value rather than daos_cont_info_t field.
- daos utility output (both JSON and human-readable when run with -v) contains the new metadata time information.
- CaRT protocol for CONT_OPEN, CONT_OPEN_BYLABEL, and CONT_QUERY existing version (6) maintained, and new version (7) added that returns metadata access/modify times.
- engine code registers and handles protocol v6 or v7 RPCs.
- client code registers and uses only the new/v7 protocol. (Possible future change: client queries the engine, then registers v6 or v7.)
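
As a rough illustration of the v6/v7 split in the summary above, an engine-side handler could build its reply based on the protocol version the RPC arrived on. This is a minimal sketch, not the patch's code: `struct query_reply`, `engine_build_reply()`, and the `CONT_PROTO_*` constants are hypothetical names, and only the version numbers 6 and 7 come from the description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Version numbers taken from the summary; the enum names are hypothetical. */
enum { CONT_PROTO_V6 = 6, CONT_PROTO_V7 = 7 };

/* Hypothetical reply shape: v6 replies leave the time fields unset. */
struct query_reply {
	int      rc;           /* 0 on success, nonzero for an unsupported version */
	bool     has_md_times; /* true only for v7 replies */
	uint64_t access_hlc;
	uint64_t modify_hlc;
};

/* Build a reply for the given protocol version; only v7 carries metadata times. */
struct query_reply engine_build_reply(int proto_ver, uint64_t access_hlc,
				      uint64_t modify_hlc)
{
	struct query_reply out = {0};

	if (proto_ver != CONT_PROTO_V6 && proto_ver != CONT_PROTO_V7) {
		out.rc = -1; /* neither registered version */
		return out;
	}
	if (proto_ver == CONT_PROTO_V7) {
		out.has_md_times = true;
		out.access_hlc = access_hlc;
		out.modify_hlc = modify_hlc;
	}
	/* Invariant: the times are present exactly when the RPC used v7. */
	assert(out.has_md_times == (proto_ver == CONT_PROTO_V7));
	return out;
}
```

Keeping v6 registered on the engine lets older clients continue to work, while a client that speaks only v7 always receives the new fields.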
Required-githooks: true
Signed-off-by: Kenneth Cain [email protected]
Bug-tracker data: Ticket title is 'Add support for container query functionality (open/close/creation time)' Status is 'In Progress' Labels: 'Metadata' https://daosio.atlassian.net/browse/DAOS-3916
Test stage checkpatch completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-9985/1/execution/node/167/log
Test stage checkpatch completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-9985/2/execution/node/120/log
To be considered, as discussed with @johannlombardi today: the performance implications of this patch performing rdb updates of the metadata access time for every operation (especially query). That is, can we avoid rdb_tx_update in some instances?
We want to track open times for sure, so updating (what is currently described as) "atime" in the patch is needed. We could change the implementation so that container query only looks up (and does not modify) the metadata times, and likely do the same for all otherwise "read only" metadata operations. Of course, any of those operations is typically preceded by a container open that will update the rdb (handle index KVS).
Looking ahead to the subsequent patch envisioned (the preferred use case), it would be a pool list/filter containers API that would not require a container open handle, and should not require rdb updates.
Test stage checkpatch completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-9985/3/execution/node/138/log
Test stage Unit Test on EL 8 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-9985/3/execution/node/575/log
Test stage checkpatch completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-9985/4/execution/node/144/log
Clean test run, will open up for reviews now.
These failures from build 4 are unrelated to the patch. For the functional HW daos_test failure, I have restarted testing in that stage (now build 5) to get a clean run of that intermittently-failing test.
- checkpatch - failure is expected for anything that touches the CaRT RPC macros.
- codespell - seems to be affecting everything on master. A separate PR, 10031, landed to fix it after this PR was pushed for CI testing.
- Test Hardware / Functional Hardware Medium / POOL14: pool connect access based on ACL – FTEST_daos_test.DAOS_Pool - this seems like it could be related to the old intermittent test "dmg_helpers" code failure documented in DAOS-10301. A new ticket DAOS-11434 has been created since the code has recently been revised after fixing the first bug.
Test stage checkpatch completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-9985/6/execution/node/145/log
Test stage Functional Hardware Medium completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-9985/6/execution/node/965/log
In build 6 all tests passed. However, in Functional Hardware Medium, there is a log stating "ERROR: Detected one or more tests that failed archiving!" https://build.hpdd.intel.com/blue/rest/organizations/jenkins/pipelines/daos-stack/pipelines/daos/branches/PR-9985/runs/6/nodes/899/steps/965/log/?start=0
Unless I can find some reason this patch is responsible for the above, I think we should finish reviewing and request force landing.
@daos-stack/daos-gatekeeper when landing this PR can the text of the commit message in github be used rather than the commit message from the first push? It has been updated to be consistent with the code as it was changed during the review.