DAOS-15794 tools: Add --health-only flag to (dmg|daos) pool query

Open mjmac opened this issue 1 year ago • 12 comments

As a convenience, provide a "streamlined" version of the pool query that only performs the minimum amount of work to query the pool's health. Practically speaking, this means that it will query for disabled ranks and omit the space query, which is expensive.

Features: pool
Required-githooks: true
Change-Id: I5abe26c2a9a449a9d7c9c0867ae1fff1de9685d5
Signed-off-by: Michael MacDonald [email protected]

mjmac avatar May 02 '24 20:05 mjmac

Ticket title is 'Add --health-only flag to (dmg|daos) pool query commands'
Status is 'In Review'
https://daosio.atlassian.net/browse/DAOS-15794

github-actions[bot] avatar May 02 '24 20:05 github-actions[bot]

@kccain, @knard-intel: No rush, but when you have some time, I'd appreciate some early feedback on the approach here. Pinging you guys because you've both worked on the pool query stuff.

The changes in this PR work nicely in my local testing; we'll see how it does in CI. Code-wise, I think this is much cleaner and continues the work of providing a proper Go API for libdaos. The main thing that I'm not sure about is adding the new query bit for disabled ranks. My thinking is that there seems to be a desire to support querying for both, eventually. It also is much more flexible and future-proof to send a query options bitmask instead of expanding protobuf messages with more fields in order to implement new query types.
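To illustrate the bitmask idea, here is a minimal Go sketch. It is not the code from this PR: `QueryOption`, `QueryMask`, `NewMask`, `HealthOnlyMask`, and the option names are all hypothetical, chosen only to show how a single mask field can select query parts without adding a new message field per query type. It assumes a health-only query requests just the disabled ranks and leaves the space query out.

```go
package main

import (
	"fmt"
	"strings"
)

// QueryOption is a hypothetical bit flag selecting one part of a pool query.
// These names are illustrative and are not the identifiers used in DAOS.
type QueryOption uint64

const (
	QuerySpace         QueryOption = 1 << iota // space usage (the expensive part)
	QueryRebuild                               // rebuild status
	QueryEnabledRanks                          // list of enabled ranks
	QueryDisabledRanks                         // list of disabled ranks
)

// optionNames fixes the order used when rendering a mask as text.
var optionNames = []struct {
	opt  QueryOption
	name string
}{
	{QuerySpace, "space"},
	{QueryRebuild, "rebuild"},
	{QueryEnabledRanks, "enabled_ranks"},
	{QueryDisabledRanks, "disabled_ranks"},
}

// QueryMask is the single value sent with a request; adding a new query
// type means adding a bit, not a new request field.
type QueryMask uint64

// NewMask ORs the given options together into one mask.
func NewMask(opts ...QueryOption) QueryMask {
	var m QueryMask
	for _, o := range opts {
		m |= QueryMask(o)
	}
	return m
}

// Has reports whether the given option bit is set in the mask.
func (m QueryMask) Has(opt QueryOption) bool {
	return m&QueryMask(opt) != 0
}

// String renders the set options as a comma-separated list.
func (m QueryMask) String() string {
	var set []string
	for _, on := range optionNames {
		if m.Has(on.opt) {
			set = append(set, on.name)
		}
	}
	if len(set) == 0 {
		return "none"
	}
	return strings.Join(set, ",")
}

// HealthOnlyMask is the assumed "streamlined" selection: disabled ranks
// only, with the space query omitted.
func HealthOnlyMask() QueryMask {
	return NewMask(QueryDisabledRanks)
}

func main() {
	fmt.Println("health-only:", HealthOnlyMask())
	fmt.Println("space + both rank lists:",
		NewMask(QuerySpace, QueryEnabledRanks, QueryDisabledRanks))
}
```

With this shape, supporting enabled and disabled ranks in the same request is just a matter of setting both bits, and the receiver can skip any work whose bit is clear.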

mjmac avatar May 02 '24 20:05 mjmac

Test stage Unit Test on EL 8.8 completed with status UNSTABLE. https://build.hpdd.intel.com/job/daos-stack/job/daos//view/change-requests/job/PR-14297/1/testReport/

daosbuild1 avatar May 02 '24 20:05 daosbuild1

Test stage Unit Test with memcheck on EL 8.8 completed with status UNSTABLE. https://build.hpdd.intel.com/job/daos-stack/job/daos//view/change-requests/job/PR-14297/1/testReport/

daosbuild1 avatar May 02 '24 21:05 daosbuild1

Test stage Unit Test with memcheck on EL 8.8 completed with status UNSTABLE. https://build.hpdd.intel.com/job/daos-stack/job/daos//view/change-requests/job/PR-14297/2/testReport/

daosbuild1 avatar May 02 '24 22:05 daosbuild1

Test stage Functional Hardware Medium completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14297/4/execution/node/1409/log

daosbuild1 avatar May 03 '24 18:05 daosbuild1

Test stage Functional on EL 8.8 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14297/5/execution/node/1180/log

daosbuild1 avatar May 05 '24 06:05 daosbuild1

Test stage Functional on EL 8.8 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14297/6/execution/node/1152/log

daosbuild1 avatar May 06 '24 11:05 daosbuild1

FYI - #14317 should fix the heap of errors in https://build.hpdd.intel.com/blue/organizations/jenkins/daos-stack%2Fdaos/detail/PR-14297/6/pipeline/

daltonbohning avatar May 06 '24 16:05 daltonbohning

Test stage Functional Hardware Medium completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14297/7/execution/node/1454/log

daosbuild1 avatar May 08 '24 01:05 daosbuild1

Test stage Functional Hardware Medium completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14297/10/execution/node/1409/log

daosbuild1 avatar May 09 '24 15:05 daosbuild1

Last force-push to rebase on master. Will open for formal reviews once the PR passes basic testing again. The last test run with Features: pool passed all tests except for pool/verify_space.py. Looking at the test log, it appears that the test itself succeeded but then failed in teardown while trying to destroy a container. Might be a recurrence of DAOS-15798, but with container 6 instead of 5? Dunno, but I'm reasonably certain it's not due to this change.

mjmac avatar May 09 '24 18:05 mjmac