DAOS-8331 client: Export client metrics via agent
Adds new agent config parameters and code to optionally export client metrics in Prometheus format.
Example daos_agent.yml updates:
    telemetry_port: 9192   # export on port 9192
    telemetry_retain: 5m   # retain metrics for 5 minutes after client exit
Change-Id: I77864682cc19fa4c33f326d879e20704ef57a7ea
Required-githooks: true
Signed-off-by: Michael MacDonald [email protected]
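For reviewers who want to poke at the exported data, here is a minimal sketch of reading the agent's endpoint from Python. It is not part of this patch. The `/metrics` path, the `localhost` host, and the helper names `parse_prometheus_text` and `scrape` are assumptions for illustration; only the port (9192) comes from the example config above. The parser handles plain `name{labels} value` lines and does not handle optional trailing timestamps.

```python
import urllib.request


def parse_prometheus_text(text):
    """Parse Prometheus text-format lines into {"name{labels}": float}.

    Sketch only: assumes each sample line ends with its value and
    carries no trailing timestamp.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # Split on the last space so spaces inside label values are kept.
        name, _, value = line.rpartition(" ")
        samples[name] = float(value)
    return samples


def scrape(host="localhost", port=9192, path="/metrics"):
    """Fetch and parse metrics; host and path are assumed, not from the patch."""
    url = f"http://{host}:{port}{path}"
    with urllib.request.urlopen(url, timeout=5) as resp:
        return parse_prometheus_text(resp.read().decode("utf-8"))
```

With an agent running and `telemetry_port: 9192` set, calling `scrape()` should return a dict keyed by metric name plus label set. The actual metric names depend on what the client exports and are still changing.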
Bug-tracker data: Ticket title is 'Client side metrics/stats support for DAOS' Status is 'Awaiting Verification' Labels: 'HPE' https://daosio.atlassian.net/browse/DAOS-8331
Test stage Functional Hardware Medium completed with status FAILURE. https://build.hpdd.intel.com/job/daos-stack/job/daos/job/PR-13545/6/display/redirect
Should there be new entries added to utils/config/daos_agent.yml?
Yes, good catch. I forgot about those.
Also, could I ask for this small change? It would allow functional tests, specifically the performance tests, to set the config: 578d907. And since it's unused, no special testing is needed.
I'll merge that in, thanks. Actually, I may try to add a ftest for this work, so that change makes it even easier.
Test stage Build on Leap 15.5 with Intel-C and TARGET_PREFIX completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-13545/10/execution/node/284/log
Test stage Build RPM on Leap 15.5 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-13545/10/execution/node/279/log
Test stage Build RPM on EL 9 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-13545/10/execution/node/366/log
Test stage Build RPM on EL 8 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-13545/10/execution/node/361/log
Just refreshed this patch. I did add the agent_utils_params.py changes. I have not gotten to adding the ftest yet. As these metrics are still somewhat of a WIP, IMO it's premature to add tests that are expecting fixed sets of metrics while we're iterating. I agree with @wangdi1 that we should add the ftest later.
Test stage Build on EL 8 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-13545/10/execution/node/480/log
Test stage Build DEB on Ubuntu 20.04 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-13545/10/execution/node/347/log
Functional on EL 9 Test Results (old)
135 tests: 131 passed, 4 skipped, 0 failed. 41 suites, 41 files. Duration: 1h 31m 39s.
Results for commit 98945324.
:recycle: This comment has been updated with latest results.
Functional on EL 8.8 Test Results (old)
135 tests: 131 passed, 4 skipped, 0 failed. 41 suites, 41 files. Duration: 1h 29m 5s.
Results for commit 98945324.
Functional Hardware Medium Test Results (old)
130 tests: 104 passed, 26 skipped, 0 failed. 34 suites, 34 files. Duration: 2h 9m 52s.
Results for commit 98945324.
Functional Hardware Medium Verbs Provider Test Results (old)
55 tests: 54 passed, 1 skipped, 0 failed. 7 suites, 7 files. Duration: 4h 7m 31s.
Results for commit 98945324.
Functional Hardware Large Test Results (old)
64 tests: 64 passed, 0 skipped, 0 failed. 14 suites, 14 files. Duration: 28m 42s.
Results for commit 98945324.
Requesting early reviews while waiting for the base patch to land, TIA.
Closed in favor of the approach in #14030.