milvus
enhance: optimize CPU usage for CheckHealth requests
issue: #35563
- Use an internal health checker to monitor the cluster's health and store the latest state on the coordinator node. A CheckHealth request coming from the proxy is answered from this latest state, which improves cluster stability.
- Each health check assesses all collections and channels; detailed failure messages are saved temporarily in the latest state.
- Use the CheckHealth request instead of the heavy GetMetrics request on the querynode and datanode.
@jaime0815 E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.
/run-cpu-e2e
Codecov Report
Attention: Patch coverage is 86.36364% with 54 lines in your changes missing coverage. Please review.
Project coverage is 80.92%. Comparing base (9c8c1b3) to head (d6f6ebf). Report is 5 commits behind head on master.
Additional details and impacted files
@@ Coverage Diff @@
## master #35589 +/- ##
==========================================
+ Coverage 80.89% 80.92% +0.03%
==========================================
Files 1373 1374 +1
Lines 193162 193362 +200
==========================================
+ Hits 156264 156485 +221
+ Misses 31369 31361 -8
+ Partials 5529 5516 -13
| Components | Coverage Δ | |
|---|---|---|
| Client | 74.58% <ø> (ø) | |
| Core | 68.97% <ø> (ø) | |
| Go | 83.02% <86.36%> (+0.03%) | :arrow_up: |
| Files with missing lines | Coverage Δ | |
|---|---|---|
| internal/datacoord/server.go | 73.40% <100.00%> (+0.17%) | :arrow_up: |
| internal/datacoord/services.go | 85.49% <100.00%> (+0.03%) | :arrow_up: |
| internal/datacoord/util.go | 98.68% <100.00%> (ø) | |
| internal/datanode/metrics_info.go | 96.20% <100.00%> (ø) | |
| internal/datanode/services.go | 85.48% <100.00%> (+0.47%) | :arrow_up: |
| internal/distributed/datanode/client/client.go | 89.93% <100.00%> (+0.25%) | :arrow_up: |
| internal/distributed/datanode/service.go | 82.64% <100.00%> (+0.14%) | :arrow_up: |
| internal/distributed/querynode/client/client.go | 91.70% <100.00%> (+0.14%) | :arrow_up: |
| internal/distributed/querynode/service.go | 83.71% <100.00%> (+0.14%) | :arrow_up: |
| ...nternal/flushcommon/pipeline/flow_graph_manager.go | 92.07% <100.00%> (+0.87%) | :arrow_up: |
| ... and 19 more | | |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Rotten issues close after 30d of inactivity. Reopen the issue with /reopen.
@jaime0815 go-sdk check failed, comment rerun go-sdk can trigger the job again.
@jaime0815 cpp-unit-test check failed, comment rerun cpp-unit-test can trigger the job again.