OpenSearch
[Remote Publication] Add remote download stats
Description
This PR adds remote download stats for remote publication.
Sample stats on data node:
"cluster_state_stats": {
  "overall": {
    "update_count": 0,
    "total_time_in_millis": 0,
    "failed_count": 0
  },
  "remote_download": {
    "success_count": 1,
    "failed_count": 0,
    "total_time_in_millis": 4,
    "full_download": 1,
    "diff_download": 0
  }
}
Sample stats on master node:
"cluster_state_stats": {
  "overall": {
    "update_count": 3,
    "total_time_in_millis": 192,
    "failed_count": 0
  },
  "remote_upload": {
    "success_count": 3,
    "failed_count": 0,
    "total_time_in_millis": 86,
    "indices_routing_diff_files_cleanup_attempt_failed_count": 0,
    "index_routing_files_cleanup_attempt_failed_count": 0,
    "cleanup_attempt_failed_count": 0
  },
  "remote_download": {
    "success_count": 0,
    "failed_count": 0,
    "total_time_in_millis": 0,
    "full_download": 0,
    "diff_download": 0
  }
}
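For readers who want a feel for how counters like these could be accumulated, here is a minimal sketch. It is not the class added by this PR: `RemoteDownloadCounters`, `recordSuccess`, `recordFailure`, and `toJson` are hypothetical names, and the rule that each successful download increments exactly one of `full_download` or `diff_download` is an assumption inferred from the sample output above.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical holder: field names mirror the JSON keys in the sample stats,
// not the actual classes touched by this PR.
class RemoteDownloadCounters {
    private final AtomicLong successCount = new AtomicLong();
    private final AtomicLong failedCount = new AtomicLong();
    private final AtomicLong totalTimeInMillis = new AtomicLong();
    private final AtomicLong fullDownload = new AtomicLong();
    private final AtomicLong diffDownload = new AtomicLong();

    // Assumption: a successful download is either a full state or a diff, never both.
    void recordSuccess(long tookMillis, boolean fullState) {
        successCount.incrementAndGet();
        totalTimeInMillis.addAndGet(tookMillis);
        if (fullState) {
            fullDownload.incrementAndGet();
        } else {
            diffDownload.incrementAndGet();
        }
    }

    void recordFailure() {
        failedCount.incrementAndGet();
    }

    // Renders the counters in the same key order as the sample output.
    String toJson() {
        return "\"remote_download\":{"
            + "\"success_count\":" + successCount.get() + ","
            + "\"failed_count\":" + failedCount.get() + ","
            + "\"total_time_in_millis\":" + totalTimeInMillis.get() + ","
            + "\"full_download\":" + fullDownload.get() + ","
            + "\"diff_download\":" + diffDownload.get() + "}";
    }

    public static void main(String[] args) {
        RemoteDownloadCounters counters = new RemoteDownloadCounters();
        counters.recordSuccess(4, true); // one full download that took 4 ms
        System.out.println(counters.toJson());
    }
}
```

With one full download of 4 ms recorded, the rendered values match the data node sample above.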
Related Issues
Resolves #[Issue number to be closed when this PR is merged]
Check List
- [ ] Functionality includes testing.
- [ ] API changes companion pull request created, if applicable.
- [ ] Public documentation issue/PR created, if applicable.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. For more information on following Developer Certificate of Origin and signing off your commits, please check here.
:x: Gradle check result for 268258a707f732492f54dff9935a43aff4043ba4: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
:x: Gradle check result for 52aaa2f1d8a2123028c6b0ffc32a07239c307108: FAILURE
:x: Gradle check result for 5d1f7cbb6be379aeac9cfce5bb78a656ef4fc8d8: FAILURE
"indices_routing_diff_files_cleanup_attempt_failed_count": 0,
"index_routing_files_cleanup_attempt_failed_count": 0,
The above stats look out of place.
Agreed, these were added as part of PRs #13909 and #14684. Should we create an issue to track this? @Bukhtawar
:x: Gradle check result for 4791e2b94ed0592cf72b68128215b0911a65e9ce: FAILURE
:x: Gradle check result for 5994e9c1f7937addcb7f0b1750f7f802e8886966: FAILURE
:x: Gradle check result for f6d1b5a8a13a31318081f18acd9c91f7f63ce7fe: FAILURE
:x: Gradle check result for 28ab45470328be21fd7a13bdb8582951fbd82fa2: FAILURE
:x: Gradle check result for faaece8d1d673346e328ba5c1eca81cb8bdeff74: FAILURE
:x: Gradle check result for b8f74eedc5976da036d6d1efd494f3c0b7412401: FAILURE
:x: Gradle check result for 0caf5d0240360b97561d14f106f4ad56ad510199: FAILURE
:x: Gradle check result for a361d4837f76287d5bf492ef3e7af28fdd878f1e: FAILURE
:x: Gradle check result for 42277f617ad3e9c457b358247e1a34129d22d3bd: FAILURE
:grey_exclamation: Gradle check result for d6bca95e1313a432c55b0698383587cc98950b1d: UNSTABLE
- TEST FAILURES:
3 org.opensearch.cluster.MinimumClusterManagerNodesIT.testThreeNodesNoClusterManagerBlock
Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.
Codecov Report
Attention: Patch coverage is 69.76744% with 39 lines in your changes missing coverage. Please review.
Project coverage is 72.07%. Comparing base (758c2aa) to head (a1a6b82). Report is 2 commits behind head on main.
Additional details and impacted files
@@             Coverage Diff              @@
##               main   #15291      +/-   ##
============================================
+ Coverage     72.02%   72.07%    +0.05%
- Complexity    63769    63844       +75
============================================
  Files          5249     5250        +1
  Lines        297795   297859       +64
  Branches      43034    43038        +4
============================================
+ Hits         214480   214687      +207
+ Misses        65735    65613      -122
+ Partials      17580    17559       -21
Looks good
:x: Gradle check result for bfd119a38e45401c5347ddb616542eb28b9f8f80: FAILURE
:white_check_mark: Gradle check result for 0cd57fb97dcdfe405849e5098847061cc5306a95: SUCCESS
:grey_exclamation: Gradle check result for 09c8e2ab8335759120457bb5ca2ddc967bd6193c: UNSTABLE
How are we handling API BWC with the new stats?
In ClusterStateStats the remote upload/download stats are stored in the list of PersistedStateStats:
public void writeTo(StreamOutput out) throws IOException {
    out.writeVLong(updateSuccess.get());
    out.writeVLong(updateTotalTimeInMillis.get());
    out.writeVLong(updateFailed.get());
    // Write the number of entries first so the reader knows how many to consume.
    out.writeVInt(persistenceStats.size());
    for (PersistedStateStats stats : persistenceStats) {
        stats.writeTo(out);
    }
}
For the download stats we have added two more elements to the list. writeTo/readFrom are written to support a variable-length list of PersistedStateStats, so BWC is handled for our stats.
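To illustrate why a size-prefixed list tolerates a different number of entries, here is a minimal, self-contained sketch. It uses `java.io.DataOutputStream`/`DataInputStream` as stand-ins for OpenSearch's `StreamOutput`/`StreamInput`, and `StatsEntry` and `StatsListRoundTrip` are hypothetical names, not classes from this PR. It only demonstrates the length-prefix idea; whether every node version can also interpret each individual entry is a separate question that this sketch does not cover.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for one PersistedStateStats entry; not the OpenSearch class.
final class StatsEntry {
    final String name;
    final long successCount;

    StatsEntry(String name, long successCount) {
        this.name = name;
        this.successCount = successCount;
    }
}

class StatsListRoundTrip {
    // Writes the list size first, then each entry, mirroring the shape of writeTo above.
    static byte[] write(List<StatsEntry> entries) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(entries.size());
            for (StatsEntry entry : entries) {
                out.writeUTF(entry.name);
                out.writeLong(entry.successCount);
            }
        }
        return bytes.toByteArray();
    }

    // Reads exactly as many entries as the writer announced, whatever that number is.
    static List<StatsEntry> read(byte[] data) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int size = in.readInt();
            List<StatsEntry> entries = new ArrayList<>(size);
            for (int i = 0; i < size; i++) {
                entries.add(new StatsEntry(in.readUTF(), in.readLong()));
            }
            return entries;
        }
    }

    public static void main(String[] args) throws IOException {
        List<StatsEntry> shorter = List.of(new StatsEntry("remote_upload", 3));
        List<StatsEntry> longer = List.of(
            new StatsEntry("remote_upload", 3),
            new StatsEntry("remote_download", 1)
        );
        // The same read method handles both list lengths.
        System.out.println(read(write(shorter)).size());
        System.out.println(read(write(longer)).size());
    }
}
```

The reader never hard-codes how many stats entries exist, so a sender with a longer list does not leave unread bytes on the stream.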
:x: Gradle check result for e319bb2aae2d5c24dc7a0c11dd3b18263ed1e298: FAILURE
The BWC tests are failing in all PRs: https://build.ci.opensearch.org/job/gradle-check/46059/testReport/
Opened issue for streamlining the cleanup stats: #15556
:x: Gradle check result for 3d440bba43707d45c1363c880121bc7b0cf314d2: FAILURE
:grey_exclamation: Gradle check result for 039ad00136979413685da82ddab834a9954349a8: UNSTABLE
:x: Gradle check result for b9125bf3525c85a7ce98cb80f55a967789968d99: FAILURE
:x: Gradle check result for e6bf8db3c670e8d504b2d5ef0b3f90c14b3a6f12: FAILURE
:white_check_mark: Gradle check result for a1a6b82542aa91444db359acef4fb72f1104e069: SUCCESS
The backport to 2.x failed:
The process '/usr/bin/git' failed with exit code 128
To backport manually, run these commands in your terminal:
# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/OpenSearch/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/OpenSearch/backport-2.x
# Create a new branch
git switch --create backport/backport-15291-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 b54e867da0e313513e90872e039717b7595cf6e4
# Push it to GitHub
git push --set-upstream origin backport/backport-15291-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/OpenSearch/backport-2.x
Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-15291-to-2.x.