HBASE-27316 Time-based metrics are reset after any GET request
JMX metrics can be queried over HTTP.
However, the time-based metrics are reset after every GET request.
The root cause appears to be the implementation of Histogram's "snapshot" method:
public Snapshot snapshot() { return histogram.snapshotAndReset(); }
It takes the snapshot and resets the histogram in a single call.
I think it should not reset, because we may need the historical metrics.
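To make the difference concrete, here is a minimal sketch of the two snapshot styles. ToyHistogram and its methods are hypothetical and are not HBase's FastLongHistogram; the sketch only assumes that the op count is kept cumulatively while min/max belong to the current window, and the sample values are arbitrary.

```java
import java.util.Arrays;

/** Toy model (hypothetical, not HBase code) of snapshot() vs snapshotAndReset(). */
final class SnapshotDemo {
  static final class ToyHistogram {
    private long totalCount = 0;               // cumulative, never reset
    private long windowMin = Long.MAX_VALUE;   // reset by snapshotAndReset()
    private long windowMax = Long.MIN_VALUE;   // reset by snapshotAndReset()

    synchronized void update(long value) {
      totalCount++;
      windowMin = Math.min(windowMin, value);
      windowMax = Math.max(windowMax, value);
    }

    /** Read-only snapshot: returns {count, min, max} and keeps the window. */
    synchronized long[] snapshot() {
      return new long[] { totalCount, windowMin, windowMax };
    }

    /** Snapshot that also starts a new window, so min/max forget older values. */
    synchronized long[] snapshotAndReset() {
      long[] s = snapshot();
      windowMin = Long.MAX_VALUE;
      windowMax = Long.MIN_VALUE;
      return s;
    }
  }

  public static void main(String[] args) {
    ToyHistogram h = new ToyHistogram();
    h.update(1201);
    h.update(4614);
    long[] first = h.snapshotAndReset();   // e.g. the first HTTP GET
    h.update(3932);
    long[] second = h.snapshotAndReset();  // e.g. the next HTTP GET
    System.out.println(Arrays.toString(first));
    System.out.println(Arrays.toString(second));
  }
}
```

In this toy model the count keeps growing across reads, while min and max only reflect values recorded since the previous read.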
:broken_heart: -1 overall
| Vote | Subsystem | Runtime | Comment |
|---|---|---|---|
| +0 :ok: | reexec | 1m 8s | Docker mode activated. |
| -0 :warning: | yetus | 0m 3s | Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck |
| _ Prechecks _ | |||
| _ master Compile Tests _ | |||
| +1 :green_heart: | mvninstall | 2m 33s | master passed |
| +1 :green_heart: | compile | 0m 11s | master passed |
| +1 :green_heart: | shadedjars | 3m 49s | branch has no errors when building our shaded downstream artifacts. |
| +1 :green_heart: | javadoc | 0m 12s | master passed |
| _ Patch Compile Tests _ | |||
| +1 :green_heart: | mvninstall | 2m 19s | the patch passed |
| +1 :green_heart: | compile | 0m 11s | the patch passed |
| +1 :green_heart: | javac | 0m 11s | the patch passed |
| +1 :green_heart: | shadedjars | 3m 45s | patch has no errors when building our shaded downstream artifacts. |
| +1 :green_heart: | javadoc | 0m 10s | the patch passed |
| _ Other Tests _ | |||
| -1 :x: | unit | 0m 16s | hbase-metrics in the patch failed. |
| | | 15m 40s | |
| Subsystem | Report/Notes |
|---|---|
| Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-4719/1/artifact/yetus-jdk8-hadoop3-check/output/Dockerfile |
| GITHUB PR | https://github.com/apache/hbase/pull/4719 |
| Optional Tests | javac javadoc unit shadedjars compile |
| uname | Linux 4d3552af81d5 5.4.0-122-generic #138-Ubuntu SMP Wed Jun 22 15:00:31 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / 950ad8dd3e |
| Default Java | AdoptOpenJDK-1.8.0_282-b08 |
| unit | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-4719/1/artifact/yetus-jdk8-hadoop3-check/output/patch-unit-hbase-metrics.txt |
| Test Results | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-4719/1/testReport/ |
| Max. process+thread count | 154 (vs. ulimit of 30000) |
| modules | C: hbase-metrics U: hbase-metrics |
| Console output | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-4719/1/console |
| versions | git=2.17.1 maven=3.6.3 |
| Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |
This message was automatically generated.
:broken_heart: -1 overall
| Vote | Subsystem | Runtime | Comment |
|---|---|---|---|
| +0 :ok: | reexec | 0m 40s | Docker mode activated. |
| -0 :warning: | yetus | 0m 3s | Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck |
| _ Prechecks _ | |||
| _ master Compile Tests _ | |||
| +1 :green_heart: | mvninstall | 3m 3s | master passed |
| +1 :green_heart: | compile | 0m 10s | master passed |
| +1 :green_heart: | shadedjars | 4m 2s | branch has no errors when building our shaded downstream artifacts. |
| +1 :green_heart: | javadoc | 0m 11s | master passed |
| _ Patch Compile Tests _ | |||
| +1 :green_heart: | mvninstall | 2m 42s | the patch passed |
| +1 :green_heart: | compile | 0m 10s | the patch passed |
| +1 :green_heart: | javac | 0m 10s | the patch passed |
| +1 :green_heart: | shadedjars | 4m 1s | patch has no errors when building our shaded downstream artifacts. |
| +1 :green_heart: | javadoc | 0m 10s | the patch passed |
| _ Other Tests _ | |||
| -1 :x: | unit | 0m 17s | hbase-metrics in the patch failed. |
| | | 16m 26s | |
| Subsystem | Report/Notes |
|---|---|
| Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-4719/1/artifact/yetus-jdk11-hadoop3-check/output/Dockerfile |
| GITHUB PR | https://github.com/apache/hbase/pull/4719 |
| Optional Tests | javac javadoc unit shadedjars compile |
| uname | Linux 2a1a76c592b8 5.4.0-1081-aws #88~18.04.1-Ubuntu SMP Thu Jun 23 16:29:17 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / 950ad8dd3e |
| Default Java | AdoptOpenJDK-11.0.10+9 |
| unit | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-4719/1/artifact/yetus-jdk11-hadoop3-check/output/patch-unit-hbase-metrics.txt |
| Test Results | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-4719/1/testReport/ |
| Max. process+thread count | 156 (vs. ulimit of 30000) |
| modules | C: hbase-metrics U: hbase-metrics |
| Console output | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-4719/1/console |
| versions | git=2.17.1 maven=3.6.3 |
| Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |
This message was automatically generated.
:confetti_ball: +1 overall
| Vote | Subsystem | Runtime | Comment |
|---|---|---|---|
| +0 :ok: | reexec | 1m 1s | Docker mode activated. |
| _ Prechecks _ | |||
| +1 :green_heart: | dupname | 0m 0s | No case conflicting files found. |
| +1 :green_heart: | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
| _ master Compile Tests _ | |||
| +1 :green_heart: | mvninstall | 2m 27s | master passed |
| +1 :green_heart: | compile | 0m 17s | master passed |
| +1 :green_heart: | checkstyle | 0m 8s | master passed |
| +1 :green_heart: | spotless | 0m 41s | branch has no errors when running spotless:check. |
| +1 :green_heart: | spotbugs | 0m 19s | master passed |
| _ Patch Compile Tests _ | |||
| +1 :green_heart: | mvninstall | 2m 16s | the patch passed |
| +1 :green_heart: | compile | 0m 15s | the patch passed |
| +1 :green_heart: | javac | 0m 15s | the patch passed |
| +1 :green_heart: | checkstyle | 0m 7s | the patch passed |
| +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 :green_heart: | hadoopcheck | 8m 5s | Patch does not cause any errors with Hadoop 3.2.4 3.3.4. |
| +1 :green_heart: | spotless | 0m 39s | patch has no errors when running spotless:check. |
| +1 :green_heart: | spotbugs | 0m 22s | the patch passed |
| _ Other Tests _ | |||
| +1 :green_heart: | asflicense | 0m 10s | The patch does not generate ASF License warnings. |
| | | 22m 4s | |
| Subsystem | Report/Notes |
|---|---|
| Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-4719/1/artifact/yetus-general-check/output/Dockerfile |
| GITHUB PR | https://github.com/apache/hbase/pull/4719 |
| Optional Tests | dupname asflicense javac spotbugs hadoopcheck hbaseanti spotless checkstyle compile |
| uname | Linux 4fb85b3c2d41 5.4.0-122-generic #138-Ubuntu SMP Wed Jun 22 15:00:31 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / 950ad8dd3e |
| Default Java | AdoptOpenJDK-1.8.0_282-b08 |
| Max. process+thread count | 69 (vs. ulimit of 30000) |
| modules | C: hbase-metrics U: hbase-metrics |
| Console output | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-4719/1/console |
| versions | git=2.17.1 maven=3.6.3 spotbugs=4.2.2 |
| Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |
This message was automatically generated.
I can't figure out why the snapshot is reset after every GET request. The op count keeps growing while the quantiles are reset.
I'll fix the test error if this patch is needed.
First JMX query:
"FlushTime_num_ops": 11142,
"FlushTime_min": 1201,
"FlushTime_max": 4614,
"FlushTime_mean": 2437,
"FlushTime_25th_percentile": 1344,
"FlushTime_median": 2383,
"FlushTime_75th_percentile": 2647,
"FlushTime_90th_percentile": 4607,
"FlushTime_95th_percentile": 4610,
"FlushTime_98th_percentile": 4612,
"FlushTime_99th_percentile": 4613,
"FlushTime_99.9th_percentile": 4613,
"FlushTime_TimeRangeCount_1000-3000": 4,
"FlushTime_TimeRangeCount_3000-10000": 1,
Next query:
"FlushTime_num_ops": 11191,
"FlushTime_min": 534,
"FlushTime_max": 3932,
"FlushTime_mean": 2835,
"FlushTime_25th_percentile": 2676,
"FlushTime_median": 3104,
"FlushTime_75th_percentile": 3531,
"FlushTime_90th_percentile": 3913,
"FlushTime_95th_percentile": 3922,
"FlushTime_98th_percentile": 3928,
"FlushTime_99th_percentile": 3930,
"FlushTime_99.9th_percentile": 3931,
"FlushTime_TimeRangeCount_300-1000": 1,
"FlushTime_TimeRangeCount_1000-3000": 2,
"FlushTime_TimeRangeCount_3000-10000": 4,
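Some metrics, such as FlushTime_max and FlushTime_TimeRangeCount_1000-3000, are reset between the two queries (FlushTime_max drops from 4614 to 3932 and the 1000-3000 range count from 4 to 2), while FlushTime_num_ops keeps growing.
@bbeaudreault I believe we are interested in this change.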
Yeah, I think there are a couple of problems with this approach:
- The histogram is initially created with some generic buckets that are used to build the distribution, which is then used to calculate the percentiles. Those generic buckets get more accurate over time, because when you snapshotAndReset, the boundaries of the old bins are used to adjust the new bins. Take a look at the Bins constructor, which is called in the snapshotAndReset method. I would expect that if we never call snapshotAndReset, we'll have less accurate percentiles, especially for outliers. This seems problematic given that the typical use case is looking at the 99th and 99.9th percentiles.
- The FastLongHistogram has a few usages outside of JMX metrics. I haven't audited them, but we should be sure that any change here will not adversely affect the expectations of those usages.
We'll need to address those issues in a way that still achieves the goal.
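As a rough illustration of the first point, the sketch below shows how bins rebuilt from the previous window's observed range end up much narrower than the initial generic ones. ToyBins is hypothetical and deliberately simplified; the real Bins constructor in FastLongHistogram may derive its bounds differently, and the initial range and sample values here are arbitrary assumptions.

```java
/** Toy model (hypothetical, not HBase code) of bins that adapt on each reset. */
final class AdaptiveBinsDemo {
  static final class ToyBins {
    final long lo;                     // inclusive lower bound of the binned range
    final long hi;                     // exclusive upper bound of the binned range
    final long[] counts;               // equal-width bins over [lo, hi)
    long seenMin = Long.MAX_VALUE;     // smallest value recorded in this window
    long seenMax = Long.MIN_VALUE;     // largest value recorded in this window

    /** Generic bins over a fixed, assumed range. */
    ToyBins(long lo, long hi, int numBins) {
      this.lo = lo;
      this.hi = Math.max(hi, lo + numBins); // keep every bin at least 1 wide
      this.counts = new long[numBins];
    }

    /** New bins whose range comes from what the previous window observed. */
    ToyBins(ToyBins last) {
      this(last.seenMin <= last.seenMax ? last.seenMin : last.lo,
           last.seenMin <= last.seenMax ? last.seenMax + 1 : last.hi,
           last.counts.length);
    }

    void add(long value) {
      seenMin = Math.min(seenMin, value);
      seenMax = Math.max(seenMax, value);
      // Out-of-range values are clamped into the first or last bin.
      long clamped = Math.min(Math.max(value, lo), hi - 1);
      int idx = (int) ((clamped - lo) * counts.length / (hi - lo));
      counts[idx]++;
    }

    double binWidth() {
      return (double) (hi - lo) / counts.length;
    }
  }

  public static void main(String[] args) {
    ToyBins generic = new ToyBins(0, 100_000, 100);
    for (long v : new long[] { 1201, 2383, 4614 }) {
      generic.add(v);
    }
    ToyBins adapted = new ToyBins(generic); // what a reset would hand to the next window
    System.out.println("generic bin width: " + generic.binWidth());
    System.out.println("adapted bin width: " + adapted.binWidth());
  }
}
```

With wide generic bins, all three sample values fall into only a few buckets, so percentile estimates are coarse; after one adaptation the same range is split far more finely. A fix that stops resetting would need some other way to keep this refinement.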