provides stats operators and supporting utility
This PR introduces a Stats utility that allows stats reporting to a logger
Motivation
Usually, it is challenging to analyze what is going on within your data pipeline. Metrics are useful to get general "performance" insights for your stream. Debug hooks allow detecting the root cause operator. However, there is a gap in detecting general issues, like timeouts, which operator underperform, etc.
Solution
In order to simplify the process of the mentioned problems detections, this PR introduces an extra StatsOperator which builds a graph from the whole execution pipeline and then collect all important information like:
- The Number of produced items.
- The Number of requested items.
- The average execution time of that particular operator (is different from one provided by metrics).
- The Last Observed signal (e.g. what happened last before
cancel | error | complete) - The Current State of that operator (e.g.
canceled|completed|errored|none)
All this information is printed after the execution in the well-formed stack trace like output which looks like the following
10:43:19.381 [parallel-1] INFO r.c.publisher.DefaultStatsReporter - The following stats are collected:
| Flux.range ⇢ at reactor.examples.ReportRootHangingTest.timeoutNightmare(ReportRootHangingTest.java:23)
|_ ↳ Stats(Requested: 256; Produced: 100; OnNext ~Time: 450ns; Last Signal: onNext; State: canceled)
| Flux.map ⇢ at reactor.examples.ReportRootHangingTest.timeoutNightmare(ReportRootHangingTest.java:24)
|_ ↳ Stats(Requested: 256; Produced: 100; OnNext ~Time: 728ns; Last Signal: onNext; State: canceled)
| Flux.transform ⇢ at reactor.examples.ReportRootHangingTest.timeoutNightmare(ReportRootHangingTest.java:25)
|_ ↳ Stats(Requested: 256; Produced: 0; OnNext ~Time: 0ns; Last Signal: request; State: canceled)
| Flux.map ⇢ at reactor.examples.ReportRootHangingTest.timeoutNightmare(ReportRootHangingTest.java:29)
|_ ↳ Stats(Requested: 256; Produced: 0; OnNext ~Time: 0ns; Last Signal: request; State: canceled)
| Flux.subscribeOn ⇢ at reactor.examples.ReportRootHangingTest.timeoutNightmare(ReportRootHangingTest.java:30)
|_ ↳ Stats(Requested: 256; Produced: 0; OnNext ~Time: 0ns; Last Signal: request; State: canceled)
| Flux.filter ⇢ at reactor.examples.ReportRootHangingTest.timeoutNightmare(ReportRootHangingTest.java:31)
|_ ↳ Stats(Requested: 256; Produced: 0; OnNext ~Time: 0ns; Last Signal: request; State: canceled)
| Flux.publishOn ⇢ at reactor.examples.ReportRootHangingTest.timeoutNightmare(ReportRootHangingTest.java:33)
|_ ↳ Stats(Requested: ∞; Produced: 0; OnNext ~Time: 0ns; Last Signal: request; State: canceled)
| Flux.timeout ⇢ at reactor.examples.ReportRootHangingTest.timeoutNightmare(ReportRootHangingTest.java:34)
|_ ↳ Stats(Requested: ∞; Produced: 0; OnNext ~Time: 0ns; Last Signal: request; State: errored)
| Flux.metrics ⇢ at reactor.examples.ReportRootHangingTest.timeoutNightmare(ReportRootHangingTest.java:35)
|_ ↳ Stats(Requested: ∞; Produced: 0; OnNext ~Time: 0ns; Last Signal: request; State: errored)
In order to print such an output, StatsOperator uses AssemblySnapshot. Also, there is an abstract utility like StatsReporter which allows configuring where such output should be printed and what is the format of the output (e.g. one might want to see such an out as Graphviz diagram).
Also, this PR introduced output analyzers, called StatsMarkers which can add an extra marker to the output to say that this particular operator is the source of, for instance, timeout:
10:43:19.381 [parallel-1] INFO r.c.publisher.DefaultStatsReporter - The following stats are collected:
...
🔎 | Flux.transform ⇢ at reactor.examples.ReportRootHangingTest.timeoutNightmare(ReportRootHangingTest.java:25)
|_ ↳ Stats(Requested: 256; Produced: 0; OnNext ~Time: 0ns; Last Signal: request; State: canceled)
...
HowTo Use
In order to track advanced debug information and collect reports, you just need to use Hook API and enable the following 2 properties:
Hooks.enableStatsRecording();
Hooks.onOperatorDebug();
Having that enabled, you may run your app, and if an exception occurs, you will see a detailed report which has extra info on what might cause the error
Signed-off-by: Oleh Dokuka [email protected]
Codecov Report
Merging #2219 into master will decrease coverage by
0.38%. The diff coverage is60.64%.
@@ Coverage Diff @@
## master #2219 +/- ##
============================================
- Coverage 83.82% 83.44% -0.39%
- Complexity 4149 4227 +78
============================================
Files 380 387 +7
Lines 30915 31441 +526
Branches 5723 5800 +77
============================================
+ Hits 25916 26236 +320
- Misses 3444 3618 +174
- Partials 1555 1587 +32
| Impacted Files | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| ...a/reactor/core/publisher/ConnectableFluxStats.java | 0.00% <0.00%> (ø) |
0.00 <0.00> (?) |
|
| ...java/reactor/core/publisher/ParallelFluxStats.java | 0.00% <0.00%> (ø) |
0.00 <0.00> (?) |
|
| ...r-core/src/main/java/reactor/util/stats/Stats.java | 44.44% <44.44%> (ø) |
6.00 <6.00> (?) |
|
| ...rc/main/java/reactor/core/publisher/FluxStats.java | 57.81% <57.81%> (ø) |
3.00 <3.00> (?) |
|
| ...re/src/main/java/reactor/core/publisher/Hooks.java | 90.81% <71.42%> (-2.37%) |
53.00 <7.00> (+7.00) |
:arrow_down: |
| ...re/src/main/java/reactor/util/stats/StatsNode.java | 78.46% <78.46%> (ø) |
17.00 <17.00> (?) |
|
| ...n/java/reactor/util/stats/LoggerStatsReporter.java | 90.41% <90.41%> (ø) |
39.00 <39.00> (?) |
|
| ...rc/main/java/reactor/core/publisher/MonoStats.java | 100.00% <100.00%> (ø) |
2.00 <2.00> (?) |
|
| ...in/java/reactor/core/publisher/FluxWindowWhen.java | 84.13% <0.00%> (-1.45%) |
2.00% <0.00%> (ø%) |
|
| ...eactor/core/publisher/ParallelMergeSequential.java | 84.97% <0.00%> (-0.52%) |
7.00% <0.00%> (ø%) |
|
| ... and 18 more |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact),ø = not affected,? = missing dataPowered by Codecov. Last update 50d5f29...5323b3e. Read the comment docs.
Codecov Report
Merging #2219 (5323b3e) into master (40715df) will decrease coverage by
0.30%. The diff coverage is60.64%.
@@ Coverage Diff @@
## master #2219 +/- ##
============================================
- Coverage 83.75% 83.44% -0.31%
+ Complexity 4716 4227 -489
============================================
Files 391 387 -4
Lines 32292 31441 -851
Branches 6207 5800 -407
============================================
- Hits 27046 26236 -810
- Misses 3532 3618 +86
+ Partials 1714 1587 -127
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact),ø = not affected,? = missing dataPowered by Codecov. Last update 2f5eff4...5bf9b97. Read the comment docs.
The last interaction here was over 3 years ago. Closing due to inactivity. Please create a new PR targeting the latest codebase in case you feel this is still applicable.