reactor-core icon indicating copy to clipboard operation
reactor-core copied to clipboard

provides stats operators and supporting utility

Open OlegDokuka opened this issue 5 years ago • 2 comments

This PR introduces a Stats utility that allows stats reporting to a logger

Motivation

Usually, it is challenging to analyze what is going on within your data pipeline. Metrics are useful to get general "performance" insights for your stream. Debug hooks allow detecting the root cause operator. However, there is a gap in detecting general issues, like timeouts, which operator underperform, etc.

Solution

In order to simplify the process of the mentioned problems detections, this PR introduces an extra StatsOperator which builds a graph from the whole execution pipeline and then collect all important information like:

  1. The Number of produced items.
  2. The Number of requested items.
  3. The average execution time of that particular operator (is different from one provided by metrics).
  4. The Last Observed signal (e.g. what happened last before cancel | error | complete)
  5. The Current State of that operator (e.g. canceled | completed | errored | none)

All this information is printed after the execution in the well-formed stack trace like output which looks like the following

10:43:19.381 [parallel-1] INFO  r.c.publisher.DefaultStatsReporter - The following stats are collected:
  	|  Flux.range       ⇢ at reactor.examples.ReportRootHangingTest.timeoutNightmare(ReportRootHangingTest.java:23)
  	|_                  ↳ Stats(Requested: 256; Produced: 100; OnNext ~Time: 450ns; Last Signal:  onNext; State: canceled)
  	|  Flux.map         ⇢ at reactor.examples.ReportRootHangingTest.timeoutNightmare(ReportRootHangingTest.java:24)
  	|_                  ↳ Stats(Requested: 256; Produced: 100; OnNext ~Time: 728ns; Last Signal:  onNext; State: canceled)
 	|  Flux.transform   ⇢ at reactor.examples.ReportRootHangingTest.timeoutNightmare(ReportRootHangingTest.java:25)
  	|_                  ↳ Stats(Requested: 256; Produced:   0; OnNext ~Time:   0ns; Last Signal: request; State: canceled)
  	|  Flux.map         ⇢ at reactor.examples.ReportRootHangingTest.timeoutNightmare(ReportRootHangingTest.java:29)
  	|_                  ↳ Stats(Requested: 256; Produced:   0; OnNext ~Time:   0ns; Last Signal: request; State: canceled)
  	|  Flux.subscribeOn ⇢ at reactor.examples.ReportRootHangingTest.timeoutNightmare(ReportRootHangingTest.java:30)
  	|_                  ↳ Stats(Requested: 256; Produced:   0; OnNext ~Time:   0ns; Last Signal: request; State: canceled)
  	|  Flux.filter      ⇢ at reactor.examples.ReportRootHangingTest.timeoutNightmare(ReportRootHangingTest.java:31)
  	|_                  ↳ Stats(Requested: 256; Produced:   0; OnNext ~Time:   0ns; Last Signal: request; State: canceled)
  	|  Flux.publishOn   ⇢ at reactor.examples.ReportRootHangingTest.timeoutNightmare(ReportRootHangingTest.java:33)
  	|_                  ↳ Stats(Requested:   ∞; Produced:   0; OnNext ~Time:   0ns; Last Signal: request; State: canceled)
  	|  Flux.timeout     ⇢ at reactor.examples.ReportRootHangingTest.timeoutNightmare(ReportRootHangingTest.java:34)
  	|_                  ↳ Stats(Requested:   ∞; Produced:   0; OnNext ~Time:   0ns; Last Signal: request; State:  errored)
  	|  Flux.metrics     ⇢ at reactor.examples.ReportRootHangingTest.timeoutNightmare(ReportRootHangingTest.java:35)
  	|_                  ↳ Stats(Requested:   ∞; Produced:   0; OnNext ~Time:   0ns; Last Signal: request; State:  errored)

In order to print such an output, StatsOperator uses AssemblySnapshot. Also, there is an abstract utility like StatsReporter which allows configuring where such output should be printed and what is the format of the output (e.g. one might want to see such an out as Graphviz diagram).

Also, this PR introduced output analyzers, called StatsMarkers which can add an extra marker to the output to say that this particular operator is the source of, for instance, timeout:

10:43:19.381 [parallel-1] INFO  r.c.publisher.DefaultStatsReporter - The following stats are collected:
...
🔎	|  Flux.transform   ⇢ at reactor.examples.ReportRootHangingTest.timeoutNightmare(ReportRootHangingTest.java:25)
  	|_                  ↳ Stats(Requested: 256; Produced:   0; OnNext ~Time:   0ns; Last Signal: request; State: canceled)
...

HowTo Use

In order to track advanced debug information and collect reports, you just need to use Hook API and enable the following 2 properties:

Hooks.enableStatsRecording();
Hooks.onOperatorDebug();

Having that enabled, you may run your app, and if an exception occurs, you will see a detailed report which has extra info on what might cause the error

Signed-off-by: Oleh Dokuka [email protected]

OlegDokuka avatar Jun 24 '20 07:06 OlegDokuka

Codecov Report

Merging #2219 into master will decrease coverage by 0.38%. The diff coverage is 60.64%.

Impacted file tree graph

@@             Coverage Diff              @@
##             master    #2219      +/-   ##
============================================
- Coverage     83.82%   83.44%   -0.39%     
- Complexity     4149     4227      +78     
============================================
  Files           380      387       +7     
  Lines         30915    31441     +526     
  Branches       5723     5800      +77     
============================================
+ Hits          25916    26236     +320     
- Misses         3444     3618     +174     
- Partials       1555     1587      +32     
Impacted Files Coverage Δ Complexity Δ
...a/reactor/core/publisher/ConnectableFluxStats.java 0.00% <0.00%> (ø) 0.00 <0.00> (?)
...java/reactor/core/publisher/ParallelFluxStats.java 0.00% <0.00%> (ø) 0.00 <0.00> (?)
...r-core/src/main/java/reactor/util/stats/Stats.java 44.44% <44.44%> (ø) 6.00 <6.00> (?)
...rc/main/java/reactor/core/publisher/FluxStats.java 57.81% <57.81%> (ø) 3.00 <3.00> (?)
...re/src/main/java/reactor/core/publisher/Hooks.java 90.81% <71.42%> (-2.37%) 53.00 <7.00> (+7.00) :arrow_down:
...re/src/main/java/reactor/util/stats/StatsNode.java 78.46% <78.46%> (ø) 17.00 <17.00> (?)
...n/java/reactor/util/stats/LoggerStatsReporter.java 90.41% <90.41%> (ø) 39.00 <39.00> (?)
...rc/main/java/reactor/core/publisher/MonoStats.java 100.00% <100.00%> (ø) 2.00 <2.00> (?)
...in/java/reactor/core/publisher/FluxWindowWhen.java 84.13% <0.00%> (-1.45%) 2.00% <0.00%> (ø%)
...eactor/core/publisher/ParallelMergeSequential.java 84.97% <0.00%> (-0.52%) 7.00% <0.00%> (ø%)
... and 18 more

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 50d5f29...5323b3e. Read the comment docs.

codecov-commenter avatar Jun 24 '20 07:06 codecov-commenter

Codecov Report

Merging #2219 (5323b3e) into master (40715df) will decrease coverage by 0.30%. The diff coverage is 60.64%.

Impacted file tree graph

@@             Coverage Diff              @@
##             master    #2219      +/-   ##
============================================
- Coverage     83.75%   83.44%   -0.31%     
+ Complexity     4716     4227     -489     
============================================
  Files           391      387       -4     
  Lines         32292    31441     -851     
  Branches       6207     5800     -407     
============================================
- Hits          27046    26236     -810     
- Misses         3532     3618      +86     
+ Partials       1714     1587     -127     
Impacted Files Coverage Δ Complexity Δ
...a/reactor/core/publisher/ConnectableFluxStats.java 0.00% <0.00%> (ø) 0.00 <0.00> (?)
...java/reactor/core/publisher/ParallelFluxStats.java 0.00% <0.00%> (ø) 0.00 <0.00> (?)
...r-core/src/main/java/reactor/util/stats/Stats.java 44.44% <44.44%> (ø) 6.00 <6.00> (?)
...rc/main/java/reactor/core/publisher/FluxStats.java 57.81% <57.81%> (ø) 3.00 <3.00> (?)
...re/src/main/java/reactor/core/publisher/Hooks.java 90.81% <71.42%> (-2.37%) 53.00 <7.00> (+7.00) :arrow_down:
...re/src/main/java/reactor/util/stats/StatsNode.java 78.46% <78.46%> (ø) 17.00 <17.00> (?)
...n/java/reactor/util/stats/LoggerStatsReporter.java 90.41% <90.41%> (ø) 39.00 <39.00> (?)
...rc/main/java/reactor/core/publisher/MonoStats.java 100.00% <100.00%> (ø) 2.00 <2.00> (?)
...ava/reactor/core/publisher/MonoSourceFuseable.java 0.00% <0.00%> (-38.89%) 0.00% <0.00%> (-3.00%)
...ava/reactor/core/publisher/FluxSourceFuseable.java 56.25% <0.00%> (-31.99%) 5.00% <0.00%> (-4.00%)
... and 344 more

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 2f5eff4...5bf9b97. Read the comment docs.

codecov-io avatar Dec 15 '20 15:12 codecov-io

The last interaction here was over 3 years ago. Closing due to inactivity. Please create a new PR targeting the latest codebase in case you feel this is still applicable.

chemicL avatar Mar 15 '24 12:03 chemicL