[improve][broker] Add msgInReplay subscription stat and metric to improve Key_Shared observability
Main Issue: #23205
Motivation
Currently there aren't ways to observe the number of messages that are "in replay" with Key_Shared or Shared subscriptions. Having these metrics would make it easier to understand the behavior of issues such as #23200 . Please check #23205 for more reasons.
Modifications
- add
msgInReplayin the similar way as the existingmsgDelayedsubscription stat and metric.
Documentation
- [ ]
doc - [x]
doc-required - [ ]
doc-not-needed - [ ]
doc-complete
Codecov Report
:white_check_mark: All modified and coverable lines are covered by tests.
:white_check_mark: Project coverage is 74.54%. Comparing base (bbc6224) to head (d6d08be).
:warning: Report is 1267 commits behind head on master.
Additional details and impacted files
@@ Coverage Diff @@
## master #23224 +/- ##
============================================
+ Coverage 73.57% 74.54% +0.96%
- Complexity 32624 34226 +1602
============================================
Files 1877 1922 +45
Lines 139502 144768 +5266
Branches 15299 15832 +533
============================================
+ Hits 102638 107916 +5278
+ Misses 28908 28578 -330
- Partials 7956 8274 +318
| Flag | Coverage Δ | |
|---|---|---|
| inttests | 27.66% <28.57%> (+3.08%) |
:arrow_up: |
| systests | 24.69% <28.57%> (+0.36%) |
:arrow_up: |
| unittests | 73.90% <100.00%> (+1.05%) |
:arrow_up: |
Flags with carried forward coverage won't be shown. Click here to find out more.
| Files with missing lines | Coverage Δ | |
|---|---|---|
| ...ervice/persistent/MessageRedeliveryController.java | 95.38% <100.00%> (+0.38%) |
:arrow_up: |
| ...sistent/PersistentDispatcherMultipleConsumers.java | 73.37% <100.00%> (-0.96%) |
:arrow_down: |
| ...ker/service/persistent/PersistentSubscription.java | 77.54% <100.00%> (+0.85%) |
:arrow_up: |
| ...ker/stats/prometheus/AggregatedNamespaceStats.java | 100.00% <100.00%> (ø) |
|
| .../stats/prometheus/AggregatedSubscriptionStats.java | 100.00% <ø> (ø) |
|
| ...ker/stats/prometheus/NamespaceStatsAggregator.java | 94.24% <100.00%> (+3.83%) |
:arrow_up: |
| ...che/pulsar/broker/stats/prometheus/TopicStats.java | 98.42% <100.00%> (+4.22%) |
:arrow_up: |
| ...mon/policies/data/stats/SubscriptionStatsImpl.java | 94.40% <100.00%> (+0.28%) |
:arrow_up: |
:rocket: New features to boost your workflow:
- :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
- :package: JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.
LGTM, but maybe the change needs a PIP?
+1 for having a PIP for this change since it actually changed the Public API output and metrics.
Another point is can we only have new the field in the topic stats? Since we have a lot of metrics on the topic level. Finally, we can't add everything to the prometheus metrics. The stats, the logs are also the part of observability.
@codelipenghui This is not a breaking change in metrics and this change isn't adding significantly more metrics than what currently exists. It's currently a gap in observability of all released versions of Pulsar when msgInReplay is missing. That's why everyone needs this and making it configurable with a feature flag wouldn't be sensible.