[feat][misc] PIP-264: Implement topic lookup metrics using OpenTelemetry
Motivation
Part of PIP-264 related work to convert metrics to OpenTelemetry format. This PR handles topic lookup metrics, as currently defined in the documentation.
Note that metric pulsar_broker_load_manager_bundle_assignment fits better into the Load Balancing metrics section, so is not included in this work.
Modifications
Added metrics as described below:
| Old metric name | New metric details | |
|---|---|---|
pulsar_broker_lookup |
Name | pulsar.broker.lookup.latency |
| Type | Histogram | |
| Description | Lookup request latency | |
| Source | Internal OpenTelemetry implementation | |
| Attributes | (none) | |
| Cardinality | 1 per broker | |
pulsar_broker_lookup_redirects |
Name | pulsar.broker.lookup.redirect |
| Type | LongCounter | |
| Description | The number of lookup redirected requests | |
| Source | Internal OpenTelemetry implementation | |
| Attributes | (none) | |
| Cardinality | 1 per broker | |
pulsar_broker_lookup_answers |
Name | pulsar.broker.lookup.answer |
| Type | LongCounter | |
| Description | The number of lookup responses (i.e. not redirected requests) | |
| Source | Internal OpenTelemetry implementation | |
| Attributes | (none) | |
| Cardinality | 1 per broker | |
pulsar_broker_lookup_failures |
Name | pulsar.broker.lookup.failure |
| Type | LongCounter | |
| Description | The number of lookup failures | |
| Source | Internal OpenTelemetry implementation | |
| Attributes | (none) | |
| Cardinality | 1 per broker | |
pulsar_broker_lookup_pending_requests |
Name | pulsar.broker.lookup.pending.request.usage |
| Type | ObservableLongGauge | |
| Description | The number of pending lookup requests in the broker. When it reaches threshold "maxConcurrentLookupRequest" defined in broker.conf, new requests are rejected. | |
| Source | Call to BrokerService#getPendingLookupRequest |
|
| Attributes | (none) | |
| Cardinality | 1 per broker | |
pulsar_broker_topic_load_pending_requests |
Name | pulsar.broker.topic.load.pending.request.usage |
| Type | ObservableLongGauge | |
| Description | The number of pending topic load operations in the broker. When it reaches threshold "maxConcurrentTopicLoadRequest" defined in broker.conf, new requests are rejected. | |
| Source | Call to BrokerService#getPendingTopicLoadRequests |
|
| Attributes | (none) | |
| Cardinality | 1 per broker |
Moved instantiation of PulsarBrokerOpenTelemetry object to PulsarService constructor, as it is needed by the NamespaceService and BrokerService. The OpenTelemetry standard recommends setting up the SDK as early as possible, so this suits us well.
Changed PulsarTestContext to allow OpenTelemetry metrics validation as part of the tests.
Verifying this change
- [x] Make sure that the change passes the CI checks.
This change added tests and can be verified as follows:
(example:)
- Extended existing lookup tests
BrokerServiceLookupTest#testMultipleBrokerLookupandBrokerServiceLookupTest#testLookupConnectionNotCloseIfGetUnloadingExOrMetadataExto cover the metric validation
Does this pull request potentially affect one of the following parts:
If the box was checked, please highlight the changes
- [ ] Dependencies (add or upgrade a dependency)
- [ ] The public API
- [ ] The schema
- [ ] The default values of configurations
- [ ] The threading model
- [ ] The binary protocol
- [ ] The REST endpoints
- [ ] The admin CLI options
- [x] The metrics
- [ ] Anything that affects deployment
Documentation
- [ ]
doc - [x]
doc-required - [ ]
doc-not-needed - [ ]
doc-complete
Matching PR in forked repository
PR in forked repository: https://github.com/dragosvictor/pulsar/pull/11
@dragosvictor Please add the following content to your PR description and select a checkbox:
- [ ] `doc` <!-- Your PR contains doc changes -->
- [ ] `doc-required` <!-- Your PR changes impact docs and you will update later -->
- [ ] `doc-not-needed` <!-- Your PR changes do not impact docs -->
- [ ] `doc-complete` <!-- Docs have been already added -->
Codecov Report
Attention: 5 lines in your changes are missing coverage. Please review.
Comparison is base (
beed0cf) 73.71% compared to head (1e9bc12) 73.59%. Report is 2 commits behind head on master.
Additional details and impacted files
@@ Coverage Diff @@
## master #22058 +/- ##
============================================
- Coverage 73.71% 73.59% -0.13%
- Complexity 32116 32571 +455
============================================
Files 1874 1875 +1
Lines 139220 139276 +56
Branches 15260 15260
============================================
- Hits 102628 102500 -128
- Misses 28695 28858 +163
- Partials 7897 7918 +21
| Flag | Coverage Δ | |
|---|---|---|
| inttests | 24.68% <89.55%> (-0.16%) |
:arrow_down: |
| systests | 24.29% <80.59%> (-0.22%) |
:arrow_down: |
| unittests | 72.86% <92.53%> (-0.12%) |
:arrow_down: |
Flags with carried forward coverage won't be shown. Click here to find out more.
| Files | Coverage Δ | |
|---|---|---|
| ...n/java/org/apache/pulsar/broker/PulsarService.java | 83.13% <100.00%> (ø) |
|
| ...ache/pulsar/broker/namespace/NamespaceService.java | 73.03% <100.00%> (+0.79%) |
:arrow_up: |
| ...pulsar/broker/stats/PulsarBrokerOpenTelemetry.java | 100.00% <100.00%> (+9.09%) |
:arrow_up: |
| ...che/pulsar/opentelemetry/OpenTelemetryService.java | 92.00% <ø> (ø) |
|
| ...va/org/apache/pulsar/common/stats/MetricsUtil.java | 50.00% <50.00%> (ø) |
|
| ...rg/apache/pulsar/broker/service/BrokerService.java | 81.38% <88.57%> (+0.26%) |
:arrow_up: |
@dragosvictor Thanks. I replied to all comments.
@lhotari You're up next on the review wrap up :)