[improve][broker] Improve the performance of TopicName
Motivation
The TopicName's constructor has poor performance:
NamespaceName#getis very slowSplitter.on("/").splitToList(rest)is slowString.formatis slower than the+operation on strings andcompleteTopicNameis unnecessarily created again
Modifications
- ~~Initialize
NamespaceNamein a lazy way~~ (don't do that because it assumes the constructor is called more frequently thangetNamespaceorgetNamespaceObject) - Replace
Splitterwith a manually writtensplitBySlashintroduced from https://github.com/apache/pulsar/pull/24465. Actually,StringUtils#splithas good performance as well. But it will split"//xxx/yyy/zzz"to["xxx", "yyy", "zzz"]without reporting any error. - Initialize
completeTopicNamefrom the argument directly without any concentrate operation - Replace
String.formatwith+infromPersistenceNamingEncoding
Documentation
- [ ]
doc - [ ]
doc-required - [x]
doc-not-needed - [ ]
doc-complete
Matching PR in forked repository
PR in forked repository: https://github.com/BewareMyPower/pulsar/pull/44
Before 48ffb7abfe1b5d5ad71cc3a07627326fb84c7a5a, there is a constructor parameter that determines whether to initialize the NamespaceName. I also compared with the existing TopicName constructor, which can be found here: https://gist.github.com/BewareMyPower/6b83c897552c110f336e51965cc91c24
The benchmark result is:
TopicNameBenchmark.testConstruct thrpt 312.983 ops/s
TopicNameBenchmark.testConstructWithNamespaceInitialized thrpt 101.973 ops/s
TopicNameBenchmark.testLegacyTopicName thrpt 53.944 ops/s
TopicNameBenchmark.testReadFromCache thrpt 721.392 ops/s
As you can see
- Accessing it from the cache is only 2.3x faster than the latest implementation, which is not significant. However, to avoid debate on whether to remove cache in this PR, I only exposed the public constructor.
- Skipping initialization of
NamespaceNameis 3x faster than initializing it immediately - It's 6x faster than the original implementation. Even if initializing
NamespaceNameimmediately, it's still about 2x faster.
Before 48ffb7a, there is a constructor parameter that determines whether to initialize the
NamespaceName. I also compared with the existingTopicNameconstructor, which can be found here: https://gist.github.com/BewareMyPower/6b83c897552c110f336e51965cc91c24The benchmark result is:
TopicNameBenchmark.testConstruct thrpt 312.983 ops/s TopicNameBenchmark.testConstructWithNamespaceInitialized thrpt 101.973 ops/s TopicNameBenchmark.testLegacyTopicName thrpt 53.944 ops/s TopicNameBenchmark.testReadFromCache thrpt 721.392 ops/sAs you can see
- Accessing it from the cache is only 2.3x faster than the latest implementation, which is not significant. However, to avoid debate on whether to remove cache in this PR, I only exposed the public constructor.
- Skipping initialization of
NamespaceNameis 3x faster than initializing it immediately- It's 6x faster than the original implementation. Even if initializing
NamespaceNameimmediately, it's still about 2x faster.
I wouldn't trust JMH results from benchmarking on Mac OS. In the case of PR #24457, the results are very different when benchmarking on Linux x86_64 with Intel i9 processor (Dell XPS 2019 laptop). If we'd completely remove the cache, there would be more duplicate TopicName and NamespaceName instances and more duplicate string instance. For very long tenant, namespace and topic names, that could add up a significant amount of wasted heap memory.
@lhotari So could you help run benchmark in your Linux environment to see the difference? I can also create a workflow via GitHub Actions to see the result.
If we'd completely remove the cache, there would be more duplicate TopicName and NamespaceName instances and more duplicate string instance. For very long tenant, namespace and topic names, that could add up a significant amount of wasted heap memory.
I don't want to debate about it in this PR. If you have a chance to look at how TopicName is used in Pulsar, you will find a lot of temporary TopicName instances just for conversion. I know near all Pulsar constributors are very sensitive to the topic about GC, so I don't change the TopicName#get implementation.
The ultimate solution is to create some utils methods instead of leveraging the TopicName class.
Updated test results to compare it with the legacy implementation in https://github.com/BewareMyPower/pulsar/actions/runs/15872434081/job/44752001433?pr=45
TopicNameBenchmark.testConstruct thrpt 258.932 ops/s
TopicNameBenchmark.testConstructWithNamespaceInitialization thrpt 158.876 ops/s
TopicNameBenchmark.testLegacyConstruct thrpt 44.290 ops/s
TopicNameBenchmark.testReadFromCache thrpt 340.613 ops/s
- The constructor is 5.8x faster than the existing one
- The constructor with
NamespaceNameinitialization is 3.6x faster than the existing one
@lhotari @codelipenghui @nodece @dao-jun @poorbarcode @coderzc This PR is now ready to review, PTAL
@BewareMyPower My point from an earlier comment isn't addressed:
If we'd completely remove the cache, there would be more duplicate TopicName and NamespaceName instances and more duplicate string instance. For very long tenant, namespace and topic names, that could add up a significant amount of wasted heap memory.
In this case, I think it's irrelevant to just benchmark the performance of creating/looking up TopicName instances.
Duplicate java.lang.String instances could increase significantly after introducing a non-caching solution. Heapdump analysis tools have such checks for duplicate instances so that's one way how to compare the difference.
I do agree that a lot of the TopicName handling code is a mess. For example in Topic listing, the topic name is converted from/to String multiple times. That's adding up a lot of pressure for a fast lookup solution when instances are cached. So I'm not against changing the current caching, it's just that there's a need to consider duplicate instances as well. It's likely that the caching is more relevant for NamespaceName than TopicName regarding duplicate instances. We might not be keeping a reference to TopicName instance in that many places when broker is running.
If we'd completely remove the cache,
I've read more code and changed my idea a bit. Caching could be helpful in many cases. But how to establish the cache might depend on specific use cases. Writing a common cache is hard. I don't like the solution in https://github.com/apache/pulsar/pull/23052, but it's anyway good to resolve the issue encountered at that time.
As I've mentioned here, exposing the public method is helpful for downstream to construct its own cache. In addition, the TopicName can be cached as well as some other classes, e.g. Map<String, Pair<TopicName, PersistentTopic>> would be better than maintaining two maps:
- A built-in
Map<String, TopicName> - An external
Map<String, PersistentTopic>
It's just an example, PersistentTopic can be replaced by another topic abstraction class.
Regarding this PR, I'm going to revert other changes and only leaving the improvement on TopicName's constructor itself.
Exposing a public method will be easier for the downstream to maintain its custom cache, but it will also be confusing for the core Pulsar developers to make the decision on TopicName.get and new TopicName.
Anyway, we should improve the use of TopicName case by case. Take BrokerService for example, a TopicName is passed to getTopic, but the string returned by toString() is passed to
loadOrCreatePersistentTopicTopicLoadingContext#topic
The TopicName instance is constructed again by them to checkOwnershipAndCreatePersistentTopic.
In this case, TopicName#get makes sense than new TopicName. But the code is still inefficient, we should not perform the unnecessary topicName.toString() -> TopicName.get(topic) conversions.
This PR will be an improvement. Regarding the argument I made in a previous comment about duplicate tenant and namespace java.lang.String instances, that problem is already present in the current code base. That could be solved in a different PR.