Replace the ConcurrentOpenHashMap as much as possible
Search before asking
- [X] I searched in the issues and found nothing similar.
Motivation
There was a discussion back in Sep. 2023 about replacing the customized map implementations such as ConcurrentOpenHashMap. In this issue, I'd like to focus on ConcurrentOpenHashMap.
Here is the only advantage of this map.
Provides similar methods as a {@code ConcurrentMap<K,V>} but since it's an open hash map with linear probing, no node allocations are required to store the values.
Let me count the disadvantages.
1. Bad performance
This map was added in the initial commit (in 2016). However, the implementation is essentially the Java 7 design of ConcurrentHashMap, which relies on segment-based locking. The Java team discarded that design in Java 8.
https://github.com/apache/pulsar/pull/20647/#issuecomment-1607257960 ran a benchmark and found the performance was much worse than the ConcurrentHashMap shipped with the Java standard library. We can also look up the pros of the Java 8 design on the web, or simply ask ChatGPT.
Besides, the frequently used keys() and values() methods simply copy all keys and values into a new list, while ConcurrentHashMap returns thread-safe internal views and lets callers decide whether to make a copy.
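To make the difference concrete, here is a small self-contained sketch (not taken from Pulsar code) showing that ConcurrentHashMap hands out live views and only copies when the caller explicitly asks for one:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class ViewVsCopy {
    public static void main(String[] args) {
        ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();
        map.put("a", 1);
        map.put("b", 2);

        // Live, weakly consistent view: no copying, later updates are visible.
        Set<String> keysView = map.keySet();

        // Copy only where a snapshot is actually needed.
        List<String> keysCopy = new ArrayList<>(map.keySet());
        List<Integer> valuesCopy = new ArrayList<>(map.values());

        map.put("c", 3);
        System.out.println(keysView.size());   // 3: the view sees the new entry
        System.out.println(keysCopy.size());   // 2: the snapshot does not
        System.out.println(valuesCopy.size()); // 2
    }
}
```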
That said, proving the performance is worse than ConcurrentHashMap requires more tests and research, so this is the least important reason.
2. Lack of updates
This class is rarely updated. The only change I can remember is the shrink support added two years ago: https://github.com/apache/pulsar/pull/14663
From https://github.com/apache/bookkeeper/pull/3061, we can see the motivation was the frequent full GCs caused by this implementation. However, adding shrink support makes the map even harder to use. There are already many parameters to tune; see its builder:
public static class Builder<K, V> {
    int expectedItems = DefaultExpectedItems;
    int concurrencyLevel = DefaultConcurrencyLevel;
    float mapFillFactor = DefaultMapFillFactor;
    float mapIdleFactor = DefaultMapIdleFactor;
    float expandFactor = DefaultExpandFactor;
    float shrinkFactor = DefaultShrinkFactor;
    boolean autoShrink = DefaultAutoShrink;
    // ...
}
There are many xxxFactor parameters plus the concurrency level. It's hard to pick proper default values, and all these knobs make the class hard for new developers to modify.
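To illustrate how much ceremony this adds at call sites, here is a hedged sketch comparing the two construction styles. The builder setter names and the values below are assumptions inferred from the fields above, and the import path assumes the class lives in pulsar-common; treat it as illustration, not exact Pulsar code.

```java
import java.util.concurrent.ConcurrentHashMap;
import org.apache.pulsar.common.util.collections.ConcurrentOpenHashMap;

public class ConstructionComparison {
    public static void main(String[] args) {
        // Builder-based construction: every knob must be understood (or guessed)
        // by the caller. Setter names and values here are assumed for illustration.
        ConcurrentOpenHashMap<String, String> openMap =
                ConcurrentOpenHashMap.<String, String>newBuilder()
                        .expectedItems(256)
                        .concurrencyLevel(16)
                        .autoShrink(true)
                        .build();

        // The official map needs at most an initial-capacity hint; resizing and
        // concurrency are handled internally since the Java 8 rewrite.
        ConcurrentHashMap<String, String> map = new ConcurrentHashMap<>(256);

        openMap.put("key", "value");
        map.put("key", "value");
        System.out.println(openMap.size() + " vs " + map.size());
    }
}
```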
3. Bad debug experience
When I debugged the topics maintained in a BrokerService, the debugger showed 16 sections, and I had to iterate over all of these sections and expand each table array to find the target topic. Compare that with the official ConcurrentHashMap (I replaced it locally), where the debugger displays the entries directly.
Besides, it's even harder to analyze in the heap dump.
4. Not friendly to new developers
Many places just use it as a plain concurrent hash map. Why should new developers avoid the official ConcurrentHashMap, which is developed and continuously improved by a professional team? Just to reduce node allocations, even as JVM GC keeps improving?
As I've mentioned, this class was likely introduced in the Java 7 era. Now the minimum required Java version on the broker side is 17. We have ZGC, we have Shenandoah, and many more JVM developers keep building better collectors. I doubt this advantage still makes sense.
I cannot think of a reason to choose this hard-to-maintain class over the well-maintained official ConcurrentHashMap.
For example, when I maintained KoP, I ran into a deadlock in ConcurrentLongHashMap (probably a similar implementation): https://github.com/streamnative/kop/pull/620. It was hard to tell whether that case had been fixed, so I had to switch to the official ConcurrentHashMap.
Solution
Replace ConcurrentOpenHashMap with the official Java ConcurrentHashMap.
Alternatives
N/A
Anything else?
Java Concurrent HashMap Improvements over the years https://medium.com/@vikas.taank_40391/java-concurrent-hashmap-improvements-over-the-years-8d8b7be6ce37
Are you willing to submit a PR?
- [X] I'm willing to submit a PR!
PR list
- [ ] https://github.com/apache/pulsar/pull/23217
https://github.com/apache/pulsar/pull/12729 shows that ConcurrentOpenHashMap allows null values. It adds a public removeNullValue method to handle them, which is very error-prone.
Here is an example:
import java.util.concurrent.ConcurrentHashMap;
import org.apache.pulsar.common.util.collections.ConcurrentOpenHashMap;

// The official map inserts nothing when the mapping function returns null.
final var map = new ConcurrentHashMap<String, String>();
map.computeIfAbsent("A", __ -> null);
System.out.println(map.size());

// ConcurrentOpenHashMap stores a null value for the key.
final var map2 = new ConcurrentOpenHashMap<String, String>();
map2.computeIfAbsent("A", __ -> null);
System.out.println(map2.size());
Outputs:
0
1
The caller of ConcurrentOpenHashMap#computeIfAbsent needs to manually check the return value and call removeNullValue to handle the case where a null value was inserted. What a terrible API design :(
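For completeness, this is roughly what that caller-side cleanup looks like. The exact removeNullValue signature is an assumption based on the PR description, so treat this as a sketch rather than the real API:

```java
import org.apache.pulsar.common.util.collections.ConcurrentOpenHashMap;

public class NullValueCleanup {
    // Stand-in for a lookup that may legitimately find nothing.
    private static String lookup(String key) {
        return null;
    }

    public static void main(String[] args) {
        ConcurrentOpenHashMap<String, String> map =
                ConcurrentOpenHashMap.<String, String>newBuilder().build();

        String value = map.computeIfAbsent("A", NullValueCleanup::lookup);
        if (value == null) {
            // Without this extra step, a null value stays in the map and size() is wrong.
            // The single-key signature here is assumed, not verified.
            map.removeNullValue("A");
        }
        System.out.println(map.size()); // expected 0 after the cleanup
    }
}
```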
I added some benchmarks with different thread counts (by modifying the BenchmarkState.numThreads field).
TL;DR
There is no reason to use ConcurrentOpenHashMap for the sake of performance.
computeIfAbsent
https://gist.github.com/BewareMyPower/1083937e30cb0f5a63be74c4ee9c5559
- Generate 10000 keys in range [0, 100) in advance
- Call computeIfAbsent in N threads on all keys (a rough sketch of the harness follows)
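The gist above contains the actual benchmark; the following is only a minimal sketch of how such a JMH benchmark can be structured (class and method names here are illustrative, not copied from the gist, and only the ConcurrentHashMap variant is shown):

```java
import java.util.Random;
import java.util.concurrent.ConcurrentHashMap;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Threads;

@State(Scope.Benchmark)
@BenchmarkMode(Mode.Throughput)
public class ComputeIfAbsentBenchmark {

    private static final int NUM_KEYS = 10_000;
    private final int[] keys = new int[NUM_KEYS];
    private ConcurrentHashMap<Integer, Integer> map;

    @Setup
    public void setup() {
        final Random random = new Random(0);
        for (int i = 0; i < NUM_KEYS; i++) {
            keys[i] = random.nextInt(100); // 10000 keys, each in [0, 100)
        }
        map = new ConcurrentHashMap<>();
    }

    @Benchmark
    @Threads(2) // re-run with 2, 4, 8 and 16 to reproduce the tables below
    public void testOfficialConcurrentHashMap() {
        for (int key : keys) {
            map.computeIfAbsent(key, k -> k);
        }
    }
}
```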
2 threads
| Benchmark | Mode | Cnt | Score | Error | Units |
| --- | --- | --- | --- | --- | --- |
| ConcurrentHashMapTest.testConcurrentOpenHashMap | thrpt | 5 | 2875.969 | ± 189.894 | ops/s |
| ConcurrentHashMapTest.testOfficialConcurrentHashMap | thrpt | 5 | 5252.836 | ± 1312.977 | ops/s |
| ConcurrentHashMapTest.testSynchronizedHashMap | thrpt | 5 | 996.854 | ± 21.998 | ops/s |
ConcurrentHashMap is 1.8x faster than the ConcurrentOpenHashMap.
4 threads
| Benchmark | Mode | Cnt | Score | Error | Units |
| --- | --- | --- | --- | --- | --- |
| ConcurrentHashMapTest.testConcurrentOpenHashMap | thrpt | 5 | 1460.855 | ± 152.361 | ops/s |
| ConcurrentHashMapTest.testOfficialConcurrentHashMap | thrpt | 5 | 3750.708 | ± 1324.286 | ops/s |
| ConcurrentHashMapTest.testSynchronizedHashMap | thrpt | 5 | 270.024 | ± 11.736 | ops/s |
ConcurrentHashMap is 2.5x faster than the ConcurrentOpenHashMap.
8 threads
| Benchmark | Mode | Cnt | Score | Error | Units |
| --- | --- | --- | --- | --- | --- |
| ConcurrentHashMapTest.testConcurrentOpenHashMap | thrpt | 5 | 648.256 | ± 278.038 | ops/s |
| ConcurrentHashMapTest.testOfficialConcurrentHashMap | thrpt | 5 | 2383.684 | ± 663.103 | ops/s |
| ConcurrentHashMapTest.testSynchronizedHashMap | thrpt | 5 | 130.503 | ± 4.253 | ops/s |
ConcurrentHashMap is 3.6x faster than ConcurrentOpenHashMap.
16 threads
| Benchmark | Mode | Cnt | Score | Error | Units |
| --- | --- | --- | --- | --- | --- |
| ConcurrentHashMapTest.testConcurrentOpenHashMap | thrpt | 5 | 259.008 | ± 204.781 | ops/s |
| ConcurrentHashMapTest.testOfficialConcurrentHashMap | thrpt | 5 | 1443.469 | ± 368.230 | ops/s |
| ConcurrentHashMapTest.testSynchronizedHashMap | thrpt | 5 | 59.875 | ± 2.984 | ops/s |
ConcurrentHashMap is 5.5x faster than ConcurrentOpenHashMap.
Conclusion
ConcurrentHashMap clearly performs much better, and the gap widens as the number of threads increases.
get
https://gist.github.com/BewareMyPower/b6254ec64932c3cf86b6ec7433a631da
- Generate 5000 random keys (possibly duplicated) in range [0, 10000) and insert them
- Call get on each key from 0 to 10000 (inclusive) in N threads
2 threads
| Benchmark | Mode | Cnt | Score | Error | Units |
| --- | --- | --- | --- | --- | --- |
| ConcurrentHashMapGetTest.testConcurrentOpenHashMap | thrpt | 5 | 3741.910 | ± 206.337 | ops/s |
| ConcurrentHashMapGetTest.testOfficialConcurrentHashMap | thrpt | 5 | 5807.285 | ± 1896.396 | ops/s |
| ConcurrentHashMapGetTest.testSynchronizedHashMap | thrpt | 5 | 870.238 | ± 66.370 | ops/s |
4 threads
| Benchmark | Mode | Cnt | Score | Error | Units |
| --- | --- | --- | --- | --- | --- |
| ConcurrentHashMapGetTest.testConcurrentOpenHashMap | thrpt | 5 | 3485.099 | ± 269.270 | ops/s |
| ConcurrentHashMapGetTest.testOfficialConcurrentHashMap | thrpt | 5 | 3777.292 | ± 1120.207 | ops/s |
| ConcurrentHashMapGetTest.testSynchronizedHashMap | thrpt | 5 | 256.690 | ± 16.778 | ops/s |
8 threads
| Benchmark | Mode | Cnt | Score | Error | Units |
| --- | --- | --- | --- | --- | --- |
| ConcurrentHashMapGetTest.testConcurrentOpenHashMap | thrpt | 5 | 2318.738 | ± 135.056 | ops/s |
| ConcurrentHashMapGetTest.testOfficialConcurrentHashMap | thrpt | 5 | 2568.236 | ± 828.029 | ops/s |
| ConcurrentHashMapGetTest.testSynchronizedHashMap | thrpt | 5 | 119.749 | ± 2.051 | ops/s |
16 threads
| Benchmark | Mode | Cnt | Score | Error | Units |
| --- | --- | --- | --- | --- | --- |
| ConcurrentHashMapGetTest.testConcurrentOpenHashMap | thrpt | 5 | 1711.945 | ± 128.223 | ops/s |
| ConcurrentHashMapGetTest.testOfficialConcurrentHashMap | thrpt | 5 | 1459.255 | ± 431.960 | ops/s |
| ConcurrentHashMapGetTest.testSynchronizedHashMap | thrpt | 5 | 59.398 | ± 2.085 | ops/s |
32 threads
| Benchmark | Mode | Cnt | Score | Error | Units |
| --- | --- | --- | --- | --- | --- |
| ConcurrentHashMapGetTest.testConcurrentOpenHashMap | thrpt | 5 | 863.625 | ± 36.957 | ops/s |
| ConcurrentHashMapGetTest.testOfficialConcurrentHashMap | thrpt | 5 | 1001.633 | ± 222.766 | ops/s |
| ConcurrentHashMapGetTest.testSynchronizedHashMap | thrpt | 5 | 28.798 | ± 4.943 | ops/s |
Conclusion
When the number of threads is small, ConcurrentHashMap performs better, though the two are close. Notably, with 16 threads ConcurrentOpenHashMap performs better, which might be because its default concurrency level is 16. With 32 threads, ConcurrentHashMap performs better again.
I support this initiative. It is a little bit scary to change this everywhere; maybe we can have a few smaller PRs?
Okay, let me split https://github.com/apache/pulsar/pull/23216 into multiple PRs later.
https://github.com/apache/pulsar/pull/23217 is the first PR to replace the ConcurrentOpenHashMap in load manager. However, it's a bug fix rather than an improvement because the current use of the concurrent hash map is wrong.
It might be a big change. Is it possible to keep the ConcurrentOpenHashMap interface and use ConcurrentHashMap as its implementation? That way we might only need to modify the ConcurrentOpenHashMap file.
It will produce much more garbage code in Pulsar.
The idea of reducing code changes sounds good, but ConcurrentOpenHashMap is not fully compatible with ConcurrentHashMap. Let me list the differences one by one.
- Constructor changes:
  ConcurrentOpenHashMap.<TopicName, PersistentOfflineTopicStats>newBuilder()./* ... */build();
  You would still need to keep the builder with all the meaningless parameters like concurrencyLevel, which is confusing.
- Non-standard keys() and values() APIs:
public List<K> keys() {
    List<K> keys = new ArrayList<>((int) size());
    forEach((key, value) -> keys.add(key));
    return keys;
}

public List<V> values() {
    List<V> values = new ArrayList<>((int) size());
    forEach((key, value) -> values.add(value));
    return values;
}
ConcurrentHashMap does not provide such list-copying methods, precisely because of the overhead of copying all keys or values into a new collection.
- The removeNullValue method, which is really wrong and should not exist.

So let me just split the huge PR into multiple relatively small PRs.
OK. It sounds like we'd better replace ConcurrentOpenHashMap entirely.
Now all PRs are merged, so Pulsar 4.0.0 won't contain any ConcurrentOpenHashMap. Closing this issue.