[TEST] Test new implementation of ConsumerRecords#records(String) in CI
I have make a new implementation of ConsumerRecords#records(String) and I want to test this implementation in CI.
Committer Checklist (excluded from commit message)
- [ ] Verify design and implementation
- [ ] Verify test coverage and CI build status
- [ ] Verify documentation (including upgrade notes)
Hi @chia7712,
Here are the benchmark results after adding Blackhole to prevent JVM optimization. The new implementation is slightly slower, which indicates that the previous version was indeed being optimized by the JVM. However, the new implementation is still significantly faster than the current one.
# JMH version: 1.37
# VM version: JDK 17.0.11, OpenJDK 64-Bit Server VM, 17.0.11+9-LTS
# VM invoker: /Users/frankvicky/.sdkman/candidates/java/17.0.11-amzn/bin/java
# Blackhole mode: compiler (auto-detected, use -Djmh.blackhole.autoDetect=false to disable)
# Warmup: 5 iterations, 10 s each
# Measurement: 10 iterations, 10 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: org.apache.kafka.jmh.record.ConsumerRecordsBenchmark.recordsWithFilterIterator
Benchmark Mode Cnt Score Error Units
ConsumerRecordsBenchmark.records avgt 10 61.244 ± 0.194 ns/op
ConsumerRecordsBenchmark.recordsWithFilterIterator avgt 10 2.371 ± 0.046 ns/op
JMH benchmarks done
Hello @chia7712 , I have write another implementation which is almost same as original logic. The different of new one is that it filter the TopicPartition in the ConcatenatedIterable to avoid creating double array list.
# JMH version: 1.37
# VM version: JDK 17.0.11, OpenJDK 64-Bit Server VM, 17.0.11+9-LTS
# VM invoker: /Users/frankvicky/.sdkman/candidates/java/17.0.11-amzn/bin/java
# Blackhole mode: compiler (auto-detected, use -Djmh.blackhole.autoDetect=false to disable)
# Warmup: 5 iterations, 10 s each
# Measurement: 10 iterations, 10 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# original one
ConsumerRecordsBenchmark.records avgt 10 61.881 ± 0.872 ns/op
# latest
ConsumerRecordsBenchmark.records2 avgt 10 2.206 ± 0.007 ns/op
# filter each records in ConcatenatedIterable
ConsumerRecordsBenchmark.recordsWithFilterIterator avgt 10 2.344 ± 0.021 ns/op
Hi @chia7712 I have refactor the method, PTAL 🐱
Hi @chia7712 I have both iterate test and init test, PTAL
# JMH version: 1.37
# VM version: JDK 17.0.11, OpenJDK 64-Bit Server VM, 17.0.11+9-LTS
# VM options: <none>
# Blackhole mode: compiler (auto-detected, use -Djmh.blackhole.autoDetect=false to disable)
# Warmup: 5 iterations, 10 s each
# Measurement: 10 iterations, 10 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: org.apache.kafka.jmh.record.ConsumerRecordsBenchmark.recordsByNewImplementation
Benchmark Mode Cnt Score Error Units
ConsumerRecordsBenchmark.iteratorRecords avgt 10 288547.940 ± 25990.307 ns/op
ConsumerRecordsBenchmark.iteratorRecordsByNewImplementation avgt 10 266188.726 ± 40392.469 ns/op
ConsumerRecordsBenchmark.records avgt 10 61.363 ± 0.084 ns/op
ConsumerRecordsBenchmark.recordsByNewImplementation avgt 10 1.113 ± 0.004 ns/op
@frankvicky the jmh result is good to me. Could you please adjust the PR to add a subclass of ConsumerRecords? that subclass will use the legacy records(String) and then please rerun the jmh again. Thus, we can have the new impl in the production and the legacy code in jmh for comparison.
Hi @chia7712 I have make some refactors based on comment, PTAL 😃
@frankvicky please add jmh result (according to latest commit) to the description
Hi @chia7712 I have move the subclass to jmh module, and following is the latest benchamrk:
# JMH version: 1.37
# VM version: JDK 17.0.11, OpenJDK 64-Bit Server VM, 17.0.11+9-LTS
# VM options: <none>
# Blackhole mode: compiler (auto-detected, use -Djmh.blackhole.autoDetect=false to disable)
# Warmup: 5 iterations, 10 s each
# Measurement: 10 iterations, 10 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
Benchmark Mode Cnt Score Error Units
ConsumerRecordsBenchmark.iteratorRecords avgt 10 553053.483 ± 13412.339 ns/op
ConsumerRecordsBenchmark.iteratorRecordsByLegacyImplementation avgt 10 498448.180 ± 37297.950 ns/op
ConsumerRecordsBenchmark.records avgt 10 1.120 ± 0.004 ns/op
ConsumerRecordsBenchmark.recordsWithLegacyImplementation avgt 10 61.529 ± 0.391 ns/op
JMH benchmarks done
The benchmark of recent commits:
# JMH version: 1.37
# VM version: JDK 17.0.11, OpenJDK 64-Bit Server VM, 17.0.11+9-LTS
# VM options: <none>
# Blackhole mode: compiler (auto-detected, use -Djmh.blackhole.autoDetect=false to disable)
# Warmup: 5 iterations, 10 s each
# Measurement: 10 iterations, 10 s each
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
Benchmark Mode Cnt Score Error Units
ConsumerRecordsBenchmark.iteratorRecords avgt 10 544914.518 ± 18116.955 ns/op
ConsumerRecordsBenchmark.iteratorRecordsByLegacyImplementation avgt 10 536728.635 ± 10104.066 ns/op
ConsumerRecordsBenchmark.records avgt 10 1.116 ± 0.003 ns/op
ConsumerRecordsBenchmark.recordsWithLegacyImplementation avgt 10 60.926 ± 0.027 ns/op
JMH benchmarks done
I have increased the number of warmup iterations to make the benchmark results more stable.
# JMH version: 1.37
# VM version: JDK 17.0.11, OpenJDK 64-Bit Server VM, 17.0.11+9-LTS
# VM options: <none>
# Blackhole mode: compiler (auto-detected, use -Djmh.blackhole.autoDetect=false to disable)
# Warmup: 10 iterations, 10 s each
# Measurement: 10 iterations, 10 s each
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
Benchmark Mode Cnt Score Error Units
ConsumerRecordsBenchmark.iteratorRecords avgt 10 553151.074 ± 2273.429 ns/op
ConsumerRecordsBenchmark.iteratorRecordsByLegacyImplementation avgt 10 566207.722 ± 12791.416 ns/op
ConsumerRecordsBenchmark.records avgt 10 1.117 ± 0.002 ns/op
ConsumerRecordsBenchmark.recordsWithLegacyImplementation avgt 10 61.072 ± 0.026 ns/op
@frankvicky Could you please revise the topic ?
@frankvicky could you please rebase code to run CI again?
@dajac Do you have free cycle to take a look at this PR? It brings a bit performance improvement when the ConsumerRecords have a bunch of partitions.
Hi @chia7712 I have merged the latest trunk into it.