Update datasketches version from 0.8.3 to 3.2.0
Motivation
sketches-core 0.8.3 was released in 2016, so we should bring the dependency up to date. The project has since moved to the Apache Software Foundation.
Changes
Update sketches-core to the latest Apache release (3.2.0).
Performance test (using JMH)
The performance tests do not show a throughput regression.
0.8.3
Iteration 1: 22440898.750 ops/s
Iteration 2: 22446008.035 ops/s
Iteration 3: 22448625.120 ops/s
Iteration 4: 21927593.096 ops/s
Iteration 5: 21975718.907 ops/s
3.2.0
Iteration 1: 21653606.740 ops/s
Iteration 2: 21570153.077 ops/s
Iteration 3: 21117009.634 ops/s
Iteration 4: 22191429.289 ops/s
Iteration 5: 22220934.512 ops/s
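To put a number on the comparison, here is a minimal sketch that averages the five iterations listed above for each version and prints the relative change. The class and method names are hypothetical and only for illustration; the input values are copied from the results above. By this calculation the 3.2.0 mean comes out about 2.2% below the 0.8.3 mean, which is of the same order as the spread between iterations within each run.

```java
import java.util.Arrays;

// Hypothetical helper, not part of this PR: compares the mean throughput
// of the two JMH runs reported above.
public class ThroughputComparison {
    // ops/s per iteration, copied from the 0.8.3 run
    static final double[] V083 = {
        22440898.750, 22446008.035, 22448625.120, 21927593.096, 21975718.907
    };
    // ops/s per iteration, copied from the 3.2.0 run
    static final double[] V320 = {
        21653606.740, 21570153.077, 21117009.634, 22191429.289, 22220934.512
    };

    // Arithmetic mean of the samples (NaN for an empty array).
    static double mean(double[] samples) {
        return Arrays.stream(samples).average().orElse(Double.NaN);
    }

    // Relative change of candidate versus baseline, in percent.
    static double relativeChangePercent(double baseline, double candidate) {
        return (candidate - baseline) / baseline * 100.0;
    }

    public static void main(String[] args) {
        double oldMean = mean(V083);
        double newMean = mean(V320);
        System.out.printf("0.8.3 mean: %.3f ops/s%n", oldMean);
        System.out.printf("3.2.0 mean: %.3f ops/s%n", newMean);
        System.out.printf("change:     %.2f%%%n", relativeChangePercent(oldMean, newMean));
    }
}
```

Five iterations per version is a small sample, so this is only a rough indication and not a statistical test.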
@dlg99 @nicoloboschi @eolivelli @merlimat PTAL
@Shoothzj Can you also check the number of allocations, using -prof gc?
I remember that some earlier versions of datasketches (after 0.8.3) introduced a regression that added a lot of heap allocations.
@Shoothzj I just did a quick test. The new DataSketches allocates about 4 bytes per recorded sample. It would be good to understand why that is the case and whether there is any way to configure DataSketches to avoid it.
DataSketches 0.8.3
Benchmark                                                        (statsProvider)  Mode   Cnt   Score  Error     Units
StatsLoggerBenchmark.recordLatency                               Prometheus       thrpt  3    15.203  ±  2.787  ops/us
StatsLoggerBenchmark.recordLatency:·gc.alloc.rate                Prometheus       thrpt  3     0.023  ±  0.368  MB/sec
StatsLoggerBenchmark.recordLatency:·gc.alloc.rate.norm           Prometheus       thrpt  3     0.002  ±  0.027  B/op
StatsLoggerBenchmark.recordLatency:·gc.churn.G1_Eden_Space       Prometheus       thrpt  3     1.603  ± 50.660  MB/sec
StatsLoggerBenchmark.recordLatency:·gc.churn.G1_Eden_Space.norm  Prometheus       thrpt  3     0.116  ±  3.650  B/op
StatsLoggerBenchmark.recordLatency:·gc.count                     Prometheus       thrpt  3     1.000            counts
StatsLoggerBenchmark.recordLatency:·gc.time                      Prometheus       thrpt  3     2.000            ms
StatsLoggerBenchmark.recordLatency:·stack                        Prometheus       thrpt          NaN            ---
DataSketches 3.2.0
Benchmark                                                        (statsProvider)  Mode   Cnt   Score  Error     Units
StatsLoggerBenchmark.recordLatency                               Prometheus       thrpt  3    15.965  ±  9.438  ops/us
StatsLoggerBenchmark.recordLatency:·gc.alloc.rate                Prometheus       thrpt  3    63.314  ± 35.780  MB/sec
StatsLoggerBenchmark.recordLatency:·gc.alloc.rate.norm           Prometheus       thrpt  3     4.377  ±  0.023  B/op
StatsLoggerBenchmark.recordLatency:·gc.churn.G1_Eden_Space       Prometheus       thrpt  3    57.793  ±  4.866  MB/sec
StatsLoggerBenchmark.recordLatency:·gc.churn.G1_Eden_Space.norm  Prometheus       thrpt  3     3.998  ±  2.456  B/op
StatsLoggerBenchmark.recordLatency:·gc.count                     Prometheus       thrpt  3     3.000            counts
StatsLoggerBenchmark.recordLatency:·gc.time                      Prometheus       thrpt  3     4.000            ms
StatsLoggerBenchmark.recordLatency:·stack                        Prometheus       thrpt          NaN            ---
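As a sanity check on the 3.2.0 numbers, the allocation rate should roughly equal throughput times bytes allocated per operation. The sketch below does that arithmetic with the scores from the table above. The class and method names are hypothetical, and it assumes that the MB in the JMH output means 2^20 bytes. Under that assumption the product comes to roughly 67 MB/sec, which agrees with the measured 63.314 ± 35.780 MB/sec.

```java
// Hypothetical helper, not part of this PR: cross-checks gc.alloc.rate
// against throughput multiplied by gc.alloc.rate.norm.
public class AllocationCrossCheck {
    // Converts a throughput in ops/us and a per-op allocation in B/op
    // into an allocation rate in bytes per second.
    static double bytesPerSecond(double opsPerMicrosecond, double bytesPerOp) {
        return opsPerMicrosecond * 1_000_000.0 * bytesPerOp;
    }

    // Assumption: JMH's "MB" is 2^20 bytes.
    static double toMegabytes(double bytes) {
        return bytes / (1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        // Scores copied from the DataSketches 3.2.0 table above.
        double throughput = 15.965; // ops/us
        double allocPerOp = 4.377;  // B/op
        double rate = bytesPerSecond(throughput, allocPerOp);
        System.out.printf("estimated allocation rate: %.2f MB/sec%n", toMegabytes(rate));
    }
}
```

The same calculation for 0.8.3 (15.203 ops/us at 0.002 B/op) gives a rate well below 1 MB/sec, in line with the near-zero allocation reported for that version.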
@merlimat Sorry for the late reply. I also ran this test 27 days ago, and I opened an issue at https://github.com/apache/datasketches-java/issues/398
If we are sensitive to performance, should we stay on version 0.8.3?
Fix old workflow; please see #3455 for details.
The issue has been closed: https://github.com/apache/datasketches-java/issues/398