
Update datasketches version from 0.8.3 to 3.2.0

Open hezhangjian opened this issue 3 years ago • 4 comments

Motivation

sketches-core 0.8.3 was released in 2016, so we should bring it up to date. The library has also since moved to an Apache project.

Changes

Update sketches-core to the latest Apache version.

Performance test (using jmh)

Performance tests don't show any performance regression.

0.8.3

Iteration   1: 22440898.750 ops/s
Iteration   2: 22446008.035 ops/s
Iteration   3: 22448625.120 ops/s
Iteration   4: 21927593.096 ops/s
Iteration   5: 21975718.907 ops/s

3.2.0

Iteration   1: 21653606.740 ops/s
Iteration   2: 21570153.077 ops/s
Iteration   3: 21117009.634 ops/s
Iteration   4: 22191429.289 ops/s
Iteration   5: 22220934.512 ops/s
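
To make the comparison between the two runs concrete, here is a minimal sketch that averages the five iterations listed above for each version and reports the relative change. The class and method names are hypothetical and chosen only for illustration; the numbers are copied from the iterations above. With these figures the 3.2.0 mean comes out roughly 2% below the 0.8.3 mean, and the per-iteration ranges of the two runs overlap.

```java
import java.util.Arrays;

// Hypothetical helper, not part of the project: summarizes the JMH iterations above.
public class ThroughputComparison {
    // ops/s per iteration, copied from the 0.8.3 run
    static final double[] V083 = {
        22440898.750, 22446008.035, 22448625.120, 21927593.096, 21975718.907
    };
    // ops/s per iteration, copied from the 3.2.0 run
    static final double[] V320 = {
        21653606.740, 21570153.077, 21117009.634, 22191429.289, 22220934.512
    };

    // Arithmetic mean of the samples (NaN for an empty array).
    static double mean(double[] samples) {
        return Arrays.stream(samples).average().orElse(Double.NaN);
    }

    // Relative change of the candidate mean versus the baseline mean, in percent.
    static double relativeChangePercent(double[] baseline, double[] candidate) {
        double base = mean(baseline);
        return (mean(candidate) - base) / base * 100.0;
    }

    public static void main(String[] args) {
        System.out.printf("0.8.3 mean: %.3f ops/s%n", mean(V083));
        System.out.printf("3.2.0 mean: %.3f ops/s%n", mean(V320));
        System.out.printf("relative change: %.2f%%%n", relativeChangePercent(V083, V320));
    }
}
```

Five iterations per version is a small sample, so a difference of this size is hard to separate from run-to-run noise.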

hezhangjian avatar May 09 '22 11:05 hezhangjian

@dlg99 @nicoloboschi @eolivelli @merlimat PTAL

hezhangjian avatar May 11 '22 07:05 hezhangjian

@Shoothzj Can you also check the number of allocations, using -prof gc?

I remember that some earlier versions of DataSketches (after 0.8.3) introduced a regression that added a lot of heap allocations.
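
As a rough cross-check outside JMH, bytes allocated per operation can also be estimated with the JDK's per-thread allocation counter. The sketch below is only an illustration under stated assumptions: it relies on the HotSpot-specific `com.sun.management.ThreadMXBean`, the class name `AllocationProbe` and the sample operation are hypothetical, and it does not touch BookKeeper or DataSketches code. It is not a substitute for the JMH GC profiler.

```java
import java.lang.management.ManagementFactory;

// Hypothetical probe, not part of the project: estimates heap bytes allocated per call.
public class AllocationProbe {
    // Writing to a static volatile field keeps the JIT from optimizing the allocation away.
    static volatile Object sink;

    // Average bytes allocated by the current thread per invocation of op.
    static double bytesPerOp(Runnable op, int iterations) {
        // Assumption: running on a HotSpot-based JDK where this cast succeeds.
        com.sun.management.ThreadMXBean bean =
            (com.sun.management.ThreadMXBean) ManagementFactory.getThreadMXBean();
        long threadId = Thread.currentThread().getId();

        // Warm up so JIT compilation and one-time setup are less likely to be counted.
        for (int i = 0; i < iterations; i++) {
            op.run();
        }

        long before = bean.getThreadAllocatedBytes(threadId);
        for (int i = 0; i < iterations; i++) {
            op.run();
        }
        long after = bean.getThreadAllocatedBytes(threadId);
        return (after - before) / (double) iterations;
    }

    public static void main(String[] args) {
        // Sample operation that allocates a 64-byte payload on every call.
        double perOp = bytesPerOp(() -> sink = new byte[64], 1_000_000);
        System.out.printf("approx. %.1f B/op%n", perOp);
    }
}
```

To probe the stats logger itself, the lambda would be replaced with the call that records a latency sample. The result is an estimate for a single thread and will include small measurement overheads.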

merlimat avatar May 11 '22 17:05 merlimat

@Shoothzj I just did a quick test. The new DataSketches allocates about 4 bytes per recorded sample (see gc.alloc.rate.norm below). It would be good to understand why that is the case and whether there's any way to configure DataSketches to avoid it.

DataSketches 0.8.3
Benchmark                                                           (statsProvider)   Mode  Cnt     Score     Error   Units
StatsLoggerBenchmark.recordLatency                                       Prometheus  thrpt    3    15.203 ±   2.787  ops/us
StatsLoggerBenchmark.recordLatency:·gc.alloc.rate                        Prometheus  thrpt    3     0.023 ±   0.368  MB/sec
StatsLoggerBenchmark.recordLatency:·gc.alloc.rate.norm                   Prometheus  thrpt    3     0.002 ±   0.027    B/op
StatsLoggerBenchmark.recordLatency:·gc.churn.G1_Eden_Space               Prometheus  thrpt    3     1.603 ±  50.660  MB/sec
StatsLoggerBenchmark.recordLatency:·gc.churn.G1_Eden_Space.norm          Prometheus  thrpt    3     0.116 ±   3.650    B/op
StatsLoggerBenchmark.recordLatency:·gc.count                             Prometheus  thrpt    3     1.000            counts
StatsLoggerBenchmark.recordLatency:·gc.time                              Prometheus  thrpt    3     2.000                ms
StatsLoggerBenchmark.recordLatency:·stack                                Prometheus  thrpt            NaN               ---


DataSketches 3.2.0
Benchmark                                                           (statsProvider)   Mode  Cnt     Score     Error   Units
StatsLoggerBenchmark.recordLatency                                       Prometheus  thrpt    3    15.965 ±   9.438  ops/us
StatsLoggerBenchmark.recordLatency:·gc.alloc.rate                        Prometheus  thrpt    3    63.314 ±  35.780  MB/sec
StatsLoggerBenchmark.recordLatency:·gc.alloc.rate.norm                   Prometheus  thrpt    3     4.377 ±   0.023    B/op
StatsLoggerBenchmark.recordLatency:·gc.churn.G1_Eden_Space               Prometheus  thrpt    3    57.793 ±   4.866  MB/sec
StatsLoggerBenchmark.recordLatency:·gc.churn.G1_Eden_Space.norm          Prometheus  thrpt    3     3.998 ±   2.456    B/op
StatsLoggerBenchmark.recordLatency:·gc.count                             Prometheus  thrpt    3     3.000            counts
StatsLoggerBenchmark.recordLatency:·gc.time                              Prometheus  thrpt    3     4.000                ms
StatsLoggerBenchmark.recordLatency:·stack                                Prometheus  thrpt            NaN               ---

merlimat avatar Jun 06 '22 17:06 merlimat

@merlimat Sorry for the late reply. I also tested this 27 days ago and opened an issue at https://github.com/apache/datasketches-java/issues/398. If we are sensitive to performance, should we stay on version 0.8.3?

hezhangjian avatar Jun 07 '22 12:06 hezhangjian

Fixed the old workflow; please see #3455 for details.

StevenLuMT avatar Aug 24 '22 08:08 StevenLuMT

The issue has been closed: https://github.com/apache/datasketches-java/issues/398

eolivelli avatar Mar 14 '23 08:03 eolivelli