scala-hashing
Investigate Performance Regression with Scala 2.13.0
bench/jmh:run -i 3 -wi 3 -f1 XxHash64Bench.com_desmondyeung_hashing
2.12.9
[info] Benchmark (inputSize) Mode Cnt Score Error Units
[info] XxHash64Bench.com_desmondyeung_hashing 8 thrpt 3 185778619.019 ± 1934781.573 ops/s
[info] XxHash64Bench.com_desmondyeung_hashing 128 thrpt 3 56514066.010 ± 634268.161 ops/s
[info] XxHash64Bench.com_desmondyeung_hashing 512 thrpt 3 23600141.441 ± 810006.785 ops/s
[info] XxHash64Bench.com_desmondyeung_hashing 1024 thrpt 3 13299685.393 ± 717831.068 ops/s
[info] XxHash64Bench.com_desmondyeung_hashing 1536 thrpt 3 9246722.872 ± 260972.021 ops/s
[info] XxHash64Bench.com_desmondyeung_hashing 2048 thrpt 3 7056145.724 ± 265236.056 ops/s
[success] Total time: 184 s, completed Aug 21, 2019, 3:15:13 PM
2.13.0
[info] Benchmark (inputSize) Mode Cnt Score Error Units
[info] XxHash64Bench.com_desmondyeung_hashing 8 thrpt 3 184298585.739 ± 2062166.900 ops/s
[info] XxHash64Bench.com_desmondyeung_hashing 128 thrpt 3 52045201.983 ± 3229937.845 ops/s
[info] XxHash64Bench.com_desmondyeung_hashing 512 thrpt 3 21511399.488 ± 311140.849 ops/s
[info] XxHash64Bench.com_desmondyeung_hashing 1024 thrpt 3 11933307.339 ± 745263.751 ops/s
[info] XxHash64Bench.com_desmondyeung_hashing 1536 thrpt 3 8226233.328 ± 404379.004 ops/s
[info] XxHash64Bench.com_desmondyeung_hashing 2048 thrpt 3 6321480.430 ± 312725.166 ops/s
[success] Total time: 190 s, completed Aug 21, 2019, 3:11:30 PM
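To make the size of the regression easier to read, the relative change per input size can be computed from the two tables above. This is only a rough sketch: the object and method names (`RelativeChange`, `relativeChangePct`) are made up for illustration, and the scores are copied by hand from the runs above.

```scala
object RelativeChange {
  // Throughput scores (ops/s) copied from the runs above, keyed by inputSize.
  val scala212: Map[Int, Double] = Map(
    8 -> 185778619.019, 128 -> 56514066.010, 512 -> 23600141.441,
    1024 -> 13299685.393, 1536 -> 9246722.872, 2048 -> 7056145.724
  )
  val scala213: Map[Int, Double] = Map(
    8 -> 184298585.739, 128 -> 52045201.983, 512 -> 21511399.488,
    1024 -> 11933307.339, 1536 -> 8226233.328, 2048 -> 6321480.430
  )

  // Relative change in percent; a negative value means the candidate is slower.
  def relativeChangePct(baseline: Double, candidate: Double): Double =
    (candidate / baseline - 1.0) * 100.0

  // Change from 2.12.9 (baseline) to 2.13.0 for every input size.
  val changes: Map[Int, Double] =
    scala212.map { case (size, base) => size -> relativeChangePct(base, scala213(size)) }

  def main(args: Array[String]): Unit =
    changes.toSeq.sortBy(_._1).foreach { case (size, pct) =>
      println(f"inputSize=$size%4d  change=$pct%+.2f%%")
    }
}
```

With only 3 measurement iterations per run, small differences should be read with care: the two inputSize 8 scores, for example, differ by less than their reported error margins.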
@desmondyeung have you considered inlining all prime constants manually?
@plokhotnyuk yes, I had originally not inlined them with 2.12.9 because I actually found that it made performance much worse. It seems that inlining them with 2.13 does help with larger inputs, but it's still slower than 2.12.9 without inlining.
2.12.9 inlined prime constants
[info] Benchmark (inputSize) Mode Cnt Score Error Units
[info] XxHash64Bench.com_desmondyeung_hashing 8 thrpt 3 166010140.218 ± 13125301.638 ops/s
[info] XxHash64Bench.com_desmondyeung_hashing 128 thrpt 3 49034132.659 ± 933867.058 ops/s
[info] XxHash64Bench.com_desmondyeung_hashing 512 thrpt 3 19074074.413 ± 1103220.964 ops/s
[info] XxHash64Bench.com_desmondyeung_hashing 1024 thrpt 3 10842131.559 ± 757732.072 ops/s
[info] XxHash64Bench.com_desmondyeung_hashing 1536 thrpt 3 7495990.225 ± 2007210.588 ops/s
[info] XxHash64Bench.com_desmondyeung_hashing 2048 thrpt 3 5684602.006 ± 105917.575 ops/s
[success] Total time: 118 s, completed Aug 23, 2019, 3:24:31 PM
2.13.0 inlined prime constants
[info] Benchmark (inputSize) Mode Cnt Score Error Units
[info] XxHash64Bench.com_desmondyeung_hashing 8 thrpt 3 156177235.277 ± 7133756.881 ops/s
[info] XxHash64Bench.com_desmondyeung_hashing 128 thrpt 3 51057573.056 ± 1136897.911 ops/s
[info] XxHash64Bench.com_desmondyeung_hashing 512 thrpt 3 22916472.679 ± 1689274.899 ops/s
[info] XxHash64Bench.com_desmondyeung_hashing 1024 thrpt 3 12553725.655 ± 881055.406 ops/s
[info] XxHash64Bench.com_desmondyeung_hashing 1536 thrpt 3 9083090.027 ± 496895.445 ops/s
[info] XxHash64Bench.com_desmondyeung_hashing 2048 thrpt 3 7008018.944 ± 261122.178 ops/s
[success] Total time: 118 s, completed Aug 23, 2019, 3:13:59 PM
with final prime constants:
public final long hashBytes(byte[], long, int, long);
Code:
0: lload_2
1: lstore 9
3: iload 4
5: istore 11
7: iload 4
9: bipush 32
11: if_icmplt 359
14: lload 5
16: ldc2_w #48 // long -7046029288634856825l
19: ladd
20: ldc2_w #51 // long -4417276706812531889l
23: ladd
24: lstore 12
26: lload 5
28: ldc2_w #51 // long -4417276706812531889l
31: ladd
32: lstore 14
34: lload 5
36: lstore 16
38: lload 5
40: ldc2_w #48 // long -7046029288634856825l
with non-final prime constants:
public final long hashBytes(byte[], long, int, long);
Code:
0: lload_2
1: lstore 9
3: iload 4
5: istore 11
7: iload 4
9: bipush 32
11: if_icmplt 227
14: lload 5
16: aload_0
17: invokevirtual #111 // Method Prime1:()J
20: ladd
21: aload_0
22: invokevirtual #104 // Method Prime2:()J
25: ladd
26: lstore 12
28: lload 5
30: aload_0
31: invokevirtual #104 // Method Prime2:()J
34: ladd
35: lstore 14
37: lload 5
39: lstore 16
41: lload 5
43: aload_0
44: invokevirtual #111 // Method Prime1:()J
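For reference, a minimal sketch of the two declaration styles that would produce the difference seen in the listings above. In Scala 2, a `final val` with no type annotation and a literal right-hand side is a constant value definition, so the compiler substitutes the literal at each use site (the `ldc2_w` instructions). An ordinary `val` is read through its accessor method (the `invokevirtual Prime1:()J` calls). The object names and the `seed` parameter name are assumptions for illustration and are not taken from the library source; in the listings the accessors are called on `this` rather than on a separate object.

```scala
object PrimesFinal {
  // Constant value definitions: final, no type annotation, literal right-hand side.
  // scalac replaces reads of these with the literal itself.
  final val Prime1 = -7046029288634856825L
  final val Prime2 = -4417276706812531889L
}

object PrimesNonFinal {
  // Ordinary vals: reads compile to calls of the accessors Prime1() and Prime2().
  val Prime1: Long = -7046029288634856825L
  val Prime2: Long = -4417276706812531889L
}

object InitSketch {
  // Mirrors the first accumulator in the bytecode: value + Prime1 + Prime2.
  // Long addition wraps on overflow, which is expected here.
  def v1Final(seed: Long): Long =
    seed + PrimesFinal.Prime1 + PrimesFinal.Prime2

  def v1NonFinal(seed: Long): Long =
    seed + PrimesNonFinal.Prime1 + PrimesNonFinal.Prime2
}
```

Both variants compute the same value; only the emitted bytecode differs, which can be checked by compiling the sketch and running `javap -c` on `InitSketch$`.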