druid
Specialized FixedIndexed implementations for Java value types
Description
Adds specialized implementations of FixedIndexed for the Java long, double, and int value types. FixedIndexed is used by the nested data columns added in #12753.
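As a rough illustration of what a value-type specialization buys, here is a minimal sketch of a sorted, fixed-width dictionary of long values. It is not the code in this PR: the class name FixedLongDictionarySketch and all of its methods are hypothetical, and the negative return value of indexOf is assumed to follow the java.util.Arrays.binarySearch convention. The point is only that lookups can read and compare primitives directly from the buffer, without boxing or a generic comparator.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/**
 * Hypothetical sketch, not Druid's actual class: a sorted dictionary of
 * fixed-width long values backed by a ByteBuffer.
 */
final class FixedLongDictionarySketch {
    private final ByteBuffer buffer;
    private final int size;

    private FixedLongDictionarySketch(ByteBuffer buffer, int size) {
        this.buffer = buffer;
        this.size = size;
    }

    /** Builds the dictionary from values that are already sorted ascending. */
    static FixedLongDictionarySketch fromSorted(long[] sortedValues) {
        ByteBuffer buf = ByteBuffer.allocate(sortedValues.length * Long.BYTES)
                                   .order(ByteOrder.LITTLE_ENDIAN);
        for (long v : sortedValues) {
            buf.putLong(v);
        }
        buf.flip();
        return new FixedLongDictionarySketch(buf, sortedValues.length);
    }

    int size() {
        return size;
    }

    /** Generic-style accessor: every call boxes the value into a Long. */
    Long get(int index) {
        return getLong(index);
    }

    /** Specialized accessor: an absolute read of a primitive, no boxing. */
    long getLong(int index) {
        return buffer.getLong(index * Long.BYTES);
    }

    /**
     * Binary search on primitives. Returns the index of the value, or
     * -(insertionPoint + 1) if it is absent (assumed convention, as in
     * java.util.Arrays.binarySearch).
     */
    int indexOf(long value) {
        int lo = 0;
        int hi = size - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            long midVal = getLong(mid);
            if (midVal < value) {
                lo = mid + 1;
            } else if (midVal > value) {
                hi = mid - 1;
            } else {
                return mid;
            }
        }
        return -(lo + 1);
    }
}
```

With a structure like this, a range filter could resolve its lower and upper bounds to dictionary positions with two indexOf calls and then work on the resulting range of ids, which is one plausible reason sorted primitive dictionaries help the range filtering queries below.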
While not entirely attributable to this PR (the gains in the range filtering tests are due to #12830), repeating the earlier benchmarks shows improvement:
SELECT SUM(long1) FROM foo
SELECT SUM(JSON_VALUE(nested, '$.long1' RETURNING BIGINT)) FROM foo
old:
Benchmark (query) (rowsPerSegment) (vectorize) Mode Cnt Score Error Units
SqlNestedDataBenchmark.querySql 0 5000000 false avgt 5 36.711 ± 0.917 ms/op
SqlNestedDataBenchmark.querySql 0 5000000 force avgt 5 15.587 ± 0.276 ms/op
SqlNestedDataBenchmark.querySql 1 5000000 false avgt 5 39.224 ± 0.870 ms/op
SqlNestedDataBenchmark.querySql 1 5000000 force avgt 5 15.877 ± 0.440 ms/op
new:
Benchmark (query) (rowsPerSegment) (vectorize) Mode Cnt Score Error Units
SqlNestedDataBenchmark.querySql 0 5000000 false avgt 5 37.672 ± 1.136 ms/op
SqlNestedDataBenchmark.querySql 0 5000000 force avgt 5 15.687 ± 0.497 ms/op
SqlNestedDataBenchmark.querySql 1 5000000 false avgt 5 39.437 ± 0.690 ms/op
SqlNestedDataBenchmark.querySql 1 5000000 force avgt 5 15.978 ± 0.587 ms/op
SELECT SUM(long1), SUM(long2) FROM foo
SELECT SUM(JSON_VALUE(nested, '$.long1' RETURNING BIGINT)), SUM(JSON_VALUE(nested, '$.nesteder.long2' RETURNING BIGINT)) FROM foo
old:
Benchmark (query) (rowsPerSegment) (vectorize) Mode Cnt Score Error Units
SqlNestedDataBenchmark.querySql 2 5000000 false avgt 5 63.805 ± 1.036 ms/op
SqlNestedDataBenchmark.querySql 2 5000000 force avgt 5 30.381 ± 1.201 ms/op
SqlNestedDataBenchmark.querySql 3 5000000 false avgt 5 66.660 ± 0.806 ms/op
SqlNestedDataBenchmark.querySql 3 5000000 force avgt 5 30.341 ± 1.124 ms/op
new:
Benchmark (query) (rowsPerSegment) (vectorize) Mode Cnt Score Error Units
SqlNestedDataBenchmark.querySql 2 5000000 false avgt 5 63.825 ± 1.228 ms/op
SqlNestedDataBenchmark.querySql 2 5000000 force avgt 5 30.955 ± 0.769 ms/op
SqlNestedDataBenchmark.querySql 3 5000000 false avgt 5 67.277 ± 0.928 ms/op
SqlNestedDataBenchmark.querySql 3 5000000 force avgt 5 30.860 ± 1.023 ms/op
SELECT SUM(long1), SUM(long2), SUM(double3) FROM foo
SELECT SUM(JSON_VALUE(nested, '$.long1' RETURNING BIGINT)), SUM(JSON_VALUE(nested, '$.nesteder.long2' RETURNING BIGINT)), SUM(JSON_VALUE(nested, '$.nesteder.double3' RETURNING DOUBLE)) FROM foo
old:
Benchmark (query) (rowsPerSegment) (vectorize) Mode Cnt Score Error Units
SqlNestedDataBenchmark.querySql 4 5000000 false avgt 5 78.570 ± 1.657 ms/op
SqlNestedDataBenchmark.querySql 4 5000000 force avgt 5 37.777 ± 1.295 ms/op
SqlNestedDataBenchmark.querySql 5 5000000 false avgt 5 82.672 ± 1.010 ms/op
SqlNestedDataBenchmark.querySql 5 5000000 force avgt 5 37.887 ± 0.802 ms/op
new:
Benchmark (query) (rowsPerSegment) (vectorize) Mode Cnt Score Error Units
SqlNestedDataBenchmark.querySql 4 5000000 false avgt 5 80.173 ± 1.342 ms/op
SqlNestedDataBenchmark.querySql 4 5000000 force avgt 5 38.272 ± 0.589 ms/op
SqlNestedDataBenchmark.querySql 5 5000000 false avgt 5 84.370 ± 1.275 ms/op
SqlNestedDataBenchmark.querySql 5 5000000 force avgt 5 38.541 ± 1.137 ms/op
SELECT string1, SUM(long1) FROM foo GROUP BY 1 ORDER BY 2
SELECT JSON_VALUE(nested, '$.nesteder.string1'), SUM(JSON_VALUE(nested, '$.long1' RETURNING BIGINT)) FROM foo GROUP BY 1 ORDER BY 2
old:
Benchmark (query) (rowsPerSegment) (vectorize) Mode Cnt Score Error Units
SqlNestedDataBenchmark.querySql 6 5000000 false avgt 5 269.560 ± 1.454 ms/op
SqlNestedDataBenchmark.querySql 6 5000000 force avgt 5 157.090 ± 4.058 ms/op
SqlNestedDataBenchmark.querySql 7 5000000 false avgt 5 373.162 ± 2.871 ms/op
SqlNestedDataBenchmark.querySql 7 5000000 force avgt 5 195.213 ± 1.993 ms/op
new:
Benchmark (query) (rowsPerSegment) (vectorize) Mode Cnt Score Error Units
SqlNestedDataBenchmark.querySql 6 5000000 false avgt 5 234.328 ± 5.505 ms/op
SqlNestedDataBenchmark.querySql 6 5000000 force avgt 5 155.711 ± 5.286 ms/op
SqlNestedDataBenchmark.querySql 7 5000000 false avgt 5 383.741 ± 4.796 ms/op
SqlNestedDataBenchmark.querySql 7 5000000 force avgt 5 195.051 ± 6.225 ms/op
SELECT string1, SUM(long1), SUM(double3) FROM foo GROUP BY 1 ORDER BY 2
SELECT JSON_VALUE(nested, '$.nesteder.string1'), SUM(JSON_VALUE(nested, '$.long1' RETURNING BIGINT)), SUM(JSON_VALUE(nested, '$.nesteder.double3' RETURNING DOUBLE)) FROM foo GROUP BY 1 ORDER BY 2
old:
Benchmark (query) (rowsPerSegment) (vectorize) Mode Cnt Score Error Units
SqlNestedDataBenchmark.querySql 8 5000000 false avgt 5 251.743 ± 6.438 ms/op
SqlNestedDataBenchmark.querySql 8 5000000 force avgt 5 172.322 ± 14.814 ms/op
SqlNestedDataBenchmark.querySql 9 5000000 false avgt 5 417.454 ± 21.276 ms/op
SqlNestedDataBenchmark.querySql 9 5000000 force avgt 5 215.228 ± 9.304 ms/op
new:
Benchmark (query) (rowsPerSegment) (vectorize) Mode Cnt Score Error Units
SqlNestedDataBenchmark.querySql 8 5000000 false avgt 5 302.745 ± 6.738 ms/op
SqlNestedDataBenchmark.querySql 8 5000000 force avgt 5 168.410 ± 1.750 ms/op
SqlNestedDataBenchmark.querySql 9 5000000 false avgt 5 459.633 ± 5.099 ms/op
SqlNestedDataBenchmark.querySql 9 5000000 force avgt 5 208.978 ± 1.130 ms/op
SELECT SUM(long1) FROM foo WHERE string1 = '10000' OR string1 = '1000'
SELECT SUM(JSON_VALUE(nested, '$.long1' RETURNING BIGINT)) FROM foo WHERE JSON_VALUE(nested, '$.nesteder.string1') = '10000' OR JSON_VALUE(nested, '$.nesteder.string1') = '1000'
old:
Benchmark (query) (rowsPerSegment) (vectorize) Mode Cnt Score Error Units
SqlNestedDataBenchmark.querySql 10 5000000 false avgt 5 11.482 ± 0.495 ms/op
SqlNestedDataBenchmark.querySql 10 5000000 force avgt 5 11.549 ± 0.303 ms/op
SqlNestedDataBenchmark.querySql 11 5000000 false avgt 5 11.695 ± 0.293 ms/op
SqlNestedDataBenchmark.querySql 11 5000000 force avgt 5 11.931 ± 0.338 ms/op
new:
Benchmark (query) (rowsPerSegment) (vectorize) Mode Cnt Score Error Units
SqlNestedDataBenchmark.querySql 10 5000000 false avgt 5 11.427 ± 0.480 ms/op
SqlNestedDataBenchmark.querySql 10 5000000 force avgt 5 11.545 ± 0.431 ms/op
SqlNestedDataBenchmark.querySql 11 5000000 false avgt 5 11.650 ± 0.520 ms/op
SqlNestedDataBenchmark.querySql 11 5000000 force avgt 5 11.732 ± 0.406 ms/op
SELECT SUM(long1) FROM foo WHERE long2 = 10000 OR long2 = 1000
SELECT SUM(JSON_VALUE(nested, '$.long1' RETURNING BIGINT)) FROM foo WHERE JSON_VALUE(nested, '$.nesteder.long2' RETURNING BIGINT) = 10000 OR JSON_VALUE(nested, '$.nesteder.long2' RETURNING BIGINT) = 1000
old:
Benchmark (query) (rowsPerSegment) (vectorize) Mode Cnt Score Error Units
SqlNestedDataBenchmark.querySql 12 5000000 false avgt 5 78.895 ± 2.158 ms/op
SqlNestedDataBenchmark.querySql 12 5000000 force avgt 5 48.814 ± 0.874 ms/op
SqlNestedDataBenchmark.querySql 13 5000000 false avgt 5 1.297 ± 0.008 ms/op
SqlNestedDataBenchmark.querySql 13 5000000 force avgt 5 1.277 ± 0.011 ms/op
new:
Benchmark (query) (rowsPerSegment) (vectorize) Mode Cnt Score Error Units
SqlNestedDataBenchmark.querySql 12 5000000 false avgt 5 77.838 ± 3.391 ms/op
SqlNestedDataBenchmark.querySql 12 5000000 force avgt 5 50.000 ± 1.199 ms/op
SqlNestedDataBenchmark.querySql 13 5000000 false avgt 5 1.096 ± 0.017 ms/op
SqlNestedDataBenchmark.querySql 13 5000000 force avgt 5 1.105 ± 0.023 ms/op
SELECT SUM(long1) FROM foo WHERE double3 < 10000.0 AND double3 > 1000.0
SELECT SUM(JSON_VALUE(nested, '$.long1' RETURNING BIGINT)) FROM foo WHERE JSON_VALUE(nested, '$.nesteder.double3' RETURNING DOUBLE) < 10000.0 AND JSON_VALUE(nested, '$.nesteder.double3' RETURNING DOUBLE) > 1000.0
old:
Benchmark (query) (rowsPerSegment) (vectorize) Mode Cnt Score Error Units
SqlNestedDataBenchmark.querySql 14 5000000 false avgt 5 92.982 ± 1.473 ms/op
SqlNestedDataBenchmark.querySql 14 5000000 force avgt 5 54.729 ± 0.429 ms/op
SqlNestedDataBenchmark.querySql 15 5000000 false avgt 5 580.472 ± 28.064 ms/op
SqlNestedDataBenchmark.querySql 15 5000000 force avgt 5 561.494 ± 54.096 ms/op
new:
Benchmark (query) (rowsPerSegment) (vectorize) Mode Cnt Score Error Units
SqlNestedDataBenchmark.querySql 14 5000000 false avgt 5 93.883 ± 4.995 ms/op
SqlNestedDataBenchmark.querySql 14 5000000 force avgt 5 52.985 ± 1.123 ms/op
SqlNestedDataBenchmark.querySql 15 5000000 false avgt 5 228.775 ± 3.131 ms/op
SqlNestedDataBenchmark.querySql 15 5000000 force avgt 5 216.295 ± 2.309 ms/op
SELECT long1, SUM(double3) FROM foo WHERE string1 = '10000' OR string1 = '1000' GROUP BY 1 ORDER BY 2
SELECT JSON_VALUE(nested, '$.long1' RETURNING BIGINT), SUM(JSON_VALUE(nested, '$.nesteder.double3' RETURNING DOUBLE)) FROM foo WHERE JSON_VALUE(nested, '$.nesteder.string1') = '10000' OR JSON_VALUE(nested, '$.nesteder.string1') = '1000' GROUP BY 1 ORDER BY 2
old:
Benchmark (query) (rowsPerSegment) (vectorize) Mode Cnt Score Error Units
SqlNestedDataBenchmark.querySql 16 5000000 false avgt 5 129.760 ± 9.953 ms/op
SqlNestedDataBenchmark.querySql 16 5000000 force avgt 5 133.015 ± 20.961 ms/op
SqlNestedDataBenchmark.querySql 17 5000000 false avgt 5 142.197 ± 8.773 ms/op
SqlNestedDataBenchmark.querySql 17 5000000 force avgt 5 132.048 ± 15.546 ms/op
new:
Benchmark (query) (rowsPerSegment) (vectorize) Mode Cnt Score Error Units
SqlNestedDataBenchmark.querySql 16 5000000 false avgt 5 125.571 ± 3.131 ms/op
SqlNestedDataBenchmark.querySql 16 5000000 force avgt 5 125.625 ± 4.528 ms/op
SqlNestedDataBenchmark.querySql 17 5000000 false avgt 5 125.689 ± 2.543 ms/op
SqlNestedDataBenchmark.querySql 17 5000000 force avgt 5 126.233 ± 4.543 ms/op
SELECT string1, SUM(double3) FROM foo WHERE long2 < 10000 AND long2 > 1000 GROUP BY 1 ORDER BY 2
SELECT JSON_VALUE(nested, '$.nesteder.string1'), SUM(JSON_VALUE(nested, '$.nesteder.double3' RETURNING DOUBLE)) FROM foo WHERE JSON_VALUE(nested, '$.nesteder.long2' RETURNING BIGINT) < 10000 AND JSON_VALUE(nested, '$.nesteder.long2' RETURNING BIGINT) > 1000 GROUP BY 1 ORDER BY 2
old:
Benchmark (query) (rowsPerSegment) (vectorize) Mode Cnt Score Error Units
SqlNestedDataBenchmark.querySql 18 5000000 false avgt 5 161.445 ± 7.024 ms/op
SqlNestedDataBenchmark.querySql 18 5000000 force avgt 5 138.212 ± 19.673 ms/op
SqlNestedDataBenchmark.querySql 19 5000000 false avgt 5 123.486 ± 5.029 ms/op
SqlNestedDataBenchmark.querySql 19 5000000 force avgt 5 120.079 ± 6.822 ms/op
new:
Benchmark (query) (rowsPerSegment) (vectorize) Mode Cnt Score Error Units
SqlNestedDataBenchmark.querySql 18 5000000 false avgt 5 149.483 ± 4.772 ms/op
SqlNestedDataBenchmark.querySql 18 5000000 force avgt 5 128.978 ± 4.393 ms/op
SqlNestedDataBenchmark.querySql 19 5000000 false avgt 5 114.389 ± 4.373 ms/op
SqlNestedDataBenchmark.querySql 19 5000000 force avgt 5 114.224 ± 3.520 ms/op
SELECT string1, SUM(double3) FROM foo WHERE double3 < 10000.0 AND double3 > 1000.0 GROUP BY 1 ORDER BY 2
SELECT JSON_VALUE(nested, '$.nesteder.string1'), SUM(JSON_VALUE(nested, '$.nesteder.double3' RETURNING DOUBLE)) FROM foo WHERE JSON_VALUE(nested, '$.nesteder.double3' RETURNING DOUBLE) < 10000.0 AND JSON_VALUE(nested, '$.nesteder.double3' RETURNING DOUBLE) > 1000.0 GROUP BY 1 ORDER BY 2
old:
Benchmark (query) (rowsPerSegment) (vectorize) Mode Cnt Score Error Units
SqlNestedDataBenchmark.querySql 20 5000000 false avgt 5 280.393 ± 15.369 ms/op
SqlNestedDataBenchmark.querySql 20 5000000 force avgt 5 174.545 ± 2.702 ms/op
SqlNestedDataBenchmark.querySql 21 5000000 false avgt 5 802.647 ± 32.078 ms/op
SqlNestedDataBenchmark.querySql 21 5000000 force avgt 5 591.274 ± 16.460 ms/op
new:
Benchmark (query) (rowsPerSegment) (vectorize) Mode Cnt Score Error Units
SqlNestedDataBenchmark.querySql 20 5000000 false avgt 5 264.055 ± 4.947 ms/op
SqlNestedDataBenchmark.querySql 20 5000000 force avgt 5 172.410 ± 3.586 ms/op
SqlNestedDataBenchmark.querySql 21 5000000 false avgt 5 454.973 ± 5.799 ms/op
SqlNestedDataBenchmark.querySql 21 5000000 force avgt 5 359.288 ± 5.179 ms/op
SELECT long2 FROM foo WHERE long2 IN (1, 19, 21, 23, 25, 26, 46)
SELECT JSON_VALUE(nested, '$.nesteder.long2' RETURNING BIGINT) FROM foo WHERE JSON_VALUE(nested, '$.nesteder.long2' RETURNING BIGINT) IN (1, 19, 21, 23, 25, 26, 46)
old:
Benchmark (query) (rowsPerSegment) (vectorize) Mode Cnt Score Error Units
SqlNestedDataBenchmark.querySql 22 5000000 false avgt 5 273.464 ± 15.731 ms/op
SqlNestedDataBenchmark.querySql 22 5000000 force avgt 5 272.270 ± 20.511 ms/op
SqlNestedDataBenchmark.querySql 23 5000000 false avgt 5 174.960 ± 1.923 ms/op
SqlNestedDataBenchmark.querySql 23 5000000 force avgt 5 177.920 ± 4.095 ms/op
new:
Benchmark (query) (rowsPerSegment) (vectorize) Mode Cnt Score Error Units
SqlNestedDataBenchmark.querySql 22 5000000 false avgt 5 283.230 ± 6.222 ms/op
SqlNestedDataBenchmark.querySql 22 5000000 force avgt 5 282.467 ± 5.709 ms/op
SqlNestedDataBenchmark.querySql 23 5000000 false avgt 5 177.047 ± 4.990 ms/op
SqlNestedDataBenchmark.querySql 23 5000000 force avgt 5 172.208 ± 1.031 ms/op
SELECT long2 FROM foo WHERE long2 IN (1, 19, 21, 23, 25, 26, 46) GROUP BY 1
SELECT JSON_VALUE(nested, '$.nesteder.long2' RETURNING BIGINT) FROM foo WHERE JSON_VALUE(nested, '$.nesteder.long2' RETURNING BIGINT) IN (1, 19, 21, 23, 25, 26, 46) GROUP BY 1
old:
Benchmark (query) (rowsPerSegment) (vectorize) Mode Cnt Score Error Units
SqlNestedDataBenchmark.querySql 24 5000000 false avgt 5 318.280 ± 7.544 ms/op
SqlNestedDataBenchmark.querySql 24 5000000 force avgt 5 210.866 ± 14.684 ms/op
SqlNestedDataBenchmark.querySql 25 5000000 false avgt 5 215.200 ± 2.366 ms/op
SqlNestedDataBenchmark.querySql 25 5000000 force avgt 5 152.399 ± 22.695 ms/op
new:
Benchmark (query) (rowsPerSegment) (vectorize) Mode Cnt Score Error Units
SqlNestedDataBenchmark.querySql 24 5000000 false avgt 5 308.668 ± 7.274 ms/op
SqlNestedDataBenchmark.querySql 24 5000000 force avgt 5 199.587 ± 3.162 ms/op
SqlNestedDataBenchmark.querySql 25 5000000 false avgt 5 212.399 ± 4.406 ms/op
SqlNestedDataBenchmark.querySql 25 5000000 force avgt 5 149.650 ± 3.489 ms/op
SELECT SUM(long1) FROM foo WHERE double3 < 1005.0 AND double3 > 1000.0
SELECT SUM(JSON_VALUE(nested, '$.long1' RETURNING BIGINT)) FROM foo WHERE JSON_VALUE(nested, '$.nesteder.double3' RETURNING DOUBLE) < 1005.0 AND JSON_VALUE(nested, '$.nesteder.double3' RETURNING DOUBLE) > 1000.0
old:
(not previously measured)
new:
Benchmark (query) (rowsPerSegment) (vectorize) Mode Cnt Score Error Units
SqlNestedDataBenchmark.querySql 26 5000000 false avgt 5 74.495 ± 1.506 ms/op
SqlNestedDataBenchmark.querySql 26 5000000 force avgt 5 48.270 ± 0.806 ms/op
SqlNestedDataBenchmark.querySql 27 5000000 false avgt 5 12.997 ± 0.509 ms/op
SqlNestedDataBenchmark.querySql 27 5000000 force avgt 5 13.094 ± 0.553 ms/op
SELECT SUM(long1) FROM foo WHERE double3 < 2000.0 AND double3 > 1000.0
SELECT SUM(JSON_VALUE(nested, '$.long1' RETURNING BIGINT)) FROM foo WHERE JSON_VALUE(nested, '$.nesteder.double3' RETURNING DOUBLE) < 2000.0 AND JSON_VALUE(nested, '$.nesteder.double3' RETURNING DOUBLE) > 1000.0
old:
(not previously measured)
new:
Benchmark (query) (rowsPerSegment) (vectorize) Mode Cnt Score Error Units
SqlNestedDataBenchmark.querySql 28 5000000 false avgt 5 79.335 ± 2.919 ms/op
SqlNestedDataBenchmark.querySql 28 5000000 force avgt 5 51.953 ± 1.056 ms/op
SqlNestedDataBenchmark.querySql 29 5000000 false avgt 5 40.987 ± 0.710 ms/op
SqlNestedDataBenchmark.querySql 29 5000000 force avgt 5 40.654 ± 0.713 ms/op
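To put the larger changes in relative terms, the old and new avgt scores can be divided. The snippet below is only a convenience sketch: the class SpeedupSketch and its ratio method are hypothetical helpers, and the inputs are the non-vectorized scores for the nested-column queries 13, 15, and 21 copied from the tables above.

```java
/**
 * Hypothetical helper, not part of the benchmark code: relative speedup
 * computed from two JMH average-time scores in ms/op.
 */
final class SpeedupSketch {
    /** Returns old/new, so values above 1.0 mean the new run is faster. */
    static double ratio(double oldMsPerOp, double newMsPerOp) {
        return oldMsPerOp / newMsPerOp;
    }

    public static void main(String[] args) {
        // Scores taken from the vectorize=false rows above.
        System.out.printf("query 13: %.2fx%n", ratio(1.297, 1.096));
        System.out.printf("query 15: %.2fx%n", ratio(580.472, 228.775));
        System.out.printf("query 21: %.2fx%n", ratio(802.647, 454.973));
    }
}
```

These ratios ignore the reported error margins, so small differences between old and new runs (most of the SUM-only queries) should be read as noise rather than as regressions or gains.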
This PR has:
- [ ] been self-reviewed.
- [ ] using the concurrency checklist (Remove this item if the PR doesn't have any relation to concurrency.)
- [ ] added documentation for new or modified features or behaviors.
- [ ] added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
- [ ] added or updated version, license, or notice information in licenses.yaml
- [ ] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
- [ ] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
- [ ] added integration tests.
- [ ] been tested in a test Druid cluster.
This pull request has been marked as stale due to 60 days of inactivity. It will be closed in 4 weeks if no further activity occurs. If you think that's incorrect or this pull request should instead be reviewed, please simply write any comment. Even if closed, you can still revive the PR at any time or discuss it on the [email protected] list. Thank you for your contributions.