velox
velox copied to clipboard
Adding decimal support for min() and max() functions
Delivers https://github.com/facebookincubator/velox/issues/9004
This PR is for adding decimal support for min() and max() functions.
presto-cli output:
presto:tpch> SELECT MIN(Col,2) FROM (VALUES cast(0.82 as decimal(5,4)), cast(2.333 as decimal(5,4)), cast(3.132 as decimal(5,4)), cast(4.344 as decimal(5,4))) AS X(Col);
_col0
------------------
[0.8200, 2.3330]
(1 row)
presto:tpch> SELECT MIN(Col,3) FROM (VALUES cast(0.82 as decimal(5,4)), cast(2.333 as decimal(5,4)), cast(3.132 as decimal(5,4)), cast(4.344 as decimal(5,4))) AS X(Col);
_col0
--------------------------
[0.8200, 2.3330, 3.1320]
(1 row)
presto:tpch> SELECT MAX(Col,2) FROM (VALUES cast(0.82 as decimal(5,4)), cast(2.333 as decimal(5,4)), cast(3.132 as decimal(5,4)), cast(4.344 as decimal(5,4))) AS X(Col);
_col0
------------------
[4.3440, 3.1320]
(1 row)
presto:tpch> SELECT MAX(Col,3) FROM (VALUES cast(0.82 as decimal(5,4)), cast(2.333 as decimal(5,4)), cast(3.132 as decimal(5,4)), cast(4.344 as decimal(5,4))) AS X(Col);
_col0
--------------------------
[4.3440, 3.1320, 2.3330]
(1 row)
Deploy Preview for meta-velox canceled.
Name | Link |
---|---|
Latest commit | 8abad4896ee1465b526b3337406adaccfc4c0fb3 |
Latest deploy log | https://app.netlify.com/sites/meta-velox/deploys/66280a4fae26f100087498f1 |
@minhancao The linux-adapters job failed. This is likely an alignment issue.
int32_t accumulatorAlignmentSize() const override {
return 1;
}
Please override this method:
int32_t accumulatorAlignmentSize() const override {
return 1; --> must return size of int128_t in case of long decimals.
}
@karteekmurthys I see this piece of code on line 99 of MinMaxAggregates.cpp, would this already have taken care of it?
/// Override 'accumulatorAlignmentSize' for UnscaledLongDecimal values as it
/// uses int128_t type. Some CPUs don't support misaligned access to int128_t
/// type.
template <>
inline int32_t MinMaxAggregate<int128_t>::accumulatorAlignmentSize() const {
return static_cast<int32_t>(sizeof(int128_t));
}
Also, the accumulator internally uses a heap for sorting. This heap allocator must be an aligned allocator:
template <typename T, typename Compare>
struct MinMaxNAccumulator {
int64_t n{0};
std::vector<T, StlAllocator<T>> heapValues;
explicit MinMaxNAccumulator(HashStringAllocator* allocator)
: heapValues{StlAllocator<T>(allocator)} {}
int64_t getN() const {
return n;
}
size_t size() const {
return heapValues.size();
}
Refer this:https://github.com/facebookincubator/velox/pull/8723/files#diff-5091b1dd8232c357067be47fb0e751772f91a0aa12dea5db835a2d4ae296a63cR93
Hi @karteekmurthys, I have made the change that you have requested, please review when you can. thanks!
@minhancao : Looks good. Thanks for contributing !
@Yuhta would you please review this?
Hi @Yuhta , can you review this PR when you can please? Thanks!
@Yuhta I have ran locally the velox aggregation fuzzer test: min() function: command ran:
_build/debug/velox/functions/prestosql/fuzzer/velox_aggregation_fuzzer_test --seed 123456 --duration_sec 1200 --logtostderr=1 --minloglevel=0 --only min --v=1 2>&1 | tee agg_fuzzer_test_min_20mins.out
output:
I20240422 15:00:08.498713 3874823 AggregationFuzzer.cpp:456] ==============================> Done with iteration 442
I20240422 15:00:08.498742 3874823 AggregationFuzzer.cpp:993] Total masked aggregations: 67 (15.12%)
I20240422 15:00:08.498764 3874823 AggregationFuzzer.cpp:995] Total global aggregations: 33 (7.45%)
I20240422 15:00:08.498781 3874823 AggregationFuzzer.cpp:997] Total group-by aggregations: 325 (73.36%)
I20240422 15:00:08.498798 3874823 AggregationFuzzer.cpp:999] Total distinct aggregations: 46 (10.38%)
I20240422 15:00:08.498817 3874823 AggregationFuzzer.cpp:1001] Total aggregations over distinct inputs: 69 (15.58%)
I20240422 15:00:08.498834 3874823 AggregationFuzzer.cpp:1003] Total window expressions: 39 (8.80%)
I20240422 15:00:08.498852 3874823 AggregationFuzzerBase.cpp:647] Total functions tested: 1
I20240422 15:00:08.498868 3874823 AggregationFuzzerBase.cpp:648] Total iterations requiring sorted inputs: 71 (16.03%)
I20240422 15:00:08.498883 3874823 AggregationFuzzerBase.cpp:650] Total iterations verified against reference DB: 41 (9.26%)
I20240422 15:00:08.498899 3874823 AggregationFuzzerBase.cpp:652] Total functions not verified (verification skipped / not supported by reference DB / reference DB failed): 39 (8.80%) / 209 (47.18%) / 125 (28.22%)
I20240422 15:00:08.498917 3874823 AggregationFuzzerBase.cpp:657] Total failed functions: 32 (7.22%)
[==========] Running 0 tests from 0 test suites.
[==========] 0 tests from 0 test suites ran. (0 ms total)
[ PASSED ] 0 tests.
max() function: command ran:
_build/debug/velox/functions/prestosql/fuzzer/velox_aggregation_fuzzer_test --seed 123456 --duration_sec 1200 --logtostderr=1 --minloglevel=0 --only max --v=1 2>&1 | tee agg_fuzzer_test_max_20mins.out
output:
I20240422 15:59:21.033170 3979822 AggregationFuzzer.cpp:456] ==============================> Done with iteration 451
I20240422 15:59:21.033193 3979822 AggregationFuzzer.cpp:993] Total masked aggregations: 68 (15.04%)
I20240422 15:59:21.033223 3979822 AggregationFuzzer.cpp:995] Total global aggregations: 34 (7.52%)
I20240422 15:59:21.033241 3979822 AggregationFuzzer.cpp:997] Total group-by aggregations: 333 (73.67%)
I20240422 15:59:21.033258 3979822 AggregationFuzzer.cpp:999] Total distinct aggregations: 46 (10.18%)
I20240422 15:59:21.033274 3979822 AggregationFuzzer.cpp:1001] Total aggregations over distinct inputs: 70 (15.49%)
I20240422 15:59:21.033290 3979822 AggregationFuzzer.cpp:1003] Total window expressions: 39 (8.63%)
I20240422 15:59:21.033306 3979822 AggregationFuzzerBase.cpp:647] Total functions tested: 1
I20240422 15:59:21.033322 3979822 AggregationFuzzerBase.cpp:648] Total iterations requiring sorted inputs: 74 (16.37%)
I20240422 15:59:21.033339 3979822 AggregationFuzzerBase.cpp:650] Total iterations verified against reference DB: 41 (9.07%)
I20240422 15:59:21.033355 3979822 AggregationFuzzerBase.cpp:652] Total functions not verified (verification skipped / not supported by reference DB / reference DB failed): 39 (8.63%) / 217 (48.01%) / 126 (27.88%)
I20240422 15:59:21.033372 3979822 AggregationFuzzerBase.cpp:657] Total failed functions: 32 (7.08%)
[==========] Running 0 tests from 0 test suites.
[==========] 0 tests from 0 test suites ran. (0 ms total)
[ PASSED ] 0 tests.
The branch seems messed up
@Yuhta Fixed the branch
@Yuhta has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
@Yuhta merged this pull request in facebookincubator/velox@d97ddb410c8f3a957425bbfd91edbe7f016300de.
Conbench analyzed the 1 benchmark run on commit d97ddb41
.
There were no benchmark performance regressions. 🎉
The full Conbench report has more details.