LocustDB
Ordering by grouping column performs unnecessary work
Note this portion of the query plan, which packs passenger_count together with itself:
casted_1 = column_0 as I64 TypeConversionOperator<u8, i64>
bitpacked_2 = casted_1 + (casted_1 << $shift) ParameterizedVecVecIntegerOperator<BitShiftLeftAdd>
This causes a roughly 3x slowdown on this query. When the grouping key is created, each unique expression should be used only once, so a column that is both grouped by and ordered by contributes to the key a single time.
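To illustrate the idea, here is a minimal sketch of deduplicating expressions before building a bit-packed grouping key. This is not LocustDB's actual planner code: `dedup_exprs` and `pack_key` are hypothetical names, expressions are represented as plain strings, and the fixed bit width per value is an assumption made for the example.

```rust
use std::collections::HashMap;

/// Hypothetical helper: returns the distinct expressions in first-seen
/// order, plus, for each input expression, the index of its slot in
/// that distinct list.
fn dedup_exprs(exprs: &[&str]) -> (Vec<String>, Vec<usize>) {
    let mut slot_of: HashMap<&str, usize> = HashMap::new();
    let mut unique = Vec::new();
    let mut slots = Vec::with_capacity(exprs.len());
    for &e in exprs {
        let next = unique.len();
        // Reuse the existing slot if this expression was already seen.
        let slot = *slot_of.entry(e).or_insert(next);
        if slot == next {
            unique.push(e.to_string());
        }
        slots.push(slot);
    }
    (unique, slots)
}

/// Hypothetical helper: packs one value per distinct expression into a
/// single i64 key, shifting by `bits` between values (assumes every
/// value fits in `bits` bits).
fn pack_key(values: &[i64], bits: u32) -> i64 {
    values.iter().fold(0, |key, &v| (key << bits) | v)
}
```

With the expressions `["passenger_count", "passenger_count"]`, `dedup_exprs` yields a single distinct expression and maps both the grouping and the ordering reference to slot 0, so `pack_key` receives one value and no shift-and-add step is needed.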
Full query plan:
locustdb> :explain SELECT passenger_count, count(0) FROM trips ORDER BY passenger_count LIMIT 100;
Query plan in 22412 batches
-- Stage 0 (streaming) --
column_0 = "passenger_count".0 ReadColumnData
casted_1 = column_0 as I64 TypeConversionOperator<u8, i64>
bitpacked_2 = casted_1 + (casted_1 << $shift) ParameterizedVecVecIntegerOperator<BitShiftLeftAdd>
constant_3 = Constant<Integer> Constant
count_4[bitpacked_2] += 1 VecCount<i64>
-- Stage 1 --
nonzero_indices_5 = nonzero_indices(count_4) NonzeroIndices<u32, i64>
-- Stage 2 --
= count_4[count_4 > 0] NonzeroCompact<u32>
-- Stage 3 --
casted_6 = count_4 as I64 TypeConversionOperator<u32, i64>
-- Stage 4 --
unpacked_7 = (nonzero_indices_5 >> $shift) & $mask BitUnpackOperator
casted_8 = unpacked_7 as U8 TypeConversionOperator<i64, u8>
casted_9 = casted_8 as I64 TypeConversionOperator<u8, i64>
-- Stage 5 --
unpacked_10 = (nonzero_indices_5 >> $shift) & $mask BitUnpackOperator
casted_11 = unpacked_10 as U8 TypeConversionOperator<i64, u8>
casted_12 = casted_11 as I64 TypeConversionOperator<u8, i64>