GH-41301: [C++] Extract the kernel loops used for PrimitiveTakeExec and generalize to any fixed-width type
Rationale for this change
I want to instantiate this primitive operation in other scenarios (e.g. the optimized version of Take that handles chunked arrays) and extend the sub-classes of GatherCRTP with different member functions that re-use the WriteValue function generically (any fixed-width type and even bit-wide booleans).
When taking these improvements to Filter, I will also re-use the "gather" concept and parameterize it by bitmaps/boolean-arrays instead of selection vectors (indices) like take does. So gather is not a "renaming of take" but rather a generalization of what take and filter do in Arrow, with different representations of what should be gathered from the values array.
What changes are included in this PR?
- Introduce the Gather class helper to delegate fixed-width memory gathering: both static and dynamically sized (size known at compile time or size known at runtime)
- Specialized Take implementation for values/indices without nulls
- Fold the Boolean, Primitives, and Fixed-Width Binary implementations of Take into a single one
- Skip validity bitmap allocation when inputs (values and indices) have no nulls
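To make the "static and dynamically sized" distinction concrete, here is a minimal sketch of the idea. It is not the Gather class from this PR: the names `GatherStatic`, `GatherDynamic`, and `Gather` are hypothetical, and the sketch assumes byte-wide values, 32-bit indices, no nulls, and in-bounds indices.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical names for illustration; this is not the Gather class from the PR.

// Width known at compile time: the memcpy size is a constant, so the compiler
// is free to lower each copy to plain loads and stores.
template <int kWidth>
void GatherStatic(const uint8_t* src, const int32_t* idx, int64_t n, uint8_t* out) {
  for (int64_t i = 0; i < n; ++i) {
    std::memcpy(out + i * kWidth, src + int64_t{idx[i]} * kWidth, kWidth);
  }
}

// Width known only at runtime (e.g. fixed-size binary with an arbitrary width).
void GatherDynamic(const uint8_t* src, int64_t width, const int32_t* idx, int64_t n,
                   uint8_t* out) {
  for (int64_t i = 0; i < n; ++i) {
    std::memcpy(out + i * width, src + int64_t{idx[i]} * width,
                static_cast<std::size_t>(width));
  }
}

// Picks the static loop for common widths and falls back to the dynamic one.
// Assumes no nulls and that every index is in bounds.
std::vector<uint8_t> Gather(const std::vector<uint8_t>& src, int64_t width,
                            const std::vector<int32_t>& idx) {
  assert(width > 0);
  const int64_t n = static_cast<int64_t>(idx.size());
  std::vector<uint8_t> out(static_cast<std::size_t>(n * width));
  switch (width) {
    case 1: GatherStatic<1>(src.data(), idx.data(), n, out.data()); break;
    case 2: GatherStatic<2>(src.data(), idx.data(), n, out.data()); break;
    case 4: GatherStatic<4>(src.data(), idx.data(), n, out.data()); break;
    case 8: GatherStatic<8>(src.data(), idx.data(), n, out.data()); break;
    default: GatherDynamic(src.data(), width, idx.data(), n, out.data());
  }
  return out;
}
```

Both loops share the same shape and differ only in how the value width reaches `memcpy`, which is why a single templated helper can serve primitives and fixed-size binary alike.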
Are these changes tested?
- Existing tests
- New test assertions that check that Take guarantees null values are zeroed out
- GitHub Issue: #41301
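The "null values are zeroed out" guarantee can be illustrated with a small sketch. This is an assumption-laden toy, not the PR's code: `GatherWithNulls` is a hypothetical name, and validity is stored as one byte per slot for readability, whereas Arrow uses packed bitmaps.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical sketch, not the PR's code. Validity uses one byte per slot for
// readability; Arrow itself uses packed bitmaps.
// Writes idx.size() slots of `width` bytes to `out` and one validity byte per
// slot to `out_valid`. Returns the number of null output slots.
int64_t GatherWithNulls(const std::vector<uint8_t>& src,
                        const std::vector<uint8_t>& src_valid, int64_t width,
                        const std::vector<int32_t>& idx,
                        const std::vector<uint8_t>& idx_valid, uint8_t* out,
                        uint8_t* out_valid) {
  assert(width > 0 && idx.size() == idx_valid.size());
  const auto w = static_cast<std::size_t>(width);
  int64_t null_count = 0;
  for (std::size_t i = 0; i < idx.size(); ++i) {
    // Check the index first: a null index may hold an arbitrary value.
    const bool valid =
        idx_valid[i] != 0 && src_valid[static_cast<std::size_t>(idx[i])] != 0;
    if (valid) {
      std::memcpy(out + i * w, src.data() + static_cast<std::size_t>(idx[i]) * w, w);
    } else {
      std::memset(out + i * w, 0, w);  // null slots are zeroed, never left as garbage
      ++null_count;
    }
    out_valid[i] = valid ? 1 : 0;
  }
  return null_count;
}
```

When neither input has nulls, every slot takes the `memcpy` branch, so a caller could skip the validity buffer entirely in that case.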
I'm skeptical that you want to reuse this for Filter, unless you add Gather methods for batch selection. For Filter performance, it is essential to write out ranges of selected values at a time, not one value at a time. I don't know if that's what you have in mind.
I want to expand the set of WriteValue implementations to support writing multiple values. Then add a version of Gather::Execute*() that takes boolean arrays (masks) instead of indices (selection vectors).
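A rough sketch of what a mask-driven gather that writes ranges could look like, under loud assumptions: `GatherByMask` is a hypothetical name, the mask is a `std::vector<bool>` rather than an Arrow bitmap, and nulls are ignored. It only illustrates the run-at-a-time copy discussed above, not any API in this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch; the mask is a std::vector<bool>, not an Arrow bitmap.
// Copies every selected fixed-width value, one bulk copy per run of set bits.
std::vector<uint8_t> GatherByMask(const std::vector<uint8_t>& src, std::size_t width,
                                  const std::vector<bool>& mask) {
  assert(width > 0 && src.size() == mask.size() * width);
  std::vector<uint8_t> out;
  out.reserve(src.size());
  std::size_t i = 0;
  while (i < mask.size()) {
    if (!mask[i]) {
      ++i;
      continue;
    }
    // Find the end of the current run of selected values.
    std::size_t run_end = i + 1;
    while (run_end < mask.size() && mask[run_end]) ++run_end;
    // Write the whole run at once instead of one value at a time.
    out.insert(out.end(), src.data() + i * width, src.data() + run_end * width);
    i = run_end;
  }
  return out;
}
```

With mostly-true masks the inner loop produces long runs and few copies, which is the case where writing ranges matters most.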
Proof that this PR reduces binary size:
~/code/arrow/cpp/release $ wc -c ./src/arrow/CMakeFiles/arrow_compute.dir/compute/kernels/vector_selection_internal.cc.o
278808 ./src/arrow/CMakeFiles/arrow_compute.dir/compute/kernels/vector_selection_internal.cc.o
$ git co origin/main
HEAD is now at 7f0c4070dd GH-41397: [C#] Downgrade macOS test runner to avoid infrastructure bug (#41934)
~/code/arrow/cpp/release $ ninja ./src/arrow/CMakeFiles/arrow_compute.dir/compute/kernels/vector_selection_internal.cc.o
[1/1] Building CXX object src/arrow/CMakeFiles/arrow_compute.dir/compute/kernels/vector_selection_internal.cc.o
~/code/arrow/cpp/release $ wc -c ./src/arrow/CMakeFiles/arrow_compute.dir/compute/kernels/vector_selection_internal.cc.o
289832 ./src/arrow/CMakeFiles/arrow_compute.dir/compute/kernels/vector_selection_internal.cc.o
-11024 bytes 🎉
Proof that this PR reduces binary size:
Nice! Can you also post numbers obtained with the size utility for completeness?
Looking at both vector_selection_take_internal.cc.o and vector_selection_internal.cc.o I'm net-adding 1.45 KBytes.
bloaty -d symbols -C full -n 0 \
HEAD-vector_selection_take_internal.cc.o HEAD-vector_selection_internal.cc.o -- \
MAIN-vector_selection_take_internal.cc.o MAIN-vector_selection_internal.cc.o
FILE SIZE VM SIZE
-------------- --------------
[NEW] +32.8Ki [NEW] +32.7Ki arrow::compute::internal::FixedWidthTakeExec(arrow::compute::KernelContext*, arrow::compute::ExecSpan const&, arrow::compute::ExecResult*)
[NEW] +204 [NEW] +168 GCC_except_table272
[NEW] +189 [NEW] +104 arrow::Status arrow::Status::NotImplemented<char const (&) [38], arrow::DataType const&>(char const (&&&) [38], arrow::DataType const&&&)
+183% +168 +300% +168 GCC_except_table170
+43% +164 +47% +164 GCC_except_table20
[NEW] +156 [NEW] +120 GCC_except_table302
[NEW] +146 [NEW] +76 GCC_except_table99
+40% +109 +71% +144 GCC_except_table24
+72% +143 +66% +108 GCC_except_table13
[NEW] +126 [NEW] +56 GCC_except_table60
[NEW] +108 [NEW] +72 GCC_except_table138
+169% +108 +257% +72 GCC_except_table172
[NEW] +108 [NEW] +72 GCC_except_table225
[NEW] +107 [NEW] +72 GCC_except_table64
+195% +107 +360% +72 GCC_except_table87
[NEW] +96 [NEW] +60 GCC_except_table111
+160% +96 +250% +60 GCC_except_table151
+25% +95 +18% +60 GCC_except_table17
[NEW] +95 [NEW] +60 GCC_except_table72
[NEW] +95 [NEW] +60 GCC_except_table81
+116% +88 +220% +88 GCC_except_table192
[NEW] +84 [NEW] +48 GCC_except_table188
[NEW] +76 [NEW] +40 GCC_except_table107
+96% +68 +189% +68 GCC_except_table10
[NEW] +68 [NEW] +32 GCC_except_table102
[NEW] +68 [NEW] +32 GCC_except_table114
[NEW] +68 [NEW] +32 GCC_except_table144
[NEW] +68 [NEW] +32 GCC_except_table197
[NEW] +68 [NEW] +32 GCC_except_table223
[NEW] +67 [NEW] +32 GCC_except_table43
[NEW] +67 [NEW] +32 GCC_except_table53
[NEW] +67 [NEW] +32 GCC_except_table90
[NEW] +64 [NEW] +28 GCC_except_table164
[NEW] +64 [NEW] +28 GCC_except_table199
[NEW] +64 [NEW] +28 GCC_except_table207
[NEW] +64 [NEW] +28 GCC_except_table213
[NEW] +63 [NEW] +28 GCC_except_table28
[NEW] +60 [NEW] +24 GCC_except_table128
[NEW] +60 [NEW] +24 GCC_except_table155
[NEW] +60 [NEW] +24 GCC_except_table298
[NEW] +59 [NEW] +24 GCC_except_table36
[NEW] +56 [NEW] +20 GCC_except_table182
[NEW] +56 [NEW] +20 GCC_except_table190
[NEW] +56 [NEW] +20 GCC_except_table209
[NEW] +52 [NEW] +16 GCC_except_table133
[NEW] +52 [NEW] +16 GCC_except_table134
[NEW] +52 [NEW] +16 GCC_except_table162
[NEW] +52 [NEW] +16 GCC_except_table168
[NEW] +52 [NEW] +16 GCC_except_table179
[NEW] +52 [NEW] +16 GCC_except_table194
+100% +52 +100% +16 GCC_except_table202
[NEW] +52 [NEW] +16 GCC_except_table218
[NEW] +52 [NEW] +16 GCC_except_table221
[NEW] +52 [NEW] +16 GCC_except_table277
[NEW] +52 [NEW] +16 GCC_except_table279
[NEW] +51 [NEW] +16 GCC_except_table38
[NEW] +50 [NEW] +16 GCC_except_table4
+92% +48 +300% +48 GCC_except_table132
[NEW] +47 [NEW] +12 GCC_except_table54
[NEW] +42 [NEW] +16 lCPI223_1
[NEW] +40 [NEW] +16 lJTI2_8
[NEW] +40 [NEW] +16 lJTI2_9
+25% +32 +53% +32 GCC_except_table84
+0.4% +32 +0.4% +32 ltmp9
+30% +31 -5.9% -4 GCC_except_table85
[NEW] +31 [NEW] +5 lJTI170_0
[NEW] +30 [NEW] +4 lJTI170_1
+44% +28 +100% +28 GCC_except_table171
-6.4% -7 +70% +28 GCC_except_table35
+32% +24 +60% +24 GCC_except_table177
+46% +24 +150% +24 GCC_except_table178
+38% +20 +125% +20 GCC_except_table217
+26% +15 +100% +16 GCC_except_table100
+19% +12 +43% +12 GCC_except_table206
+19% +12 +43% +12 GCC_except_table212
+14% +8 +40% +8 GCC_except_table181
+5.6% +4 +11% +4 GCC_except_table216
+2.9% +2 +7.4% +2 typeinfo name for arrow::StructArray
+1.2% +2 +2.6% +2 typeinfo name for std::__1::__shared_ptr_emplace<arrow::ChunkedArray, std::__1::allocator<arrow::ChunkedArray> >
-6.8% -4 -16.7% -4 GCC_except_table83
-13.3% -8 -33.3% -8 GCC_except_table149
-17.9% -12 -37.5% -12 GCC_except_table41
-20.7% -12 -50.0% -12 GCC_except_table8
-23.9% -16 -50.0% -16 GCC_except_table52
-26.3% -20 -50.0% -20 GCC_except_table215
[DEL] -30 [DEL] -4 lJTI169_1
-27.3% -30 -46.9% -30 lJTI2_0
[DEL] -31 [DEL] -5 lJTI169_0
+2.9% +3 -47.1% -32 GCC_except_table34
[DEL] -41 [DEL] -16 lJTI25_0
[DEL] -42 [DEL] -16 lCPI222_1
-40.7% -44 -61.1% -44 GCC_except_table173
[DEL] -46 [DEL] -12 GCC_except_table7
[DEL] -47 [DEL] -12 GCC_except_table55
-0.6% -48 -0.6% -48 arrow::compute::internal::FSLTakeExec(arrow::compute::KernelContext*, arrow::compute::ExecSpan const&, arrow::compute::ExecResult*)
[DEL] -51 [DEL] -16 GCC_except_table37
[DEL] -51 [DEL] -16 GCC_except_table51
[DEL] -51 [DEL] -16 GCC_except_table61
[DEL] -52 [DEL] -16 GCC_except_table135
[DEL] -52 [DEL] -16 GCC_except_table148
[DEL] -52 [DEL] -16 GCC_except_table161
[DEL] -52 [DEL] -16 GCC_except_table203
[DEL] -52 [DEL] -16 GCC_except_table285
[DEL] -52 [DEL] -16 GCC_except_table287
[DEL] -52 [DEL] -16 GCC_except_table288
[DEL] -55 [DEL] -20 GCC_except_table33
[DEL] -55 [DEL] -20 GCC_except_table40
[DEL] -56 [DEL] -20 GCC_except_table208
[DEL] -56 [DEL] -20 GCC_except_table214
-48.3% -58 -46.2% -24 GCC_except_table2
[DEL] -60 [DEL] -24 GCC_except_table127
[DEL] -60 [DEL] -24 GCC_except_table306
[DEL] -63 [DEL] -28 GCC_except_table29
[DEL] -64 [DEL] -28 GCC_except_table163
[DEL] -64 [DEL] -28 GCC_except_table180
[DEL] -68 [DEL] -32 GCC_except_table113
[DEL] -68 [DEL] -32 GCC_except_table145
-22.1% -72 -28.1% -72 GCC_except_table11
[DEL] -75 [DEL] -40 GCC_except_table59
[DEL] -76 [DEL] -40 GCC_except_table108
[DEL] -76 [DEL] -40 GCC_except_table205
[DEL] -76 [DEL] -40 GCC_except_table211
-61.3% -92 -79.3% -92 GCC_except_table9
-20.2% -95 -15.0% -60 GCC_except_table16
[DEL] -95 [DEL] -60 GCC_except_table71
[DEL] -95 [DEL] -60 GCC_except_table82
[DEL] -95 [DEL] -60 GCC_except_table98
[DEL] -96 [DEL] -60 GCC_except_table110
-61.5% -96 -71.4% -60 GCC_except_table152
-64.3% -99 -76.2% -64 GCC_except_table86
[DEL] -100 [DEL] -64 GCC_except_table131
[DEL] -100 [DEL] -64 GCC_except_table174
[DEL] -103 [DEL] -24 typeinfo for arrow::compute::internal::(anonymous namespace)::FSBSelectionImpl
[DEL] -107 [DEL] -72 GCC_except_table65
[DEL] -108 [DEL] -72 GCC_except_table139
[DEL] -108 [DEL] -72 GCC_except_table226
[DEL] -113 [DEL] -28 arrow::compute::internal::(anonymous namespace)::FSBSelectionImpl::Finish()
[DEL] -120 [DEL] -48 GCC_except_table222
[DEL] -127 [DEL] -48 vtable for arrow::compute::internal::(anonymous namespace)::FSBSelectionImpl
[DEL] -132 [DEL] -60 GCC_except_table198
-24.1% -132 -25.8% -132 GCC_except_table21
[DEL] -133 [DEL] -8 arrow::compute::internal::(anonymous namespace)::Selection<arrow::compute::internal::(anonymous namespace)::FSBSelectionImpl, arrow::FixedSizeBinaryType>::Init()
[DEL] -134 [DEL] -64 GCC_except_table89
[DEL] -136 [DEL] -64 GCC_except_table101
-26.2% -136 -28.1% -136 GCC_except_table18
[DEL] -137 [DEL] -16 typeinfo for arrow::compute::internal::(anonymous namespace)::Selection<arrow::compute::internal::(anonymous namespace)::FSBSelectionImpl, arrow::FixedSizeBinaryType>
[DEL] -137 [DEL] -58 typeinfo name for arrow::compute::internal::(anonymous namespace)::FSBSelectionImpl
[DEL] -140 [DEL] -68 GCC_except_table189
-64.8% -140 -72.2% -104 GCC_except_table193
[DEL] -149 [DEL] -96 arrow::FixedSizeBinaryArray::~FixedSizeBinaryArray()
-74.5% -152 -90.5% -152 GCC_except_table280
[DEL] -156 [DEL] -120 GCC_except_table310
[DEL] -169 [DEL] -48 vtable for arrow::compute::internal::(anonymous namespace)::Selection<arrow::compute::internal::(anonymous namespace)::FSBSelectionImpl, arrow::FixedSizeBinaryType>
-50.1% -208 -54.7% -208 GCC_except_table22
[DEL] -221 [DEL] -100 typeinfo name for arrow::compute::internal::(anonymous namespace)::Selection<arrow::compute::internal::(anonymous namespace)::FSBSelectionImpl, arrow::FixedSizeBinaryType>
[DEL] -252 [DEL] -8 arrow::compute::internal::(anonymous namespace)::Selection<arrow::compute::internal::(anonymous namespace)::FSBSelectionImpl, arrow::FixedSizeBinaryType>::~Selection()
-2.1% -256 -2.1% -256 ltmp5
[DEL] -312 [DEL] -240 GCC_except_table169
-82.5% -316 -90.8% -316 GCC_except_table25
-8.9% -352 -9.2% -352 arrow::compute::internal::PopulateTakeKernels(std::__1::vector<arrow::compute::internal::SelectionKernelData, std::__1::allocator<arrow::compute::internal::SelectionKernelData> >*)
[DEL] -464 [DEL] -304 arrow::compute::internal::(anonymous namespace)::FSBSelectionImpl::~FSBSelectionImpl()
-0.8% -677 [ = ] 0 [Unmapped]
[DEL] -5.83Ki [DEL] -5.72Ki arrow::compute::internal::FSBTakeExec(arrow::compute::KernelContext*, arrow::compute::ExecSpan const&, arrow::compute::ExecResult*)
[DEL] -23.9Ki [DEL] -23.8Ki arrow::compute::internal::PrimitiveTakeExec(arrow::compute::KernelContext*, arrow::compute::ExecSpan const&, arrow::compute::ExecResult*)
-0.1% -504 +0.5% +1.45Ki TOTAL
UPDATE: Last 3 commits lead to:
-0.1% -528 +0.5% +1.21Ki TOTAL
I've run the Take micro-benchmarks locally with this (AMD Zen 2, gcc 12.3.0). The changes are a bit all over the place and show that compilers are generally capricious and difficult to steer towards "optimal" code :-)
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Non-regressions: (86)
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
benchmark baseline contender change % counters
TakeFixedSizeBinaryRandomIndicesWithNulls/524288/1/9 423.000M items/sec 6.628G items/sec 1466.938 {'family_index': 4, 'per_family_instance_index': 7, 'run_name': 'TakeFixedSizeBinaryRandomIndicesWithNulls/524288/1/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 565, 'byte_width': 9.0, 'null_percent': 100.0}
TakeChunkedChunkedFSBRandomIndicesWithNulls/524288/1/9 299.749M items/sec 731.380M items/sec 143.998 {'family_index': 16, 'per_family_instance_index': 7, 'run_name': 'TakeChunkedChunkedFSBRandomIndicesWithNulls/524288/1/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 396, 'byte_width': 9.0, 'null_percent': 100.0}
TakeChunkedChunkedFSBMonotonicIndices/524288/1/9 137.090M items/sec 222.341M items/sec 62.185 {'family_index': 17, 'per_family_instance_index': 7, 'run_name': 'TakeChunkedChunkedFSBMonotonicIndices/524288/1/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 198, 'byte_width': 9.0, 'null_percent': 100.0}
TakeFixedSizeBinaryMonotonicIndices/524288/1/9 184.516M items/sec 297.998M items/sec 61.503 {'family_index': 5, 'per_family_instance_index': 7, 'run_name': 'TakeFixedSizeBinaryMonotonicIndices/524288/1/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 249, 'byte_width': 9.0, 'null_percent': 100.0}
TakeChunkedChunkedFSBMonotonicIndices/524288/0/9 192.469M items/sec 302.407M items/sec 57.120 {'family_index': 17, 'per_family_instance_index': 9, 'run_name': 'TakeChunkedChunkedFSBMonotonicIndices/524288/0/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 253, 'byte_width': 9.0, 'null_percent': 0.0}
TakeFixedSizeBinaryRandomIndicesNoNulls/524288/1/9 189.605M items/sec 297.548M items/sec 56.931 {'family_index': 3, 'per_family_instance_index': 7, 'run_name': 'TakeFixedSizeBinaryRandomIndicesNoNulls/524288/1/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 255, 'byte_width': 9.0, 'null_percent': 100.0}
TakeFixedSizeBinaryMonotonicIndices/524288/0/9 259.447M items/sec 406.608M items/sec 56.721 {'family_index': 5, 'per_family_instance_index': 9, 'run_name': 'TakeFixedSizeBinaryMonotonicIndices/524288/0/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 348, 'byte_width': 9.0, 'null_percent': 0.0}
TakeFixedSizeBinaryRandomIndicesWithNulls/524288/0/9 229.434M items/sec 327.106M items/sec 42.571 {'family_index': 4, 'per_family_instance_index': 9, 'run_name': 'TakeFixedSizeBinaryRandomIndicesWithNulls/524288/0/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 291, 'byte_width': 9.0, 'null_percent': 0.0}
TakeFixedSizeBinaryRandomIndicesNoNulls/524288/0/9 230.187M items/sec 321.062M items/sec 39.479 {'family_index': 3, 'per_family_instance_index': 9, 'run_name': 'TakeFixedSizeBinaryRandomIndicesNoNulls/524288/0/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 305, 'byte_width': 9.0, 'null_percent': 0.0}
TakeChunkedChunkedFSBRandomIndicesNoNulls/524288/0/9 178.603M items/sec 243.824M items/sec 36.517 {'family_index': 15, 'per_family_instance_index': 9, 'run_name': 'TakeChunkedChunkedFSBRandomIndicesNoNulls/524288/0/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 242, 'byte_width': 9.0, 'null_percent': 0.0}
TakeChunkedChunkedFSBRandomIndicesNoNulls/524288/1/9 157.029M items/sec 213.477M items/sec 35.948 {'family_index': 15, 'per_family_instance_index': 7, 'run_name': 'TakeChunkedChunkedFSBRandomIndicesNoNulls/524288/1/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 208, 'byte_width': 9.0, 'null_percent': 100.0}
TakeChunkedChunkedFSBRandomIndicesWithNulls/524288/0/9 181.385M items/sec 240.142M items/sec 32.393 {'family_index': 16, 'per_family_instance_index': 9, 'run_name': 'TakeChunkedChunkedFSBRandomIndicesWithNulls/524288/0/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 246, 'byte_width': 9.0, 'null_percent': 0.0}
TakeFixedSizeBinaryRandomIndicesNoNulls/524288/1000/9 138.613M items/sec 178.185M items/sec 28.548 {'family_index': 3, 'per_family_instance_index': 1, 'run_name': 'TakeFixedSizeBinaryRandomIndicesNoNulls/524288/1000/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 188, 'byte_width': 9.0, 'null_percent': 0.1}
TakeFixedSizeBinaryRandomIndicesNoNulls/524288/10/9 129.101M items/sec 160.774M items/sec 24.533 {'family_index': 3, 'per_family_instance_index': 3, 'run_name': 'TakeFixedSizeBinaryRandomIndicesNoNulls/524288/10/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 173, 'byte_width': 9.0, 'null_percent': 10.0}
TakeFixedSizeBinaryMonotonicIndices/524288/1000/9 185.774M items/sec 228.147M items/sec 22.809 {'family_index': 5, 'per_family_instance_index': 1, 'run_name': 'TakeFixedSizeBinaryMonotonicIndices/524288/1000/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 250, 'byte_width': 9.0, 'null_percent': 0.1}
TakeFixedSizeBinaryRandomIndicesWithNulls/524288/1000/9 136.620M items/sec 167.200M items/sec 22.383 {'family_index': 4, 'per_family_instance_index': 1, 'run_name': 'TakeFixedSizeBinaryRandomIndicesWithNulls/524288/1000/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 184, 'byte_width': 9.0, 'null_percent': 0.1}
TakeFixedSizeBinaryMonotonicIndices/524288/10/9 162.159M items/sec 196.435M items/sec 21.138 {'family_index': 5, 'per_family_instance_index': 3, 'run_name': 'TakeFixedSizeBinaryMonotonicIndices/524288/10/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 221, 'byte_width': 9.0, 'null_percent': 10.0}
TakeChunkedChunkedFSBRandomIndicesNoNulls/524288/10/9 114.506M items/sec 137.125M items/sec 19.753 {'family_index': 15, 'per_family_instance_index': 3, 'run_name': 'TakeChunkedChunkedFSBRandomIndicesNoNulls/524288/10/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 153, 'byte_width': 9.0, 'null_percent': 10.0}
TakeChunkedChunkedFSBRandomIndicesWithNulls/524288/1000/9 119.955M items/sec 142.302M items/sec 18.629 {'family_index': 16, 'per_family_instance_index': 1, 'run_name': 'TakeChunkedChunkedFSBRandomIndicesWithNulls/524288/1000/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 134, 'byte_width': 9.0, 'null_percent': 0.1}
TakeChunkedChunkedFSBMonotonicIndices/524288/1000/9 158.880M items/sec 188.351M items/sec 18.549 {'family_index': 17, 'per_family_instance_index': 1, 'run_name': 'TakeChunkedChunkedFSBMonotonicIndices/524288/1000/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 212, 'byte_width': 9.0, 'null_percent': 0.1}
TakeChunkedChunkedFSBRandomIndicesNoNulls/524288/1000/9 123.316M items/sec 145.043M items/sec 17.619 {'family_index': 15, 'per_family_instance_index': 1, 'run_name': 'TakeChunkedChunkedFSBRandomIndicesNoNulls/524288/1000/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 165, 'byte_width': 9.0, 'null_percent': 0.1}
TakeChunkedChunkedFSBMonotonicIndices/524288/10/9 140.400M items/sec 165.035M items/sec 17.547 {'family_index': 17, 'per_family_instance_index': 3, 'run_name': 'TakeChunkedChunkedFSBMonotonicIndices/524288/10/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 175, 'byte_width': 9.0, 'null_percent': 10.0}
TakeChunkedChunkedStringRandomIndicesNoNulls/524288/0 24.680M items/sec 28.793M items/sec 16.662 {'family_index': 18, 'per_family_instance_index': 4, 'run_name': 'TakeChunkedChunkedStringRandomIndicesNoNulls/524288/0', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 38, 'null_percent': 0.0}
TakeFixedSizeBinaryMonotonicIndices/524288/2/9 126.148M items/sec 146.438M items/sec 16.084 {'family_index': 5, 'per_family_instance_index': 5, 'run_name': 'TakeFixedSizeBinaryMonotonicIndices/524288/2/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 164, 'byte_width': 9.0, 'null_percent': 50.0}
TakeFSLInt64MonotonicIndices/524288/0 787.109M items/sec 913.254M items/sec 16.026 {'family_index': 8, 'per_family_instance_index': 4, 'run_name': 'TakeFSLInt64MonotonicIndices/524288/0', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 1052, 'null_percent': 0.0}
TakeFixedSizeBinaryRandomIndicesNoNulls/524288/2/9 98.420M items/sec 113.492M items/sec 15.314 {'family_index': 3, 'per_family_instance_index': 5, 'run_name': 'TakeFixedSizeBinaryRandomIndicesNoNulls/524288/2/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 133, 'byte_width': 9.0, 'null_percent': 50.0}
TakeChunkedChunkedInt64MonotonicIndices/524288/0 471.212M items/sec 537.976M items/sec 14.169 {'family_index': 14, 'per_family_instance_index': 4, 'run_name': 'TakeChunkedChunkedInt64MonotonicIndices/524288/0', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 645, 'null_percent': 0.0}
TakeChunkedChunkedFSBMonotonicIndices/524288/0/8 476.152M items/sec 543.321M items/sec 14.107 {'family_index': 17, 'per_family_instance_index': 8, 'run_name': 'TakeChunkedChunkedFSBMonotonicIndices/524288/0/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 633, 'byte_width': 8.0, 'null_percent': 0.0}
TakeChunkedChunkedFSBMonotonicIndices/524288/2/9 112.936M items/sec 128.835M items/sec 14.078 {'family_index': 17, 'per_family_instance_index': 5, 'run_name': 'TakeChunkedChunkedFSBMonotonicIndices/524288/2/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 151, 'byte_width': 9.0, 'null_percent': 50.0}
TakeInt64MonotonicIndices/524288/0 815.474M items/sec 914.640M items/sec 12.161 {'family_index': 2, 'per_family_instance_index': 4, 'run_name': 'TakeInt64MonotonicIndices/524288/0', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 1073, 'null_percent': 0.0}
TakeFixedSizeBinaryMonotonicIndices/524288/0/8 819.620M items/sec 916.155M items/sec 11.778 {'family_index': 5, 'per_family_instance_index': 8, 'run_name': 'TakeFixedSizeBinaryMonotonicIndices/524288/0/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 1088, 'byte_width': 8.0, 'null_percent': 0.0}
TakeChunkedFlatInt64RandomIndicesNoNulls/524288/0 483.752M items/sec 529.220M items/sec 9.399 {'family_index': 21, 'per_family_instance_index': 4, 'run_name': 'TakeChunkedFlatInt64RandomIndicesNoNulls/524288/0', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 720, 'null_percent': 0.0}
TakeStringMonotonicIndices/524288/0 785.800M items/sec 851.423M items/sec 8.351 {'family_index': 11, 'per_family_instance_index': 4, 'run_name': 'TakeStringMonotonicIndices/524288/0', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 1056, 'null_percent': 0.0}
TakeFSLInt64RandomIndicesWithNulls/524288/0 695.436M items/sec 750.561M items/sec 7.927 {'family_index': 7, 'per_family_instance_index': 4, 'run_name': 'TakeFSLInt64RandomIndicesWithNulls/524288/0', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 900, 'null_percent': 0.0}
TakeFSLInt64MonotonicIndices/524288/1 66.532M items/sec 71.388M items/sec 7.299 {'family_index': 8, 'per_family_instance_index': 3, 'run_name': 'TakeFSLInt64MonotonicIndices/524288/1', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 89, 'null_percent': 100.0}
TakeChunkedFlatInt64RandomIndicesWithNulls/524288/1 1.559G items/sec 1.668G items/sec 7.012 {'family_index': 22, 'per_family_instance_index': 3, 'run_name': 'TakeChunkedFlatInt64RandomIndicesWithNulls/524288/1', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 2086, 'null_percent': 100.0}
TakeChunkedChunkedFSBRandomIndicesNoNulls/524288/2/9 90.181M items/sec 96.449M items/sec 6.950 {'family_index': 15, 'per_family_instance_index': 5, 'run_name': 'TakeChunkedChunkedFSBRandomIndicesNoNulls/524288/2/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 122, 'byte_width': 9.0, 'null_percent': 50.0}
TakeFSLInt64RandomIndicesNoNulls/524288/1 66.747M items/sec 71.100M items/sec 6.520 {'family_index': 6, 'per_family_instance_index': 3, 'run_name': 'TakeFSLInt64RandomIndicesNoNulls/524288/1', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 93, 'null_percent': 100.0}
TakeChunkedFlatInt64MonotonicIndices/524288/0 610.547M items/sec 649.274M items/sec 6.343 {'family_index': 23, 'per_family_instance_index': 4, 'run_name': 'TakeChunkedFlatInt64MonotonicIndices/524288/0', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 828, 'null_percent': 0.0}
TakeChunkedChunkedInt64RandomIndicesNoNulls/524288/0 407.444M items/sec 432.748M items/sec 6.210 {'family_index': 12, 'per_family_instance_index': 4, 'run_name': 'TakeChunkedChunkedInt64RandomIndicesNoNulls/524288/0', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 539, 'null_percent': 0.0}
TakeFixedSizeBinaryRandomIndicesWithNulls/524288/2/9 74.928M items/sec 78.752M items/sec 5.104 {'family_index': 4, 'per_family_instance_index': 5, 'run_name': 'TakeFixedSizeBinaryRandomIndicesWithNulls/524288/2/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 97, 'byte_width': 9.0, 'null_percent': 50.0}
TakeFSLInt64RandomIndicesNoNulls/524288/0 719.162M items/sec 754.224M items/sec 4.875 {'family_index': 6, 'per_family_instance_index': 4, 'run_name': 'TakeFSLInt64RandomIndicesNoNulls/524288/0', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 941, 'null_percent': 0.0}
TakeFSLInt64RandomIndicesWithNulls/524288/2 38.928M items/sec 40.809M items/sec 4.832 {'family_index': 7, 'per_family_instance_index': 2, 'run_name': 'TakeFSLInt64RandomIndicesWithNulls/524288/2', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 53, 'null_percent': 50.0}
TakeChunkedChunkedStringRandomIndicesWithNulls/524288/10 30.625M items/sec 31.965M items/sec 4.375 {'family_index': 19, 'per_family_instance_index': 1, 'run_name': 'TakeChunkedChunkedStringRandomIndicesWithNulls/524288/10', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 41, 'null_percent': 10.0}
TakeFSLInt64RandomIndicesWithNulls/524288/1 84.040M items/sec 87.072M items/sec 3.608 {'family_index': 7, 'per_family_instance_index': 3, 'run_name': 'TakeFSLInt64RandomIndicesWithNulls/524288/1', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 111, 'null_percent': 100.0}
TakeFixedSizeBinaryRandomIndicesWithNulls/524288/0/8 720.917M items/sec 746.809M items/sec 3.592 {'family_index': 4, 'per_family_instance_index': 8, 'run_name': 'TakeFixedSizeBinaryRandomIndicesWithNulls/524288/0/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 974, 'byte_width': 8.0, 'null_percent': 0.0}
TakeChunkedChunkedFSBRandomIndicesWithNulls/524288/2/9 69.333M items/sec 71.629M items/sec 3.311 {'family_index': 16, 'per_family_instance_index': 5, 'run_name': 'TakeChunkedChunkedFSBRandomIndicesWithNulls/524288/2/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 94, 'byte_width': 9.0, 'null_percent': 50.0}
TakeFixedSizeBinaryRandomIndicesWithNulls/524288/10/9 109.212M items/sec 112.669M items/sec 3.165 {'family_index': 4, 'per_family_instance_index': 3, 'run_name': 'TakeFixedSizeBinaryRandomIndicesWithNulls/524288/10/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 147, 'byte_width': 9.0, 'null_percent': 10.0}
TakeStringRandomIndicesNoNulls/524288/1000 22.153M items/sec 22.833M items/sec 3.070 {'family_index': 9, 'per_family_instance_index': 0, 'run_name': 'TakeStringRandomIndicesNoNulls/524288/1000', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 30, 'null_percent': 0.1}
TakeStringRandomIndicesWithNulls/524288/1 4.032G items/sec 4.156G items/sec 3.069 {'family_index': 10, 'per_family_instance_index': 3, 'run_name': 'TakeStringRandomIndicesWithNulls/524288/1', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 5358, 'null_percent': 100.0}
TakeStringRandomIndicesNoNulls/524288/2 61.033M items/sec 62.890M items/sec 3.044 {'family_index': 9, 'per_family_instance_index': 2, 'run_name': 'TakeStringRandomIndicesNoNulls/524288/2', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 80, 'null_percent': 50.0}
TakeChunkedChunkedStringRandomIndicesWithNulls/524288/2 56.312M items/sec 57.906M items/sec 2.831 {'family_index': 19, 'per_family_instance_index': 2, 'run_name': 'TakeChunkedChunkedStringRandomIndicesWithNulls/524288/2', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 75, 'null_percent': 50.0}
TakeChunkedChunkedFSBRandomIndicesWithNulls/524288/10/9 98.562M items/sec 101.317M items/sec 2.795 {'family_index': 16, 'per_family_instance_index': 3, 'run_name': 'TakeChunkedChunkedFSBRandomIndicesWithNulls/524288/10/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 131, 'byte_width': 9.0, 'null_percent': 10.0}
TakeChunkedChunkedStringRandomIndicesNoNulls/524288/1000 28.857M items/sec 29.634M items/sec 2.693 {'family_index': 18, 'per_family_instance_index': 0, 'run_name': 'TakeChunkedChunkedStringRandomIndicesNoNulls/524288/1000', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 37, 'null_percent': 0.1}
TakeChunkedChunkedStringMonotonicIndices/524288/0 47.779M items/sec 48.996M items/sec 2.545 {'family_index': 20, 'per_family_instance_index': 4, 'run_name': 'TakeChunkedChunkedStringMonotonicIndices/524288/0', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 64, 'null_percent': 0.0}
TakeStringMonotonicIndices/524288/1 67.132M items/sec 68.571M items/sec 2.144 {'family_index': 11, 'per_family_instance_index': 3, 'run_name': 'TakeStringMonotonicIndices/524288/1', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 90, 'null_percent': 100.0}
TakeChunkedChunkedStringMonotonicIndices/524288/1 220.296M items/sec 223.992M items/sec 1.678 {'family_index': 20, 'per_family_instance_index': 3, 'run_name': 'TakeChunkedChunkedStringMonotonicIndices/524288/1', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 295, 'null_percent': 100.0}
TakeChunkedFlatInt64RandomIndicesWithNulls/524288/0 516.783M items/sec 524.475M items/sec 1.489 {'family_index': 22, 'per_family_instance_index': 4, 'run_name': 'TakeChunkedFlatInt64RandomIndicesWithNulls/524288/0', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 674, 'null_percent': 0.0}
TakeInt64RandomIndicesWithNulls/524288/0 708.727M items/sec 719.105M items/sec 1.464 {'family_index': 1, 'per_family_instance_index': 4, 'run_name': 'TakeInt64RandomIndicesWithNulls/524288/0', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 945, 'null_percent': 0.0}
TakeFixedSizeBinaryRandomIndicesNoNulls/524288/0/8 728.342M items/sec 738.750M items/sec 1.429 {'family_index': 3, 'per_family_instance_index': 8, 'run_name': 'TakeFixedSizeBinaryRandomIndicesNoNulls/524288/0/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 969, 'byte_width': 8.0, 'null_percent': 0.0}
TakeChunkedChunkedStringRandomIndicesWithNulls/524288/0 28.581M items/sec 28.988M items/sec 1.423 {'family_index': 19, 'per_family_instance_index': 4, 'run_name': 'TakeChunkedChunkedStringRandomIndicesWithNulls/524288/0', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 38, 'null_percent': 0.0}
TakeChunkedChunkedStringRandomIndicesWithNulls/524288/1 1.167G items/sec 1.184G items/sec 1.393 {'family_index': 19, 'per_family_instance_index': 3, 'run_name': 'TakeChunkedChunkedStringRandomIndicesWithNulls/524288/1', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 1556, 'null_percent': 100.0}
TakeFixedSizeBinaryRandomIndicesWithNulls/524288/1/8 7.216G items/sec 7.307G items/sec 1.262 {'family_index': 4, 'per_family_instance_index': 6, 'run_name': 'TakeFixedSizeBinaryRandomIndicesWithNulls/524288/1/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 9536, 'byte_width': 8.0, 'null_percent': 100.0}
TakeChunkedChunkedInt64RandomIndicesWithNulls/524288/0 435.696M items/sec 441.192M items/sec 1.261 {'family_index': 13, 'per_family_instance_index': 4, 'run_name': 'TakeChunkedChunkedInt64RandomIndicesWithNulls/524288/0', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 585, 'null_percent': 0.0}
TakeFSLInt64MonotonicIndices/524288/2 58.808M items/sec 59.212M items/sec 0.688 {'family_index': 8, 'per_family_instance_index': 2, 'run_name': 'TakeFSLInt64MonotonicIndices/524288/2', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 79, 'null_percent': 50.0}
TakeStringRandomIndicesNoNulls/524288/1 250.252M items/sec 250.508M items/sec 0.102 {'family_index': 9, 'per_family_instance_index': 3, 'run_name': 'TakeStringRandomIndicesNoNulls/524288/1', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 332, 'null_percent': 100.0}
TakeChunkedChunkedStringRandomIndicesWithNulls/524288/1000 28.414M items/sec 28.361M items/sec -0.186 {'family_index': 19, 'per_family_instance_index': 0, 'run_name': 'TakeChunkedChunkedStringRandomIndicesWithNulls/524288/1000', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 38, 'null_percent': 0.1}
TakeStringRandomIndicesWithNulls/524288/10 33.878M items/sec 33.738M items/sec -0.413 {'family_index': 10, 'per_family_instance_index': 1, 'run_name': 'TakeStringRandomIndicesWithNulls/524288/10', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 45, 'null_percent': 10.0}
TakeChunkedChunkedStringRandomIndicesNoNulls/524288/2 56.097M items/sec 55.857M items/sec -0.427 {'family_index': 18, 'per_family_instance_index': 2, 'run_name': 'TakeChunkedChunkedStringRandomIndicesNoNulls/524288/2', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 75, 'null_percent': 50.0}
TakeChunkedChunkedStringRandomIndicesNoNulls/524288/1 225.625M items/sec 224.634M items/sec -0.439 {'family_index': 18, 'per_family_instance_index': 3, 'run_name': 'TakeChunkedChunkedStringRandomIndicesNoNulls/524288/1', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 304, 'null_percent': 100.0}
TakeChunkedChunkedStringMonotonicIndices/524288/1000 50.359M items/sec 49.834M items/sec -1.043 {'family_index': 20, 'per_family_instance_index': 0, 'run_name': 'TakeChunkedChunkedStringMonotonicIndices/524288/1000', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 66, 'null_percent': 0.1}
TakeInt64RandomIndicesWithNulls/524288/1 7.198G items/sec 7.112G items/sec -1.191 {'family_index': 1, 'per_family_instance_index': 3, 'run_name': 'TakeInt64RandomIndicesWithNulls/524288/1', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 9591, 'null_percent': 100.0}
TakeChunkedChunkedStringMonotonicIndices/524288/2 86.073M items/sec 84.996M items/sec -1.252 {'family_index': 20, 'per_family_instance_index': 2, 'run_name': 'TakeChunkedChunkedStringMonotonicIndices/524288/2', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 114, 'null_percent': 50.0}
TakeChunkedChunkedFSBRandomIndicesNoNulls/524288/0/8 430.819M items/sec 423.785M items/sec -1.633 {'family_index': 15, 'per_family_instance_index': 8, 'run_name': 'TakeChunkedChunkedFSBRandomIndicesNoNulls/524288/0/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 580, 'byte_width': 8.0, 'null_percent': 0.0}
TakeChunkedChunkedStringMonotonicIndices/524288/10 52.095M items/sec 51.106M items/sec -1.899 {'family_index': 20, 'per_family_instance_index': 1, 'run_name': 'TakeChunkedChunkedStringMonotonicIndices/524288/10', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 69, 'null_percent': 10.0}
TakeStringRandomIndicesNoNulls/524288/10 23.805M items/sec 23.350M items/sec -1.909 {'family_index': 9, 'per_family_instance_index': 1, 'run_name': 'TakeStringRandomIndicesNoNulls/524288/10', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 31, 'null_percent': 10.0}
TakeInt64RandomIndicesNoNulls/524288/0 718.273M items/sec 700.958M items/sec -2.411 {'family_index': 0, 'per_family_instance_index': 4, 'run_name': 'TakeInt64RandomIndicesNoNulls/524288/0', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 966, 'null_percent': 0.0}
TakeChunkedChunkedStringRandomIndicesNoNulls/524288/10 33.176M items/sec 32.265M items/sec -2.748 {'family_index': 18, 'per_family_instance_index': 1, 'run_name': 'TakeChunkedChunkedStringRandomIndicesNoNulls/524288/10', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 44, 'null_percent': 10.0}
TakeStringRandomIndicesWithNulls/524288/2 62.261M items/sec 60.517M items/sec -2.802 {'family_index': 10, 'per_family_instance_index': 2, 'run_name': 'TakeStringRandomIndicesWithNulls/524288/2', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 82, 'null_percent': 50.0}
TakeStringRandomIndicesNoNulls/524288/0 20.935M items/sec 20.275M items/sec -3.155 {'family_index': 9, 'per_family_instance_index': 4, 'run_name': 'TakeStringRandomIndicesNoNulls/524288/0', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 28, 'null_percent': 0.0}
TakeFSLInt64MonotonicIndices/524288/10 100.329M items/sec 97.103M items/sec -3.215 {'family_index': 8, 'per_family_instance_index': 1, 'run_name': 'TakeFSLInt64MonotonicIndices/524288/10', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 135, 'null_percent': 10.0}
TakeStringRandomIndicesWithNulls/524288/1000 22.013M items/sec 21.300M items/sec -3.238 {'family_index': 10, 'per_family_instance_index': 0, 'run_name': 'TakeStringRandomIndicesWithNulls/524288/1000', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 30, 'null_percent': 0.1}
TakeFSLInt64MonotonicIndices/524288/1000 138.319M items/sec 133.809M items/sec -3.261 {'family_index': 8, 'per_family_instance_index': 0, 'run_name': 'TakeFSLInt64MonotonicIndices/524288/1000', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 184, 'null_percent': 0.1}
TakeChunkedChunkedFSBRandomIndicesWithNulls/524288/1/8 973.877M items/sec 941.054M items/sec -3.370 {'family_index': 16, 'per_family_instance_index': 6, 'run_name': 'TakeChunkedChunkedFSBRandomIndicesWithNulls/524288/1/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 1287, 'byte_width': 8.0, 'null_percent': 100.0}
TakeStringRandomIndicesWithNulls/524288/0 20.697M items/sec 19.930M items/sec -3.707 {'family_index': 10, 'per_family_instance_index': 4, 'run_name': 'TakeStringRandomIndicesWithNulls/524288/0', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 28, 'null_percent': 0.0}
TakeFSLInt64RandomIndicesWithNulls/524288/10 68.597M items/sec 65.478M items/sec -4.547 {'family_index': 7, 'per_family_instance_index': 1, 'run_name': 'TakeFSLInt64RandomIndicesWithNulls/524288/10', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 92, 'null_percent': 10.0}
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Regressions: (64)
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
benchmark baseline contender change % counters
TakeChunkedChunkedInt64RandomIndicesWithNulls/524288/1 1.001G items/sec 950.629M items/sec -5.006 {'family_index': 13, 'per_family_instance_index': 3, 'run_name': 'TakeChunkedChunkedInt64RandomIndicesWithNulls/524288/1', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 1320, 'null_percent': 100.0}
TakeFSLInt64RandomIndicesWithNulls/524288/1000 110.053M items/sec 103.738M items/sec -5.738 {'family_index': 7, 'per_family_instance_index': 0, 'run_name': 'TakeFSLInt64RandomIndicesWithNulls/524288/1000', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 147, 'null_percent': 0.1}
TakeFSLInt64RandomIndicesNoNulls/524288/2 51.364M items/sec 48.349M items/sec -5.869 {'family_index': 6, 'per_family_instance_index': 2, 'run_name': 'TakeFSLInt64RandomIndicesNoNulls/524288/2', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 68, 'null_percent': 50.0}
TakeStringMonotonicIndices/524288/2 59.357M items/sec 55.607M items/sec -6.317 {'family_index': 11, 'per_family_instance_index': 2, 'run_name': 'TakeStringMonotonicIndices/524288/2', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 79, 'null_percent': 50.0}
TakeChunkedChunkedFSBRandomIndicesWithNulls/524288/0/8 432.962M items/sec 405.540M items/sec -6.334 {'family_index': 16, 'per_family_instance_index': 8, 'run_name': 'TakeChunkedChunkedFSBRandomIndicesWithNulls/524288/0/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 579, 'byte_width': 8.0, 'null_percent': 0.0}
TakeStringMonotonicIndices/524288/1000 138.629M items/sec 128.420M items/sec -7.364 {'family_index': 11, 'per_family_instance_index': 0, 'run_name': 'TakeStringMonotonicIndices/524288/1000', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 184, 'null_percent': 0.1}
TakeStringMonotonicIndices/524288/10 100.905M items/sec 92.883M items/sec -7.950 {'family_index': 11, 'per_family_instance_index': 1, 'run_name': 'TakeStringMonotonicIndices/524288/10', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 135, 'null_percent': 10.0}
TakeInt64MonotonicIndices/524288/2 221.337M items/sec 202.318M items/sec -8.593 {'family_index': 2, 'per_family_instance_index': 2, 'run_name': 'TakeInt64MonotonicIndices/524288/2', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 305, 'null_percent': 50.0}
TakeChunkedChunkedInt64MonotonicIndices/524288/10 255.436M items/sec 230.562M items/sec -9.738 {'family_index': 14, 'per_family_instance_index': 1, 'run_name': 'TakeChunkedChunkedInt64MonotonicIndices/524288/10', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 344, 'null_percent': 10.0}
TakeChunkedChunkedInt64MonotonicIndices/524288/2 195.065M items/sec 174.561M items/sec -10.511 {'family_index': 14, 'per_family_instance_index': 2, 'run_name': 'TakeChunkedChunkedInt64MonotonicIndices/524288/2', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 260, 'null_percent': 50.0}
TakeChunkedChunkedFSBMonotonicIndices/524288/2/8 195.399M items/sec 174.380M items/sec -10.757 {'family_index': 17, 'per_family_instance_index': 4, 'run_name': 'TakeChunkedChunkedFSBMonotonicIndices/524288/2/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 261, 'byte_width': 8.0, 'null_percent': 50.0}
TakeChunkedChunkedFSBMonotonicIndices/524288/10/8 259.057M items/sec 230.236M items/sec -11.125 {'family_index': 17, 'per_family_instance_index': 2, 'run_name': 'TakeChunkedChunkedFSBMonotonicIndices/524288/10/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 345, 'byte_width': 8.0, 'null_percent': 10.0}
TakeFSLInt64RandomIndicesNoNulls/524288/10 91.682M items/sec 81.282M items/sec -11.344 {'family_index': 6, 'per_family_instance_index': 1, 'run_name': 'TakeFSLInt64RandomIndicesNoNulls/524288/10', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 124, 'null_percent': 10.0}
TakeChunkedFlatInt64MonotonicIndices/524288/2 206.386M items/sec 182.888M items/sec -11.385 {'family_index': 23, 'per_family_instance_index': 2, 'run_name': 'TakeChunkedFlatInt64MonotonicIndices/524288/2', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 274, 'null_percent': 50.0}
TakeFixedSizeBinaryMonotonicIndices/524288/2/8 230.747M items/sec 203.991M items/sec -11.595 {'family_index': 5, 'per_family_instance_index': 4, 'run_name': 'TakeFixedSizeBinaryMonotonicIndices/524288/2/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 304, 'byte_width': 8.0, 'null_percent': 50.0}
TakeChunkedChunkedInt64MonotonicIndices/524288/1000 309.856M items/sec 273.410M items/sec -11.762 {'family_index': 14, 'per_family_instance_index': 0, 'run_name': 'TakeChunkedChunkedInt64MonotonicIndices/524288/1000', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 421, 'null_percent': 0.1}
TakeChunkedFlatInt64MonotonicIndices/524288/10 282.721M items/sec 247.747M items/sec -12.370 {'family_index': 23, 'per_family_instance_index': 1, 'run_name': 'TakeChunkedFlatInt64MonotonicIndices/524288/10', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 388, 'null_percent': 10.0}
TakeChunkedChunkedFSBMonotonicIndices/524288/1000/8 314.628M items/sec 274.680M items/sec -12.697 {'family_index': 17, 'per_family_instance_index': 0, 'run_name': 'TakeChunkedChunkedFSBMonotonicIndices/524288/1000/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 417, 'byte_width': 8.0, 'null_percent': 0.1}
TakeInt64MonotonicIndices/524288/10 325.126M items/sec 283.762M items/sec -12.723 {'family_index': 2, 'per_family_instance_index': 1, 'run_name': 'TakeInt64MonotonicIndices/524288/10', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 435, 'null_percent': 10.0}
TakeFSLInt64RandomIndicesNoNulls/524288/1000 121.619M items/sec 105.817M items/sec -12.993 {'family_index': 6, 'per_family_instance_index': 0, 'run_name': 'TakeFSLInt64RandomIndicesNoNulls/524288/1000', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 162, 'null_percent': 0.1}
TakeChunkedFlatInt64MonotonicIndices/524288/1000 346.368M items/sec 301.325M items/sec -13.004 {'family_index': 23, 'per_family_instance_index': 0, 'run_name': 'TakeChunkedFlatInt64MonotonicIndices/524288/1000', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 476, 'null_percent': 0.1}
TakeChunkedFlatInt64RandomIndicesWithNulls/524288/2 103.172M items/sec 89.469M items/sec -13.282 {'family_index': 22, 'per_family_instance_index': 2, 'run_name': 'TakeChunkedFlatInt64RandomIndicesWithNulls/524288/2', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 137, 'null_percent': 50.0}
TakeChunkedChunkedFSBMonotonicIndices/524288/1/8 430.312M items/sec 371.092M items/sec -13.762 {'family_index': 17, 'per_family_instance_index': 6, 'run_name': 'TakeChunkedChunkedFSBMonotonicIndices/524288/1/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 576, 'byte_width': 8.0, 'null_percent': 100.0}
TakeInt64MonotonicIndices/524288/1000 412.287M items/sec 350.592M items/sec -14.964 {'family_index': 2, 'per_family_instance_index': 0, 'run_name': 'TakeInt64MonotonicIndices/524288/1000', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 553, 'null_percent': 0.1}
TakeChunkedChunkedInt64MonotonicIndices/524288/1 436.243M items/sec 369.718M items/sec -15.250 {'family_index': 14, 'per_family_instance_index': 3, 'run_name': 'TakeChunkedChunkedInt64MonotonicIndices/524288/1', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 582, 'null_percent': 100.0}
TakeFixedSizeBinaryMonotonicIndices/524288/10/8 333.154M items/sec 282.199M items/sec -15.295 {'family_index': 5, 'per_family_instance_index': 2, 'run_name': 'TakeFixedSizeBinaryMonotonicIndices/524288/10/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 444, 'byte_width': 8.0, 'null_percent': 10.0}
TakeChunkedChunkedInt64RandomIndicesNoNulls/524288/2 149.951M items/sec 127.005M items/sec -15.302 {'family_index': 12, 'per_family_instance_index': 2, 'run_name': 'TakeChunkedChunkedInt64RandomIndicesNoNulls/524288/2', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 195, 'null_percent': 50.0}
TakeFixedSizeBinaryRandomIndicesNoNulls/524288/2/8 178.236M items/sec 150.734M items/sec -15.430 {'family_index': 3, 'per_family_instance_index': 4, 'run_name': 'TakeFixedSizeBinaryRandomIndicesNoNulls/524288/2/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 239, 'byte_width': 8.0, 'null_percent': 50.0}
TakeChunkedChunkedFSBRandomIndicesWithNulls/524288/2/8 104.347M items/sec 87.711M items/sec -15.943 {'family_index': 16, 'per_family_instance_index': 4, 'run_name': 'TakeChunkedChunkedFSBRandomIndicesWithNulls/524288/2/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 139, 'byte_width': 8.0, 'null_percent': 50.0}
TakeInt64RandomIndicesWithNulls/524288/2 110.744M items/sec 93.020M items/sec -16.005 {'family_index': 1, 'per_family_instance_index': 2, 'run_name': 'TakeInt64RandomIndicesWithNulls/524288/2', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 149, 'null_percent': 50.0}
TakeFixedSizeBinaryMonotonicIndices/524288/1000/8 416.393M items/sec 349.053M items/sec -16.172 {'family_index': 5, 'per_family_instance_index': 0, 'run_name': 'TakeFixedSizeBinaryMonotonicIndices/524288/1000/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 569, 'byte_width': 8.0, 'null_percent': 0.1}
TakeInt64RandomIndicesNoNulls/524288/2 175.322M items/sec 146.187M items/sec -16.618 {'family_index': 0, 'per_family_instance_index': 2, 'run_name': 'TakeInt64RandomIndicesNoNulls/524288/2', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 234, 'null_percent': 50.0}
TakeChunkedChunkedInt64RandomIndicesNoNulls/524288/1 409.102M items/sec 340.247M items/sec -16.831 {'family_index': 12, 'per_family_instance_index': 3, 'run_name': 'TakeChunkedChunkedInt64RandomIndicesNoNulls/524288/1', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 544, 'null_percent': 100.0}
TakeChunkedChunkedFSBRandomIndicesNoNulls/524288/2/8 154.752M items/sec 126.380M items/sec -18.334 {'family_index': 15, 'per_family_instance_index': 4, 'run_name': 'TakeChunkedChunkedFSBRandomIndicesNoNulls/524288/2/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 205, 'byte_width': 8.0, 'null_percent': 50.0}
TakeFixedSizeBinaryRandomIndicesWithNulls/524288/2/8 115.753M items/sec 94.453M items/sec -18.401 {'family_index': 4, 'per_family_instance_index': 4, 'run_name': 'TakeFixedSizeBinaryRandomIndicesWithNulls/524288/2/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 153, 'byte_width': 8.0, 'null_percent': 50.0}
TakeChunkedFlatInt64RandomIndicesNoNulls/524288/2 170.047M items/sec 137.985M items/sec -18.855 {'family_index': 21, 'per_family_instance_index': 2, 'run_name': 'TakeChunkedFlatInt64RandomIndicesNoNulls/524288/2', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 227, 'null_percent': 50.0}
TakeChunkedChunkedInt64RandomIndicesWithNulls/524288/2 104.949M items/sec 84.181M items/sec -19.789 {'family_index': 13, 'per_family_instance_index': 2, 'run_name': 'TakeChunkedChunkedInt64RandomIndicesWithNulls/524288/2', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 140, 'null_percent': 50.0}
TakeChunkedChunkedFSBRandomIndicesNoNulls/524288/10/8 215.087M items/sec 172.509M items/sec -19.796 {'family_index': 15, 'per_family_instance_index': 2, 'run_name': 'TakeChunkedChunkedFSBRandomIndicesNoNulls/524288/10/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 294, 'byte_width': 8.0, 'null_percent': 10.0}
TakeChunkedChunkedInt64RandomIndicesWithNulls/524288/10 152.248M items/sec 121.846M items/sec -19.969 {'family_index': 13, 'per_family_instance_index': 1, 'run_name': 'TakeChunkedChunkedInt64RandomIndicesWithNulls/524288/10', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 205, 'null_percent': 10.0}
TakeChunkedFlatInt64RandomIndicesWithNulls/524288/10 163.847M items/sec 131.028M items/sec -20.030 {'family_index': 22, 'per_family_instance_index': 1, 'run_name': 'TakeChunkedFlatInt64RandomIndicesWithNulls/524288/10', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 220, 'null_percent': 10.0}
TakeChunkedChunkedFSBRandomIndicesWithNulls/524288/10/8 160.033M items/sec 127.702M items/sec -20.202 {'family_index': 16, 'per_family_instance_index': 2, 'run_name': 'TakeChunkedChunkedFSBRandomIndicesWithNulls/524288/10/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 213, 'byte_width': 8.0, 'null_percent': 10.0}
TakeChunkedFlatInt64MonotonicIndices/524288/1 534.901M items/sec 425.783M items/sec -20.400 {'family_index': 23, 'per_family_instance_index': 3, 'run_name': 'TakeChunkedFlatInt64MonotonicIndices/524288/1', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 708, 'null_percent': 100.0}
TakeFixedSizeBinaryRandomIndicesNoNulls/524288/10/8 265.693M items/sec 210.311M items/sec -20.844 {'family_index': 3, 'per_family_instance_index': 2, 'run_name': 'TakeFixedSizeBinaryRandomIndicesNoNulls/524288/10/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 359, 'byte_width': 8.0, 'null_percent': 10.0}
TakeInt64MonotonicIndices/524288/1 716.931M items/sec 565.379M items/sec -21.139 {'family_index': 2, 'per_family_instance_index': 3, 'run_name': 'TakeInt64MonotonicIndices/524288/1', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 938, 'null_percent': 100.0}
TakeChunkedChunkedFSBRandomIndicesNoNulls/524288/1000/8 242.890M items/sec 190.956M items/sec -21.381 {'family_index': 15, 'per_family_instance_index': 0, 'run_name': 'TakeChunkedChunkedFSBRandomIndicesNoNulls/524288/1000/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 322, 'byte_width': 8.0, 'null_percent': 0.1}
TakeChunkedChunkedInt64RandomIndicesWithNulls/524288/1000 222.384M items/sec 173.899M items/sec -21.802 {'family_index': 13, 'per_family_instance_index': 0, 'run_name': 'TakeChunkedChunkedInt64RandomIndicesWithNulls/524288/1000', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 295, 'null_percent': 0.1}
TakeChunkedFlatInt64RandomIndicesWithNulls/524288/1000 242.960M items/sec 189.830M items/sec -21.868 {'family_index': 22, 'per_family_instance_index': 0, 'run_name': 'TakeChunkedFlatInt64RandomIndicesWithNulls/524288/1000', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 323, 'null_percent': 0.1}
TakeChunkedFlatInt64RandomIndicesNoNulls/524288/1 519.779M items/sec 404.965M items/sec -22.089 {'family_index': 21, 'per_family_instance_index': 3, 'run_name': 'TakeChunkedFlatInt64RandomIndicesNoNulls/524288/1', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 717, 'null_percent': 100.0}
TakeInt64RandomIndicesWithNulls/524288/10 180.171M items/sec 139.484M items/sec -22.582 {'family_index': 1, 'per_family_instance_index': 1, 'run_name': 'TakeInt64RandomIndicesWithNulls/524288/10', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 240, 'null_percent': 10.0}
TakeInt64RandomIndicesNoNulls/524288/10 263.139M items/sec 202.899M items/sec -22.893 {'family_index': 0, 'per_family_instance_index': 1, 'run_name': 'TakeInt64RandomIndicesNoNulls/524288/10', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 351, 'null_percent': 10.0}
TakeChunkedChunkedFSBRandomIndicesWithNulls/524288/1000/8 229.423M items/sec 176.622M items/sec -23.014 {'family_index': 16, 'per_family_instance_index': 0, 'run_name': 'TakeChunkedChunkedFSBRandomIndicesWithNulls/524288/1000/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 307, 'byte_width': 8.0, 'null_percent': 0.1}
TakeChunkedFlatInt64RandomIndicesNoNulls/524288/1000 273.918M items/sec 210.633M items/sec -23.104 {'family_index': 21, 'per_family_instance_index': 0, 'run_name': 'TakeChunkedFlatInt64RandomIndicesNoNulls/524288/1000', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 368, 'null_percent': 0.1}
TakeChunkedChunkedInt64RandomIndicesNoNulls/524288/1000 231.446M items/sec 176.407M items/sec -23.781 {'family_index': 12, 'per_family_instance_index': 0, 'run_name': 'TakeChunkedChunkedInt64RandomIndicesNoNulls/524288/1000', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 308, 'null_percent': 0.1}
TakeFixedSizeBinaryMonotonicIndices/524288/1/8 750.779M items/sec 571.303M items/sec -23.905 {'family_index': 5, 'per_family_instance_index': 6, 'run_name': 'TakeFixedSizeBinaryMonotonicIndices/524288/1/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 994, 'byte_width': 8.0, 'null_percent': 100.0}
TakeFixedSizeBinaryRandomIndicesWithNulls/524288/10/8 187.491M items/sec 142.620M items/sec -23.933 {'family_index': 4, 'per_family_instance_index': 2, 'run_name': 'TakeFixedSizeBinaryRandomIndicesWithNulls/524288/10/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 247, 'byte_width': 8.0, 'null_percent': 10.0}
TakeFixedSizeBinaryRandomIndicesNoNulls/524288/1000/8 304.735M items/sec 230.960M items/sec -24.210 {'family_index': 3, 'per_family_instance_index': 0, 'run_name': 'TakeFixedSizeBinaryRandomIndicesNoNulls/524288/1000/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 409, 'byte_width': 8.0, 'null_percent': 0.1}
TakeInt64RandomIndicesNoNulls/524288/1 687.284M items/sec 519.230M items/sec -24.452 {'family_index': 0, 'per_family_instance_index': 3, 'run_name': 'TakeInt64RandomIndicesNoNulls/524288/1', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 905, 'null_percent': 100.0}
TakeInt64RandomIndicesWithNulls/524288/1000 283.026M items/sec 212.189M items/sec -25.028 {'family_index': 1, 'per_family_instance_index': 0, 'run_name': 'TakeInt64RandomIndicesWithNulls/524288/1000', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 376, 'null_percent': 0.1}
TakeFixedSizeBinaryRandomIndicesNoNulls/524288/1/8 711.122M items/sec 531.042M items/sec -25.323 {'family_index': 3, 'per_family_instance_index': 6, 'run_name': 'TakeFixedSizeBinaryRandomIndicesNoNulls/524288/1/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 942, 'byte_width': 8.0, 'null_percent': 100.0}
TakeInt64RandomIndicesNoNulls/524288/1000 298.449M items/sec 222.481M items/sec -25.454 {'family_index': 0, 'per_family_instance_index': 0, 'run_name': 'TakeInt64RandomIndicesNoNulls/524288/1000', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 401, 'null_percent': 0.1}
TakeChunkedChunkedFSBRandomIndicesNoNulls/524288/1/8 425.031M items/sec 315.623M items/sec -25.741 {'family_index': 15, 'per_family_instance_index': 6, 'run_name': 'TakeChunkedChunkedFSBRandomIndicesNoNulls/524288/1/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 566, 'byte_width': 8.0, 'null_percent': 100.0}
TakeChunkedFlatInt64RandomIndicesNoNulls/524288/10 247.351M items/sec 180.840M items/sec -26.890 {'family_index': 21, 'per_family_instance_index': 1, 'run_name': 'TakeChunkedFlatInt64RandomIndicesNoNulls/524288/10', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 327, 'null_percent': 10.0}
TakeChunkedChunkedInt64RandomIndicesNoNulls/524288/10 209.893M items/sec 153.392M items/sec -26.919 {'family_index': 12, 'per_family_instance_index': 1, 'run_name': 'TakeChunkedChunkedInt64RandomIndicesNoNulls/524288/10', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 282, 'null_percent': 10.0}
TakeFixedSizeBinaryRandomIndicesWithNulls/524288/1000/8 293.743M items/sec 204.054M items/sec -30.533 {'family_index': 4, 'per_family_instance_index': 0, 'run_name': 'TakeFixedSizeBinaryRandomIndicesWithNulls/524288/1000/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 391, 'byte_width': 8.0, 'null_percent': 0.1}
I've run the Take micro-benchmarks locally with this (AMD Zen 2, gcc 12.3.0). The changes are a bit all over the place and show that compilers are generally capricious and difficult to steer towards "optimal" code :-)
Do you want me to copy and paste the code so that gcc can inline modular code correctly?
EDIT: The things that the PR improves are much faster. The ChunkedChunked benchmarks swing anywhere from -30% to +30% because they benchmark concatenation, which has high variance due to the number of huge memory operations it performs. I'm improving ChunkedChunked (TakeCC) in my follow-up PR.
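For anyone reading the table above, the `change %` column appears to be the relative difference between contender and baseline throughput. Below is a minimal sketch of that arithmetic, not the benchmark tool's actual code. The helper names `change_percent` and `is_regression` are hypothetical, and the 5% cutoff is an assumption inferred from where the table splits (-4.547 is not listed as a regression, -5.006 is).

```python
def change_percent(baseline: float, contender: float) -> float:
    """Relative change of contender vs. baseline throughput, in percent."""
    return (contender - baseline) / baseline * 100.0


def is_regression(baseline: float, contender: float,
                  threshold_percent: float = 5.0) -> bool:
    """Flag a slowdown larger than the threshold.

    Throughput is in items/sec, so higher is better. The 5% default is an
    assumption inferred from the table, not a documented setting.
    """
    return change_percent(baseline, contender) < -threshold_percent


# (baseline, contender) in M items/sec, copied from two rows of the table.
rows = {
    "TakeFixedSizeBinaryRandomIndicesWithNulls/524288/1000/8": (293.743, 204.054),
    "TakeFSLInt64RandomIndicesWithNulls/524288/10": (68.597, 65.478),
}

for name, (baseline, contender) in rows.items():
    pct = change_percent(baseline, contender)
    flag = "regression" if is_regression(baseline, contender) else "ok"
    print(f"{name}: {pct:.3f}% ({flag})")
```

With `repetitions: 1` per benchmark, a single noisy run can push a row across the cutoff in either direction, so rows near the threshold are best read with caution.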
Do you want me to copy and paste the code so that gcc can inline modular code correctly?
No, it's not important. I was just posting the results for information.
After merging your PR, Conbench analyzed the 5 benchmarking runs that have been run so far on merge-commit 4f890977650a36abaaec74ad2eaac31c04b5bf76.
There were no benchmark performance regressions. 🎉
The full Conbench report has more details. It also includes information about 20 possible false positives for unstable benchmarks that are known to sometimes produce them.