[C++] Crash in TempVectorStack::alloc when using Hashing32::HashBatch independently
Describe the bug, including details regarding any error messages, version, and platform.
This issue is similar to https://github.com/apache/arrow/pull/40007, but they are different.
I want to use the Hashing32::HashBatch API to produce a hash array for a batch. Although Hashing32 and Hashing64 are used in the join-related code, they can also be used independently, like the code below:
auto arr = arrow::ArrayFromJSON(arrow::int32(), "[9,2,6]");
const int batch_len = arr->length();
arrow::compute::ExecBatch exec_batch({arr}, batch_len);
auto ctx = arrow::compute::default_exec_context();
arrow::util::TempVectorStack stack;
ASSERT_OK(stack.Init(ctx->memory_pool(), batch_len * sizeof(uint32_t))); // I allocate only the stack size I need.
std::vector<uint32_t> hashes(batch_len);
std::vector<arrow::compute::KeyColumnArray> temp_column_arrays;
ASSERT_OK(arrow::compute::Hashing32::HashBatch(
    exec_batch, hashes.data(), temp_column_arrays,
    ctx->cpu_info()->hardware_flags(), &stack, 0, batch_len));
The crash stack in HashBatch is:
arrow::compute::Hashing32::HashBatch
arrow::compute::Hashing32::HashMultiColumn
arrow::util::TempVectorHolder<unsigned int>::TempVectorHolder
arrow::util::TempVectorStack::alloc
ARROW_DCHECK(top_ <= buffer_size_); // top_=4176, buffer_size_=160
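As a sanity check against the DCHECK values, the very first temp-vector allocation already overflows the stack. Here is a minimal back-of-the-envelope sketch, assuming the 64-byte per-allocation padding and 16 bytes of guard metadata that TempVectorStack in arrow/compute/util.cc appears to use:
constexpr int64_t kMiniBatchLength = 1024;             // util::MiniBatch::kMiniBatchLength
constexpr int64_t kSimdPadding = 64;                   // per-allocation padding (assumed)
constexpr int64_t kGuardBytes = 2 * sizeof(uint64_t);  // 16 bytes of metadata (assumed)
// The first temp vector holds kMiniBatchLength uint32_t hashes:
constexpr int64_t kFirstAlloc =
    kMiniBatchLength * sizeof(uint32_t) + kSimdPadding + kGuardBytes;
static_assert(kFirstAlloc == 4176, "matches top_ in the DCHECK message");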
The reason is the code below: https://github.com/apache/arrow/blob/7e286dd004a8fcf2de0f58615793338076741208/cpp/src/arrow/compute/key_hash.cc#L385-L387
The holder uses max_batch_size, which is 1024, as its num_elements; that is far more than the temp stack's initial buffer_size.
I know that HashBatch is currently only used in hash-join and related code. For joins, the upper level has already sliced the rows, ensuring that each input batch is at most util::MiniBatch::kMiniBatchLength rows and that the stack size is large enough. But HashBatch can also be used independently. So maybe we could use num_rows rather than util::MiniBatch::kMiniBatchLength in the HashBatch-related APIs, as sketched below?
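A minimal sketch of that direction, assuming num_rows is the per-call row count already available inside HashMultiColumn() (this is only an illustration, not the actual patch):
// Size the per-mini-batch temp vectors by the real row count instead of
// always using the full mini-batch constant:
uint32_t max_batch_size = std::min(
    num_rows, static_cast<uint32_t>(util::MiniBatch::kMiniBatchLength));
util::TempVectorHolder<uint32_t> hash_temp(ctx->stack, max_batch_size);
With a clamp like this, a caller hashing a 3-row batch would only need a stack sized for 3-element temp vectors (plus padding and metadata).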
Component(s)
C++
cc @kou, PTAL at this issue?
Can you provide a buildable C++ code that reproduces this problem?
Of course.
#include <iostream>
#include <vector>

#include <arrow/compute/exec.h>
#include <arrow/compute/key_hash.h>
#include <arrow/compute/light_array.h>
#include <arrow/compute/util.h>
#include <arrow/testing/gtest_util.h>

#include "gtest/gtest.h"
TEST(HashBatch, BasicTest) {
  auto arr = arrow::ArrayFromJSON(arrow::int32(), "[9,2,6]");
  const int batch_len = arr->length();
  arrow::compute::ExecBatch exec_batch({arr}, batch_len);
  auto ctx = arrow::compute::default_exec_context();
  arrow::util::TempVectorStack stack;
  // The stack is sized only for the output hashes; HashBatch's internal
  // temp vectors are sized by kMiniBatchLength, so this triggers the DCHECK.
  ASSERT_OK(stack.Init(ctx->memory_pool(), batch_len * sizeof(uint32_t)));

  std::vector<uint32_t> hashes(batch_len);
  std::vector<arrow::compute::KeyColumnArray> temp_column_arrays;
  ASSERT_OK(arrow::compute::Hashing32::HashBatch(
      exec_batch, hashes.data(), temp_column_arrays,
      ctx->cpu_info()->hardware_flags(), &stack, 0, batch_len));

  for (int i = 0; i < batch_len; i++) {
    std::cout << hashes[i] << " ";
  }
}
cc @kou, do you think this problem needs to be solved?
Sorry, I missed this.
Thanks. I was able to run the code:
diff --git a/cpp/src/arrow/compute/key_hash_test.cc b/cpp/src/arrow/compute/key_hash_test.cc
index c998df7169..ccfddaa645 100644
--- a/cpp/src/arrow/compute/key_hash_test.cc
+++ b/cpp/src/arrow/compute/key_hash_test.cc
@@ -311,5 +311,24 @@ TEST(VectorHash, FixedLengthTailByteSafety) {
HashFixedLengthFrom(/*key_length=*/19, /*num_rows=*/64, /*start_row=*/63);
}
+TEST(HashBatch, BasicTest) {
+ auto arr = arrow::ArrayFromJSON(arrow::int32(), "[9,2,6]");
+ const int batch_len = arr->length();
+ arrow::compute::ExecBatch exec_batch({arr}, batch_len);
+ auto ctx = arrow::compute::default_exec_context();
+ arrow::util::TempVectorStack stack;
+ ASSERT_OK(stack.Init(ctx->memory_pool(), batch_len * sizeof(uint32_t)));
+
+ std::vector<uint32_t> hashes(batch_len);
+ std::vector<arrow::compute::KeyColumnArray> temp_column_arrays;
+ ASSERT_OK(arrow::compute::Hashing32::HashBatch(
+ exec_batch, hashes.data(), temp_column_arrays,
+ ctx->cpu_info()->hardware_flags(), &stack, 0, batch_len));
+
+ for (int i = 0; i < batch_len; i++) {
+ std::cout << hashes[i] << " ";
+ }
+}
+
} // namespace compute
} // namespace arrow
In general, allocating only the required size is preferred. So using num_rows rather than util::MiniBatch::kMiniBatchLength in the HashBatch-related APIs may be better. (Sorry, I can't determine whether num_rows is the correct size for it right now.)
But it seems that stack.Init(ctx->memory_pool(), batch_len * sizeof(uint32_t)) isn't enough for this case, because HashMultiColumn() needs at least 3 allocations:
https://github.com/apache/arrow/blob/605f8a792c388afb2230b1f19e0f3e4df90d5abe/cpp/src/arrow/compute/key_hash.cc#L387-L395
And each allocation also needs 16 bytes of metadata:
https://github.com/apache/arrow/blob/605f8a792c388afb2230b1f19e0f3e4df90d5abe/cpp/src/arrow/compute/util.cc#L35-L44
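For reference, here is a sketch of an Init size that should survive HashMultiColumn() with the current code; the three-allocation count comes from the lines above, while the 64-byte padding and 16-byte metadata constants are assumptions read from util.cc:
// Hypothetical sizing for the unpatched HashBatch: three uint32_t temp
// vectors of kMiniBatchLength elements, each with (assumed) padding and
// guard metadata.
constexpr int64_t kNumTempVectors = 3;
const int64_t per_alloc =
    arrow::util::MiniBatch::kMiniBatchLength * sizeof(uint32_t) +
    /*padding=*/64 + /*metadata=*/16;
ASSERT_OK(stack.Init(ctx->memory_pool(), kNumTempVectors * per_alloc));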
Anyway, could you try this?
Issue resolved by pull request 40484: https://github.com/apache/arrow/pull/40484