Q23 fails when running TPC-DS SF=1 because an invalid offset buffer is exported for an empty StringArray.
Describe the bug
Running TPC-DS SF=1 using queries-spark/q23.sql in datafusion-benchmarks has failed since https://github.com/apache/datafusion-comet/pull/1605 was merged. The exception is raised by the native side:
org.apache.comet.CometNativeException: range end index 18446744072743568078 out of range for slice of length 0
at comet::errors::init::{{closure}}(/home/wherobots/datafusion-comet/native/core/src/errors.rs:151)
at <alloc::boxed::Box<F,A> as core::ops::function::Fn<Args>>::call(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/alloc/src/boxed.rs:2007)
at std::panicking::rust_panic_with_hook(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/std/src/panicking.rs:836)
at std::panicking::begin_panic_handler::{{closure}}(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/std/src/panicking.rs:701)
at std::sys::backtrace::__rust_end_short_backtrace(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/std/src/sys/backtrace.rs:168)
at rust_begin_unwind(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/std/src/panicking.rs:692)
at core::panicking::panic_fmt(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/core/src/panicking.rs:75)
at core::slice::index::slice_end_index_len_fail::do_panic::runtime(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/core/src/panic.rs:218)
at core::slice::index::slice_end_index_len_fail::do_panic(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/core/src/intrinsics/mod.rs:3869)
at core::slice::index::slice_end_index_len_fail(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/core/src/panic.rs:223)
at <core::ops::range::Range<usize> as core::slice::index::SliceIndex<[T]>>::index(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/core/src/slice/index.rs:437)
at core::slice::index::<impl core::ops::index::Index<I> for [T]>::index(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/core/src/slice/index.rs:16)
at arrow_data::transform::variable_size::extend_offset_values(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/arrow-data-54.2.1/src/transform/variable_size.rs:38)
at arrow_data::transform::variable_size::build_extend::{{closure}}(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/arrow-data-54.2.1/src/transform/variable_size.rs:57)
at <alloc::boxed::Box<F,A> as core::ops::function::Fn<Args>>::call(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/alloc/src/boxed.rs:2007)
at arrow_data::transform::MutableArrayData::extend(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/arrow-data-54.2.1/src/transform/mod.rs:722)
at comet::execution::operators::copy::copy_array(/home/wherobots/datafusion-comet/native/core/src/execution/operators/copy.rs:233)
at comet::execution::operators::copy::copy_or_unpack_array(/home/wherobots/datafusion-comet/native/core/src/execution/operators/copy.rs:280)
at comet::execution::operators::copy::CopyStream::copy::{{closure}}(/home/wherobots/datafusion-comet/native/core/src/execution/operators/copy.rs:196)
at core::iter::adapters::map::map_try_fold::{{closure}}(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/core/src/iter/adapters/map.rs:95)
at core::iter::traits::iterator::Iterator::try_fold(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/core/src/iter/traits/iterator.rs:2370)
at <core::iter::adapters::map::Map<I,F> as core::iter::traits::iterator::Iterator>::try_fold(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/core/src/iter/adapters/map.rs:121)
at <core::iter::adapters::GenericShunt<I,R> as core::iter::traits::iterator::Iterator>::try_fold(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/core/src/iter/adapters/mod.rs:191)
at core::iter::traits::iterator::Iterator::try_for_each(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/core/src/iter/traits/iterator.rs:2431)
at <core::iter::adapters::GenericShunt<I,R> as core::iter::traits::iterator::Iterator>::next(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/core/src/iter/adapters/mod.rs:174)
at alloc::vec::Vec<T,A>::extend_desugared(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/alloc/src/vec/mod.rs:3535)
at <alloc::vec::Vec<T,A> as alloc::vec::spec_extend::SpecExtend<T,I>>::spec_extend(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/alloc/src/vec/spec_extend.rs:19)
at <alloc::vec::Vec<T> as alloc::vec::spec_from_iter_nested::SpecFromIterNested<T,I>>::from_iter(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/alloc/src/vec/spec_from_iter_nested.rs:42)
at <alloc::vec::Vec<T> as alloc::vec::spec_from_iter::SpecFromIter<T,I>>::from_iter(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/alloc/src/vec/spec_from_iter.rs:34)
at <alloc::vec::Vec<T> as core::iter::traits::collect::FromIterator<T>>::from_iter(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/alloc/src/vec/mod.rs:3427)
at core::iter::traits::iterator::Iterator::collect(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/core/src/iter/traits/iterator.rs:1971)
at <core::result::Result<V,E> as core::iter::traits::collect::FromIterator<core::result::Result<A,E>>>::from_iter::{{closure}}(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/core/src/result.rs:1985)
at core::iter::adapters::try_process(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/core/src/iter/adapters/mod.rs:160)
at <core::result::Result<V,E> as core::iter::traits::collect::FromIterator<core::result::Result<A,E>>>::from_iter(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/core/src/result.rs:1985)
at core::iter::traits::iterator::Iterator::collect(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/core/src/iter/traits/iterator.rs:1971)
at comet::execution::operators::copy::CopyStream::copy(/home/wherobots/datafusion-comet/native/core/src/execution/operators/copy.rs:193)
at <comet::execution::operators::copy::CopyStream as futures_core::stream::Stream>::poll_next::{{closure}}(/home/wherobots/datafusion-comet/native/core/src/execution/operators/copy.rs:214)
at core::task::poll::Poll<T>::map(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/core/src/task/poll.rs:54)
at <comet::execution::operators::copy::CopyStream as futures_core::stream::Stream>::poll_next(/home/wherobots/datafusion-comet/native/core/src/execution/operators/copy.rs:213)
at <core::pin::Pin<P> as futures_core::stream::Stream>::poll_next(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-core-0.3.31/src/stream.rs:130)
at <S as futures_core::stream::TryStream>::try_poll_next(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-core-0.3.31/src/stream.rs:206)
at <futures_util::stream::try_stream::try_fold::TryFold<St,Fut,T,F> as core::future::future::Future>::poll(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-util-0.3.31/src/stream/try_stream/try_fold.rs:81)
at datafusion_physical_plan::joins::hash_join::collect_left_input::{{closure}}(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/datafusion-physical-plan-46.0.0/src/joins/hash_join.rs:960)
at <futures_util::future::future::map::Map<Fut,F> as core::future::future::Future>::poll(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-util-0.3.31/src/future/future/map.rs:55)
at <futures_util::future::future::Map<Fut,F> as core::future::future::Future>::poll(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-util-0.3.31/src/lib.rs:86)
at <core::pin::Pin<P> as core::future::future::Future>::poll(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/core/src/future/future.rs:124)
at <futures_util::future::future::shared::Shared<Fut> as core::future::future::Future>::poll(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-util-0.3.31/src/future/future/shared.rs:322)
at futures_util::future::future::FutureExt::poll_unpin(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-util-0.3.31/src/future/future/mod.rs:558)
at datafusion_physical_plan::joins::utils::OnceFut<T>::get_shared(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/datafusion-physical-plan-46.0.0/src/joins/utils.rs:1091)
at datafusion_physical_plan::joins::hash_join::HashJoinStream::collect_build_side(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/datafusion-physical-plan-46.0.0/src/joins/hash_join.rs:1406)
at datafusion_physical_plan::joins::hash_join::HashJoinStream::poll_next_impl(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/datafusion-physical-plan-46.0.0/src/joins/hash_join.rs:1381)
at <datafusion_physical_plan::joins::hash_join::HashJoinStream as futures_core::stream::Stream>::poll_next(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/datafusion-physical-plan-46.0.0/src/joins/hash_join.rs:1628)
at <core::pin::Pin<P> as futures_core::stream::Stream>::poll_next(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-core-0.3.31/src/stream.rs:130)
at futures_util::stream::stream::StreamExt::poll_next_unpin(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-util-0.3.31/src/stream/stream/mod.rs:1638)
at <datafusion_physical_plan::projection::ProjectionStream as futures_core::stream::Stream>::poll_next(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/datafusion-physical-plan-46.0.0/src/projection.rs:354)
at <core::pin::Pin<P> as futures_core::stream::Stream>::poll_next(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-core-0.3.31/src/stream.rs:130)
at futures_util::stream::stream::StreamExt::poll_next_unpin(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-util-0.3.31/src/stream/stream/mod.rs:1638)
at <datafusion_physical_plan::projection::ProjectionStream as futures_core::stream::Stream>::poll_next(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/datafusion-physical-plan-46.0.0/src/projection.rs:354)
at <core::pin::Pin<P> as futures_core::stream::Stream>::poll_next(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-core-0.3.31/src/stream.rs:130)
at futures_util::stream::stream::StreamExt::poll_next_unpin(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-util-0.3.31/src/stream/stream/mod.rs:1638)
at <comet::execution::operators::copy::CopyStream as futures_core::stream::Stream>::poll_next(/home/wherobots/datafusion-comet/native/core/src/execution/operators/copy.rs:213)
at <core::pin::Pin<P> as futures_core::stream::Stream>::poll_next(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-core-0.3.31/src/stream.rs:130)
at futures_util::stream::stream::StreamExt::poll_next_unpin(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-util-0.3.31/src/stream/stream/mod.rs:1638)
at datafusion_physical_plan::joins::hash_join::HashJoinStream::fetch_probe_batch(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/datafusion-physical-plan-46.0.0/src/joins/hash_join.rs:1427)
at datafusion_physical_plan::joins::hash_join::HashJoinStream::poll_next_impl(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/datafusion-physical-plan-46.0.0/src/joins/hash_join.rs:1384)
at <datafusion_physical_plan::joins::hash_join::HashJoinStream as futures_core::stream::Stream>::poll_next(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/datafusion-physical-plan-46.0.0/src/joins/hash_join.rs:1628)
at <core::pin::Pin<P> as futures_core::stream::Stream>::poll_next(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-core-0.3.31/src/stream.rs:130)
at futures_util::stream::stream::StreamExt::poll_next_unpin(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-util-0.3.31/src/stream/stream/mod.rs:1638)
at <datafusion_physical_plan::projection::ProjectionStream as futures_core::stream::Stream>::poll_next(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/datafusion-physical-plan-46.0.0/src/projection.rs:354)
at <core::pin::Pin<P> as futures_core::stream::Stream>::poll_next(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-core-0.3.31/src/stream.rs:130)
at futures_util::stream::stream::StreamExt::poll_next_unpin(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-util-0.3.31/src/stream/stream/mod.rs:1638)
at <datafusion_physical_plan::projection::ProjectionStream as futures_core::stream::Stream>::poll_next(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/datafusion-physical-plan-46.0.0/src/projection.rs:354)
at <core::pin::Pin<P> as futures_core::stream::Stream>::poll_next(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-core-0.3.31/src/stream.rs:130)
at futures_util::stream::stream::StreamExt::poll_next_unpin(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-util-0.3.31/src/stream/stream/mod.rs:1638)
at <datafusion_physical_plan::aggregates::row_hash::GroupedHashAggregateStream as futures_core::stream::Stream>::poll_next(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/datafusion-physical-plan-46.0.0/src/aggregates/row_hash.rs:647)
at <core::pin::Pin<P> as futures_core::stream::Stream>::poll_next(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-core-0.3.31/src/stream.rs:130)
at futures_util::stream::stream::StreamExt::poll_next_unpin(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-util-0.3.31/src/stream/stream/mod.rs:1638)
at <futures_util::stream::stream::next::Next<St> as core::future::future::Future>::poll(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-util-0.3.31/src/stream/stream/next.rs:32)
at futures_util::future::future::FutureExt::poll_unpin(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-util-0.3.31/src/future/future/mod.rs:558)
at <futures_util::async_await::poll::PollOnce<F> as core::future::future::Future>::poll(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-util-0.3.31/src/async_await/poll.rs:37)
at comet::execution::jni_api::Java_org_apache_comet_Native_executePlan::{{closure}}::{{closure}}(/home/wherobots/datafusion-comet/native/core/src/execution/jni_api.rs:544)
at tokio::runtime::park::CachedParkThread::block_on::{{closure}}(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.44.1/src/runtime/park.rs:284)
at tokio::task::coop::with_budget(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.44.1/src/task/coop/mod.rs:167)
at tokio::task::coop::budget(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.44.1/src/task/coop/mod.rs:133)
at tokio::runtime::park::CachedParkThread::block_on(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.44.1/src/runtime/park.rs:284)
at tokio::runtime::context::blocking::BlockingRegionGuard::block_on(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.44.1/src/runtime/context/blocking.rs:66)
at tokio::runtime::scheduler::multi_thread::MultiThread::block_on::{{closure}}(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.44.1/src/runtime/scheduler/multi_thread/mod.rs:87)
at tokio::runtime::context::runtime::enter_runtime(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.44.1/src/runtime/context/runtime.rs:65)
at tokio::runtime::scheduler::multi_thread::MultiThread::block_on(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.44.1/src/runtime/scheduler/multi_thread/mod.rs:86)
at tokio::runtime::runtime::Runtime::block_on_inner(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.44.1/src/runtime/runtime.rs:370)
at tokio::runtime::runtime::Runtime::block_on(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.44.1/src/runtime/runtime.rs:342)
at comet::execution::jni_api::Java_org_apache_comet_Native_executePlan::{{closure}}(/home/wherobots/datafusion-comet/native/core/src/execution/jni_api.rs:544)
at comet::errors::curry::{{closure}}(/home/wherobots/datafusion-comet/native/core/src/errors.rs:485)
at std::panicking::try::do_call(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/std/src/panicking.rs:584)
at __rust_try(__internal__:0)
at std::panicking::try(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/std/src/panicking.rs:547)
at std::panic::catch_unwind(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/std/src/panic.rs:358)
at comet::errors::try_unwrap_or_throw(/home/wherobots/datafusion-comet/native/core/src/errors.rs:499)
at Java_org_apache_comet_Native_executePlan(/home/wherobots/datafusion-comet/native/core/src/execution/jni_api.rs:498)
at <unknown>(__internal__:0)
at org.apache.comet.Native.executePlan(Native Method)
at org.apache.comet.CometExecIterator.$anonfun$getNextBatch$1(CometExecIterator.scala:137)
at org.apache.comet.CometExecIterator.$anonfun$getNextBatch$1$adapted(CometExecIterator.scala:135)
at org.apache.comet.vector.NativeUtil.getNextBatch(NativeUtil.scala:157)
at org.apache.comet.CometExecIterator.getNextBatch(CometExecIterator.scala:135)
at org.apache.comet.CometExecIterator.hasNext(CometExecIterator.scala:156)
at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
at org.apache.comet.CometBatchIterator.hasNext(CometBatchIterator.java:50)
at org.apache.comet.Native.executePlan(Native Method)
at org.apache.comet.CometExecIterator.$anonfun$getNextBatch$1(CometExecIterator.scala:137)
at org.apache.comet.CometExecIterator.$anonfun$getNextBatch$1$adapted(CometExecIterator.scala:135)
at org.apache.comet.vector.NativeUtil.getNextBatch(NativeUtil.scala:157)
at org.apache.comet.CometExecIterator.getNextBatch(CometExecIterator.scala:135)
at org.apache.comet.CometExecIterator.hasNext(CometExecIterator.scala:156)
at org.apache.spark.sql.comet.execution.shuffle.CometNativeShuffleWriter.write(CometNativeShuffleWriter.scala:101)
at org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:59)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:104)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54)
at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:166)
at org.apache.spark.scheduler.Task.run(Task.scala:141)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:620)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:94)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:623)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
at java.base/java.lang.Thread.run(Thread.java:1589)
This is caused by auto-broadcasting the smaller side, which contains empty record batches. The empty StringArrays in those batches were not exported correctly through the Arrow C Data Interface. The very large value 18446744072743568078 in the error message is the first value in the offset buffer. It should be 0 when the array is empty (see the Arrow Columnar Format spec for details), but it turns out to be garbage data.
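To make the size of that number less mysterious, here is a small sketch of the arithmetic. It assumes (this is an inference, not something confirmed by the trace) that the 32-bit offset read from the buffer was negative and was widened to a 64-bit index with a plain cast. The helper name `offset_as_index` is hypothetical.

```rust
/// Widen a signed 32-bit offset the way an `as usize` cast does on a 64-bit
/// target: sign-extend to 64 bits, then reinterpret the bits as unsigned.
fn offset_as_index(offset: i32) -> u64 {
    offset as i64 as u64
}

fn main() {
    // Value reported in the panic message.
    let reported: u64 = 18446744072743568078;

    // Reinterpret as signed, then narrow to 32 bits to recover a candidate
    // raw offset (assumption: the offset buffer holds 32-bit offsets).
    let raw = reported as i64 as i32;
    println!("candidate raw offset: {raw}");
    assert_eq!(raw, -965_983_538);

    // Round trip: widening the candidate gives back the reported index.
    assert_eq!(offset_as_index(raw), reported);

    // A properly initialized empty array would yield index 0 instead.
    assert_eq!(offset_as_index(0), 0);
}
```

Under that assumption the reported index is 2^64 - 965983538, which is what a leftover negative 32-bit value in uninitialized memory would look like after widening.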
There have been earlier efforts to fix problems with exporting empty variable-size binary arrays: https://github.com/apache/arrow/issues/40038 and the corresponding PR https://github.com/apache/arrow/pull/40043, which exports a non-null offset buffer for empty arrays. However, that fix still has one problem: the newly allocated offset buffer is not initialized, which leaves a garbage offset value in the buffer and produces this failure.
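The invariant at stake can be sketched in a few lines. This is only an illustration in Rust, not the arrow-java code: `offsets_for_export` and `value_range` are hypothetical helpers. The first shows the producer side writing an explicit 0 for an empty array, and the second shows a consumer-side check that would reject a garbage offset instead of panicking on a slice.

```rust
use std::ops::Range;

/// Hypothetical producer-side helper: offsets to export for a
/// variable-size array. An empty array gets a single, explicit 0.
fn offsets_for_export(offsets: &[i32]) -> Vec<i32> {
    if offsets.is_empty() {
        vec![0i32]
    } else {
        offsets.to_vec()
    }
}

/// Hypothetical consumer-side check: byte range of the values buffer
/// covered by `offsets`, or an error if the offsets are not usable.
fn value_range(offsets: &[i32], values_len: usize) -> Result<Range<usize>, String> {
    let first = *offsets.first().ok_or("offsets buffer is empty")?;
    let last = *offsets.last().ok_or("offsets buffer is empty")?;
    let start = usize::try_from(first).map_err(|_| format!("negative offset {first}"))?;
    let end = usize::try_from(last).map_err(|_| format!("negative offset {last}"))?;
    if start > end || end > values_len {
        return Err(format!("range {start}..{end} out of bounds for {values_len} bytes"));
    }
    Ok(start..end)
}

fn main() {
    // Correctly exported empty array: one zero offset, no value bytes.
    let good = offsets_for_export(&[]);
    assert_eq!(value_range(&good, 0), Ok(0..0));

    // Uninitialized offset (illustrative garbage value): rejected up front.
    assert!(value_range(&[-965_983_538], 0).is_err());
}
```

The point of the sketch is that zero-filling on the exporting side is enough on its own. The consumer-side check is optional hardening that turns the failure into an error rather than a panic.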
This problem cannot be reproduced on recent versions of macOS, because macOS fills freed memory blocks with 0. Zero happens to be the valid initial value for the offset buffer, so the problem is masked there.
Steps to reproduce
Run TPC-DS benchmark on Linux using https://github.com/apache/datafusion-benchmarks:
spark-submit \
--master local[8] \
--conf spark.driver.memory=3g \
--conf spark.memory.offHeap.enabled=true \
--conf spark.memory.offHeap.size=16g \
--conf spark.jars=$COMET_JAR_PR \
--conf spark.driver.extraClassPath=$COMET_JAR_PR \
--conf spark.executor.extraClassPath=$COMET_JAR_PR \
--conf spark.sql.extensions=org.apache.comet.CometSparkSessionExtensions \
--conf spark.shuffle.manager=org.apache.spark.sql.comet.execution.shuffle.CometShuffleManager \
--conf spark.comet.enabled=true \
--conf spark.comet.exec.shuffle.enabled=true \
--conf spark.comet.exec.shuffle.mode=auto \
--conf spark.comet.exec.shuffle.compression.codec=lz4 \
--conf spark.comet.exec.replaceSortMergeJoin=false \
--conf spark.comet.exec.sortMergeJoinWithJoinFilter.enabled=false \
--conf spark.comet.cast.allowIncompatible=true \
--conf spark.comet.exec.shuffle.fallbackToColumnar=true \
tpcbench.py \
--benchmark tpcds \
--data $TPCDS_DATA \
--queries ../../tpcds/queries-spark \
--output tpc-results
It will fail at the second query in Q23.
Expected behavior
TPC-DS Q23 should finish successfully.
Additional context
No response
I have filed https://github.com/apache/arrow-java/pull/705 to fix this.
Apache Arrow Java has made a new release with the fix: https://github.com/apache/arrow-java/releases/tag/v18.3.0.
We can bump the version of arrow-java to close this issue.