Q23 fails when running TPC-DS SF=1 because of invalid offset buffer being exported for empty StringArray.

Open Kontinuation opened this issue 8 months ago • 1 comments

Describe the bug

Running TPC-DS SF=1 using queries-spark/q23.sql in datafusion-benchmarks started failing after https://github.com/apache/datafusion-comet/pull/1605 was merged. The exception is raised by the native side:

org.apache.comet.CometNativeException: range end index 18446744072743568078 out of range for slice of length 0
        at comet::errors::init::{{closure}}(/home/wherobots/datafusion-comet/native/core/src/errors.rs:151)
        at <alloc::boxed::Box<F,A> as core::ops::function::Fn<Args>>::call(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/alloc/src/boxed.rs:2007)
        at std::panicking::rust_panic_with_hook(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/std/src/panicking.rs:836)
        at std::panicking::begin_panic_handler::{{closure}}(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/std/src/panicking.rs:701)
        at std::sys::backtrace::__rust_end_short_backtrace(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/std/src/sys/backtrace.rs:168)
        at rust_begin_unwind(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/std/src/panicking.rs:692)
        at core::panicking::panic_fmt(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/core/src/panicking.rs:75)
        at core::slice::index::slice_end_index_len_fail::do_panic::runtime(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/core/src/panic.rs:218)
        at core::slice::index::slice_end_index_len_fail::do_panic(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/core/src/intrinsics/mod.rs:3869)
        at core::slice::index::slice_end_index_len_fail(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/core/src/panic.rs:223)
        at <core::ops::range::Range<usize> as core::slice::index::SliceIndex<[T]>>::index(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/core/src/slice/index.rs:437)
        at core::slice::index::<impl core::ops::index::Index<I> for [T]>::index(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/core/src/slice/index.rs:16)
        at arrow_data::transform::variable_size::extend_offset_values(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/arrow-data-54.2.1/src/transform/variable_size.rs:38)
        at arrow_data::transform::variable_size::build_extend::{{closure}}(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/arrow-data-54.2.1/src/transform/variable_size.rs:57)
        at <alloc::boxed::Box<F,A> as core::ops::function::Fn<Args>>::call(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/alloc/src/boxed.rs:2007)
        at arrow_data::transform::MutableArrayData::extend(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/arrow-data-54.2.1/src/transform/mod.rs:722)
        at comet::execution::operators::copy::copy_array(/home/wherobots/datafusion-comet/native/core/src/execution/operators/copy.rs:233)
        at comet::execution::operators::copy::copy_or_unpack_array(/home/wherobots/datafusion-comet/native/core/src/execution/operators/copy.rs:280)
        at comet::execution::operators::copy::CopyStream::copy::{{closure}}(/home/wherobots/datafusion-comet/native/core/src/execution/operators/copy.rs:196)
        at core::iter::adapters::map::map_try_fold::{{closure}}(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/core/src/iter/adapters/map.rs:95)
        at core::iter::traits::iterator::Iterator::try_fold(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/core/src/iter/traits/iterator.rs:2370)
        at <core::iter::adapters::map::Map<I,F> as core::iter::traits::iterator::Iterator>::try_fold(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/core/src/iter/adapters/map.rs:121)
        at <core::iter::adapters::GenericShunt<I,R> as core::iter::traits::iterator::Iterator>::try_fold(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/core/src/iter/adapters/mod.rs:191)
        at core::iter::traits::iterator::Iterator::try_for_each(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/core/src/iter/traits/iterator.rs:2431)
        at <core::iter::adapters::GenericShunt<I,R> as core::iter::traits::iterator::Iterator>::next(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/core/src/iter/adapters/mod.rs:174)
        at alloc::vec::Vec<T,A>::extend_desugared(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/alloc/src/vec/mod.rs:3535)
        at <alloc::vec::Vec<T,A> as alloc::vec::spec_extend::SpecExtend<T,I>>::spec_extend(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/alloc/src/vec/spec_extend.rs:19)
        at <alloc::vec::Vec<T> as alloc::vec::spec_from_iter_nested::SpecFromIterNested<T,I>>::from_iter(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/alloc/src/vec/spec_from_iter_nested.rs:42)
        at <alloc::vec::Vec<T> as alloc::vec::spec_from_iter::SpecFromIter<T,I>>::from_iter(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/alloc/src/vec/spec_from_iter.rs:34)
        at <alloc::vec::Vec<T> as core::iter::traits::collect::FromIterator<T>>::from_iter(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/alloc/src/vec/mod.rs:3427)
        at core::iter::traits::iterator::Iterator::collect(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/core/src/iter/traits/iterator.rs:1971)
        at <core::result::Result<V,E> as core::iter::traits::collect::FromIterator<core::result::Result<A,E>>>::from_iter::{{closure}}(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/core/src/result.rs:1985)
        at core::iter::adapters::try_process(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/core/src/iter/adapters/mod.rs:160)
        at <core::result::Result<V,E> as core::iter::traits::collect::FromIterator<core::result::Result<A,E>>>::from_iter(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/core/src/result.rs:1985)
        at core::iter::traits::iterator::Iterator::collect(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/core/src/iter/traits/iterator.rs:1971)
        at comet::execution::operators::copy::CopyStream::copy(/home/wherobots/datafusion-comet/native/core/src/execution/operators/copy.rs:193)
        at <comet::execution::operators::copy::CopyStream as futures_core::stream::Stream>::poll_next::{{closure}}(/home/wherobots/datafusion-comet/native/core/src/execution/operators/copy.rs:214)
        at core::task::poll::Poll<T>::map(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/core/src/task/poll.rs:54)
        at <comet::execution::operators::copy::CopyStream as futures_core::stream::Stream>::poll_next(/home/wherobots/datafusion-comet/native/core/src/execution/operators/copy.rs:213)
        at <core::pin::Pin<P> as futures_core::stream::Stream>::poll_next(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-core-0.3.31/src/stream.rs:130)
        at <S as futures_core::stream::TryStream>::try_poll_next(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-core-0.3.31/src/stream.rs:206)
        at <futures_util::stream::try_stream::try_fold::TryFold<St,Fut,T,F> as core::future::future::Future>::poll(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-util-0.3.31/src/stream/try_stream/try_fold.rs:81)
        at datafusion_physical_plan::joins::hash_join::collect_left_input::{{closure}}(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/datafusion-physical-plan-46.0.0/src/joins/hash_join.rs:960)
        at <futures_util::future::future::map::Map<Fut,F> as core::future::future::Future>::poll(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-util-0.3.31/src/future/future/map.rs:55)
        at <futures_util::future::future::Map<Fut,F> as core::future::future::Future>::poll(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-util-0.3.31/src/lib.rs:86)
        at <core::pin::Pin<P> as core::future::future::Future>::poll(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/core/src/future/future.rs:124)
        at <futures_util::future::future::shared::Shared<Fut> as core::future::future::Future>::poll(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-util-0.3.31/src/future/future/shared.rs:322)
        at futures_util::future::future::FutureExt::poll_unpin(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-util-0.3.31/src/future/future/mod.rs:558)
        at datafusion_physical_plan::joins::utils::OnceFut<T>::get_shared(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/datafusion-physical-plan-46.0.0/src/joins/utils.rs:1091)
        at datafusion_physical_plan::joins::hash_join::HashJoinStream::collect_build_side(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/datafusion-physical-plan-46.0.0/src/joins/hash_join.rs:1406)
        at datafusion_physical_plan::joins::hash_join::HashJoinStream::poll_next_impl(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/datafusion-physical-plan-46.0.0/src/joins/hash_join.rs:1381)
        at <datafusion_physical_plan::joins::hash_join::HashJoinStream as futures_core::stream::Stream>::poll_next(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/datafusion-physical-plan-46.0.0/src/joins/hash_join.rs:1628)
        at <core::pin::Pin<P> as futures_core::stream::Stream>::poll_next(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-core-0.3.31/src/stream.rs:130)
        at futures_util::stream::stream::StreamExt::poll_next_unpin(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-util-0.3.31/src/stream/stream/mod.rs:1638)
        at <datafusion_physical_plan::projection::ProjectionStream as futures_core::stream::Stream>::poll_next(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/datafusion-physical-plan-46.0.0/src/projection.rs:354)
        at <core::pin::Pin<P> as futures_core::stream::Stream>::poll_next(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-core-0.3.31/src/stream.rs:130)
        at futures_util::stream::stream::StreamExt::poll_next_unpin(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-util-0.3.31/src/stream/stream/mod.rs:1638)
        at <datafusion_physical_plan::projection::ProjectionStream as futures_core::stream::Stream>::poll_next(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/datafusion-physical-plan-46.0.0/src/projection.rs:354)
        at <core::pin::Pin<P> as futures_core::stream::Stream>::poll_next(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-core-0.3.31/src/stream.rs:130)
        at futures_util::stream::stream::StreamExt::poll_next_unpin(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-util-0.3.31/src/stream/stream/mod.rs:1638)
        at <comet::execution::operators::copy::CopyStream as futures_core::stream::Stream>::poll_next(/home/wherobots/datafusion-comet/native/core/src/execution/operators/copy.rs:213)
        at <core::pin::Pin<P> as futures_core::stream::Stream>::poll_next(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-core-0.3.31/src/stream.rs:130)
        at futures_util::stream::stream::StreamExt::poll_next_unpin(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-util-0.3.31/src/stream/stream/mod.rs:1638)
        at datafusion_physical_plan::joins::hash_join::HashJoinStream::fetch_probe_batch(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/datafusion-physical-plan-46.0.0/src/joins/hash_join.rs:1427)
        at datafusion_physical_plan::joins::hash_join::HashJoinStream::poll_next_impl(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/datafusion-physical-plan-46.0.0/src/joins/hash_join.rs:1384)
        at <datafusion_physical_plan::joins::hash_join::HashJoinStream as futures_core::stream::Stream>::poll_next(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/datafusion-physical-plan-46.0.0/src/joins/hash_join.rs:1628)
        at <core::pin::Pin<P> as futures_core::stream::Stream>::poll_next(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-core-0.3.31/src/stream.rs:130)
        at futures_util::stream::stream::StreamExt::poll_next_unpin(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-util-0.3.31/src/stream/stream/mod.rs:1638)
        at <datafusion_physical_plan::projection::ProjectionStream as futures_core::stream::Stream>::poll_next(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/datafusion-physical-plan-46.0.0/src/projection.rs:354)
        at <core::pin::Pin<P> as futures_core::stream::Stream>::poll_next(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-core-0.3.31/src/stream.rs:130)
        at futures_util::stream::stream::StreamExt::poll_next_unpin(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-util-0.3.31/src/stream/stream/mod.rs:1638)
        at <datafusion_physical_plan::projection::ProjectionStream as futures_core::stream::Stream>::poll_next(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/datafusion-physical-plan-46.0.0/src/projection.rs:354)
        at <core::pin::Pin<P> as futures_core::stream::Stream>::poll_next(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-core-0.3.31/src/stream.rs:130)
        at futures_util::stream::stream::StreamExt::poll_next_unpin(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-util-0.3.31/src/stream/stream/mod.rs:1638)
        at <datafusion_physical_plan::aggregates::row_hash::GroupedHashAggregateStream as futures_core::stream::Stream>::poll_next(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/datafusion-physical-plan-46.0.0/src/aggregates/row_hash.rs:647)
        at <core::pin::Pin<P> as futures_core::stream::Stream>::poll_next(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-core-0.3.31/src/stream.rs:130)
        at futures_util::stream::stream::StreamExt::poll_next_unpin(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-util-0.3.31/src/stream/stream/mod.rs:1638)
        at <futures_util::stream::stream::next::Next<St> as core::future::future::Future>::poll(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-util-0.3.31/src/stream/stream/next.rs:32)
        at futures_util::future::future::FutureExt::poll_unpin(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-util-0.3.31/src/future/future/mod.rs:558)
        at <futures_util::async_await::poll::PollOnce<F> as core::future::future::Future>::poll(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-util-0.3.31/src/async_await/poll.rs:37)
        at comet::execution::jni_api::Java_org_apache_comet_Native_executePlan::{{closure}}::{{closure}}(/home/wherobots/datafusion-comet/native/core/src/execution/jni_api.rs:544)
        at tokio::runtime::park::CachedParkThread::block_on::{{closure}}(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.44.1/src/runtime/park.rs:284)
        at tokio::task::coop::with_budget(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.44.1/src/task/coop/mod.rs:167)
        at tokio::task::coop::budget(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.44.1/src/task/coop/mod.rs:133)
        at tokio::runtime::park::CachedParkThread::block_on(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.44.1/src/runtime/park.rs:284)
        at tokio::runtime::context::blocking::BlockingRegionGuard::block_on(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.44.1/src/runtime/context/blocking.rs:66)
        at tokio::runtime::scheduler::multi_thread::MultiThread::block_on::{{closure}}(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.44.1/src/runtime/scheduler/multi_thread/mod.rs:87)
        at tokio::runtime::context::runtime::enter_runtime(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.44.1/src/runtime/context/runtime.rs:65)
        at tokio::runtime::scheduler::multi_thread::MultiThread::block_on(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.44.1/src/runtime/scheduler/multi_thread/mod.rs:86)
        at tokio::runtime::runtime::Runtime::block_on_inner(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.44.1/src/runtime/runtime.rs:370)
        at tokio::runtime::runtime::Runtime::block_on(/home/wherobots/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.44.1/src/runtime/runtime.rs:342)
        at comet::execution::jni_api::Java_org_apache_comet_Native_executePlan::{{closure}}(/home/wherobots/datafusion-comet/native/core/src/execution/jni_api.rs:544)
        at comet::errors::curry::{{closure}}(/home/wherobots/datafusion-comet/native/core/src/errors.rs:485)
        at std::panicking::try::do_call(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/std/src/panicking.rs:584)
        at __rust_try(__internal__:0)
        at std::panicking::try(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/std/src/panicking.rs:547)
        at std::panic::catch_unwind(/rustc/4eb161250e340c8f48f66e2b929ef4a5bed7c181/library/std/src/panic.rs:358)
        at comet::errors::try_unwrap_or_throw(/home/wherobots/datafusion-comet/native/core/src/errors.rs:499)
        at Java_org_apache_comet_Native_executePlan(/home/wherobots/datafusion-comet/native/core/src/execution/jni_api.rs:498)
        at <unknown>(__internal__:0)
        at org.apache.comet.Native.executePlan(Native Method)
        at org.apache.comet.CometExecIterator.$anonfun$getNextBatch$1(CometExecIterator.scala:137)
        at org.apache.comet.CometExecIterator.$anonfun$getNextBatch$1$adapted(CometExecIterator.scala:135)
        at org.apache.comet.vector.NativeUtil.getNextBatch(NativeUtil.scala:157)
        at org.apache.comet.CometExecIterator.getNextBatch(CometExecIterator.scala:135)
        at org.apache.comet.CometExecIterator.hasNext(CometExecIterator.scala:156)
        at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
        at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
        at org.apache.comet.CometBatchIterator.hasNext(CometBatchIterator.java:50)
        at org.apache.comet.Native.executePlan(Native Method)
        at org.apache.comet.CometExecIterator.$anonfun$getNextBatch$1(CometExecIterator.scala:137)
        at org.apache.comet.CometExecIterator.$anonfun$getNextBatch$1$adapted(CometExecIterator.scala:135)
        at org.apache.comet.vector.NativeUtil.getNextBatch(NativeUtil.scala:157)
        at org.apache.comet.CometExecIterator.getNextBatch(CometExecIterator.scala:135)
        at org.apache.comet.CometExecIterator.hasNext(CometExecIterator.scala:156)
        at org.apache.spark.sql.comet.execution.shuffle.CometNativeShuffleWriter.write(CometNativeShuffleWriter.scala:101)
        at org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:59)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:104)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54)
        at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:166)
        at org.apache.spark.scheduler.Task.run(Task.scala:141)
        at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:620)
        at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
        at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:94)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:623)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
        at java.base/java.lang.Thread.run(Thread.java:1589)

This is caused by auto-broadcasting the smaller side of the join, which contains empty record batches. The empty StringArrays in those batches were not exported correctly through the Arrow C Data Interface. The very large value 18446744072743568078 in the error message is the first value in the offset buffer. It should be 0 when the array is empty (see the Arrow Columnar Format spec for details), but it turned out to be garbage data.
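As a quick sanity check on that number, here is a minimal sketch in Python (not Comet or arrow-rs code). It assumes the offset is stored as a signed 32-bit integer, as StringArray offsets are, and that the native side widens it to an unsigned 64-bit index before slicing. Under that assumption the reported value is just a negative 32-bit offset:

```python
import struct

# Value reported in the panic message (an index on a 64-bit target).
reported = 18446744072743568078

# Reinterpret the low 32 bits as a signed 32-bit integer,
# which is the offset type of a StringArray.
low32 = reported & 0xFFFFFFFF
(as_i32,) = struct.unpack("<i", struct.pack("<I", low32))

# Sign-extending that negative i32 to 64 bits and reading the result
# as unsigned reproduces the reported value.
round_trip = as_i32 % (1 << 64)

print(as_i32)                  # -965983538
print(round_trip == reported)  # True
```

A negative first offset can never be valid, which fits the explanation that the buffer contents are leftover memory rather than real offsets.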

There were earlier efforts to fix problems with exporting empty variable-sized binary arrays: https://github.com/apache/arrow/issues/40038 and the corresponding PR https://github.com/apache/arrow/pull/40043, which exports a non-null offset buffer for empty arrays. However, that fix still has one problem: the newly allocated offset buffer is not initialized, so it holds whatever garbage was in that memory, and this is what produces the failure here.
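To make the difference concrete, here is a rough sketch of the invariant an importer relies on, written in Python for illustration only. The helper `offsets_are_valid` is hypothetical and is not part of Arrow Java or arrow-rs. It assumes int32 offsets and the usual rules for a variable-size binary array of `length` elements: `length + 1` offsets, the first one non-negative, non-decreasing, and the last one no larger than the values buffer.

```python
import struct


def offsets_are_valid(offset_buffer: bytes, length: int, values_len: int) -> bool:
    """Check the int32 offsets of a variable-size binary array (hypothetical helper)."""
    count = length + 1  # an array of `length` elements needs length + 1 offsets
    if len(offset_buffer) < 4 * count:
        return False
    offsets = struct.unpack(f"<{count}i", offset_buffer[: 4 * count])
    if offsets[0] < 0 or offsets[-1] > values_len:
        return False
    # Offsets must never decrease from one element to the next.
    return all(a <= b for a, b in zip(offsets, offsets[1:]))


# An empty array still carries one offset; zero-initialized memory is valid.
zeroed = bytes(4)
# Example of uninitialized contents: a negative 32-bit pattern consistent
# with the index reported in the panic message.
garbage = struct.pack("<i", -965983538)

print(offsets_are_valid(zeroed, 0, 0))   # True
print(offsets_are_valid(garbage, 0, 0))  # False
```

Zero-filling the freshly allocated offset buffer on export is enough to satisfy this check for empty arrays.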

This problem cannot be reproduced on recent versions of macOS, because macOS fills freed memory blocks with 0, which is naturally the valid initial value for the offset buffer and covers up the problem.

Steps to reproduce

Run TPC-DS benchmark on Linux using https://github.com/apache/datafusion-benchmarks:

spark-submit \
    --master local[8] \
    --conf spark.driver.memory=3g \
    --conf spark.memory.offHeap.enabled=true \
    --conf spark.memory.offHeap.size=16g \
    --conf spark.jars=$COMET_JAR_PR \
    --conf spark.driver.extraClassPath=$COMET_JAR_PR \
    --conf spark.executor.extraClassPath=$COMET_JAR_PR \
    --conf spark.sql.extensions=org.apache.comet.CometSparkSessionExtensions \
    --conf spark.shuffle.manager=org.apache.spark.sql.comet.execution.shuffle.CometShuffleManager \
    --conf spark.comet.enabled=true \
    --conf spark.comet.exec.shuffle.enabled=true \
    --conf spark.comet.exec.shuffle.mode=auto \
    --conf spark.comet.exec.shuffle.compression.codec=lz4 \
    --conf spark.comet.exec.replaceSortMergeJoin=false \
    --conf spark.comet.exec.sortMergeJoinWithJoinFilter.enabled=false \
    --conf spark.comet.cast.allowIncompatible=true \
    --conf spark.comet.exec.shuffle.fallbackToColumnar=true \
    tpcbench.py \
    --benchmark tpcds \
    --data $TPCDS_DATA \
    --queries ../../tpcds/queries-spark \
    --output tpc-results

It will fail at the second query in Q23.

Expected behavior

TPC-DS Q23 should finish successfully.

Additional context

No response

Apr 07 '25 04:04 Kontinuation

I have filed https://github.com/apache/arrow-java/pull/705 to fix this.

Apr 07 '25 08:04 Kontinuation

Apache Arrow Java has made a new release with the fix: https://github.com/apache/arrow-java/releases/tag/v18.3.0.

We can bump the version of arrow-java to close this issue.

May 22 '25 00:05 Kontinuation