velox
Parquet reader: can't read parquet file with no column indexes
Bug description
When reading the example Parquet file test-file-with-no-column-indexes-1.parquet, the following error occurs:
Job aborted due to stage failure: Task 0 in stage 14.0 failed 1 times, most recent failure: Lost task 0.0 in stage 14.0 (TID 14) (10.0.2.142 executor driver): org.apache.gluten.exception.GlutenException: java.lang.RuntimeException: Exception: VeloxRuntimeError
Error Source: RUNTIME
Error Code: INVALID_STATE
Reason: Operator::getOutput failed for [operator: TableScan, plan node ID: 0]: vector::_M_range_check: __n (which is 1) >= this->size() (which is 1)
Retriable: False
Function: runInternal
File: ../../velox/exec/Driver.cpp
Line: 686
Stack trace:
# 0 facebook::velox::VeloxException::VeloxException(char const*, unsigned long, char const*, std::basic_string_view<char, std::char_traits<char> >, std::basic_string_view<char, std::char_traits<char> >, std::basic_string_view<char, std::char_traits<char> >, std::basic_string_view<char, std::char_traits<char> >, bool, facebook::velox::VeloxException::Type, std::basic_string_view<char, std::char_traits<char> >)
# 1 void facebook::velox::detail::veloxCheckFail<facebook::velox::VeloxRuntimeError, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&>(facebook::velox::detail::VeloxCheckFailArgs const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
# 2 facebook::velox::exec::Driver::runInternal(std::shared_ptr<facebook::velox::exec::Driver>&, std::shared_ptr<facebook::velox::exec::BlockingState>&, std::shared_ptr<facebook::velox::RowVector>&) [clone .cold]
# 3 facebook::velox::exec::Driver::next(std::shared_ptr<facebook::velox::exec::BlockingState>&)
# 4 facebook::velox::exec::Task::next(folly::SemiFuture<folly::Unit>*)
# 5 gluten::WholeStageResultIterator::next()
# 6 Java_org_apache_gluten_vectorized_ColumnarBatchOutIterator_nativeHasNext
# 7 0x00007f75ad020907
# 8 0x00007f75ad0078ef
# 9 0x00007f75ad0078ef
# 10 0x00007f75adc66c6b
at org.apache.gluten.vectorized.GeneralOutIterator.hasNext(GeneralOutIterator.java:39)
at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:45)
at org.apache.gluten.utils.InvocationFlowProtection.hasNext(Iterators.scala:135)
at org.apache.gluten.utils.IteratorCompleter.hasNext(Iterators.scala:69)
at org.apache.gluten.utils.PayloadCloser.hasNext(Iterators.scala:35)
at org.apache.gluten.utils.PipelineTimeAccumulator.hasNext(Iterators.scala:98)
at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
at scala.collection.Iterator.isEmpty(Iterator.scala:387)
at scala.collection.Iterator.isEmpty$(Iterator.scala:387)
at org.apache.spark.InterruptibleIterator.isEmpty(InterruptibleIterator.scala:28)
at org.apache.gluten.execution.VeloxColumnarToRowExec$.toRowIterator(VeloxColumnarToRowExec.scala:119)
at
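For triage: the text in the Reason line (vector::_M_range_check: __n (which is 1) >= this->size() (which is 1)) is the message libstdc++ produces when std::vector::at() is called with an out-of-range index, here index 1 on a vector holding a single element. The sketch below is independent of Velox and only reproduces that kind of failure. rangeCheckMessage is a hypothetical helper written for illustration, and the exact wording of the message depends on the standard library in use.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper, not Velox code: builds a vector with `size` elements,
// calls at(index) on it, and returns the exception text if the index is out
// of range. Returns an empty string when the access is valid.
std::string rangeCheckMessage(std::size_t size, std::size_t index) {
  std::vector<int> values(size, 0);
  try {
    (void)values.at(index);  // bounds-checked access; throws std::out_of_range
  } catch (const std::out_of_range& e) {
    return e.what();
  }
  return "";
}
```

Calling rangeCheckMessage(1, 1) mirrors the reported case (one element, index 1 requested), while rangeCheckMessage(1, 0) is a valid access and returns an empty string. This suggests, without confirming, that some code path under TableScan indexes a one-element vector at position 1 when the file carries no column indexes.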
System information
Commit: d9454d63d190da9d30cae39a4dca9ac25b0da6b7
CMake Version: 3.16.3
System: Linux-5.4.0-156-generic
Arch: x86_64
C++ Compiler: /usr/bin/c++
C++ Compiler Version: 9.4.0
C Compiler: /usr/bin/cc
C Compiler Version: 9.4.0
CMake Prefix Path: /usr/local;/usr;/;/usr;/usr/local;/usr/X11R6;/usr/pkg;/opt
Relevant logs
No response