incubator-gluten
Crash when writing an array of struct
Backend
VL (Velox)
Bug description
This write operation crashes:
spark.sql("select array(struct(1), null)").write.mode("overwrite").save("X")
Spark version
Spark-3.3.x
Spark configurations
spark.plugins=io.glutenproject.GlutenPlugin
spark.gluten.sql.columnar.backend.lib=velox
spark.memory.offHeap.enabled=true
spark.memory.offHeap.size=28g
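For anyone trying to reproduce this, the settings above can be passed to the launcher as `--conf key=value` flags. The snippet below is only a sketch that renders such a command line. The helper name `spark_shell_command` is made up for illustration, and it assumes the Gluten jar is already on the classpath, which the report does not show.

```python
import shlex

# Settings copied verbatim from the "Spark configurations" field of this report.
GLUTEN_CONF = {
    "spark.plugins": "io.glutenproject.GlutenPlugin",
    "spark.gluten.sql.columnar.backend.lib": "velox",
    "spark.memory.offHeap.enabled": "true",
    "spark.memory.offHeap.size": "28g",
}


def spark_shell_command(conf, binary="spark-shell"):
    """Render a launcher command line with one --conf flag per setting.

    Hypothetical helper: it only builds the string, it does not start Spark
    and does not add the Gluten jar to the classpath.
    """
    parts = [binary]
    for key, value in conf.items():
        # shlex.quote leaves plain key=value pairs untouched and quotes
        # anything that would otherwise be split by the shell.
        parts += ["--conf", shlex.quote(f"{key}={value}")]
    return " ".join(parts)


if __name__ == "__main__":
    print(spark_shell_command(GLUTEN_CONF))
```

Running the printed command and then executing the `spark.sql(...)` line from the bug description should exercise the same code path, provided the build matches the commit listed under System information.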
System information
Velox System Info v0.0.2
Commit: 58a459bf487120208a774d7959f7c7db417f490b
CMake Version: 3.25.1
System: Linux-6.5.0-1015-azure
Arch: x86_64
C++ Compiler: /usr/bin/c++
C++ Compiler Version: 11.4.0
C Compiler: /usr/bin/cc
C Compiler Version: 11.4.0
CMake Prefix Path: /usr/local;/usr;/;/ssd/linuxbrew/.linuxbrew/Cellar/cmake/3.25.1;/usr/local;/usr/X11R6;/usr/pkg;/opt
Relevant logs
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00007fb40369e2c5, pid=281178, tid=0x00007fc434d18640
#
# JRE version: OpenJDK Runtime Environment (8.0_392-b08) (build 1.8.0_392-8u392-ga-1~22.04-b08)
# Java VM: OpenJDK 64-Bit Server VM (25.392-b08 mixed mode linux-amd64 )
# Problematic frame:
# C [libvelox.so+0x229e2c5] (anonymous namespace)::makeRowVector(std::vector<std::shared_ptr<facebook::velox::BaseVector>, std::allocator<std::shared_ptr<facebook::velox::BaseVector> > > const&)+0xa5
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /ssd/chungmin/repos/spark_3.3/hs_err_pid281178.log
#
# If you would like to submit a bug report, please visit:
# http://bugreport.java.com/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#
It seems the write of a complex data type does not fall back correctly here. See https://github.com/apache/incubator-gluten/issues/4110
CC: @JkSelf
@clee704 @zhouyuan The Velox backend lacks support for writing complex types. Spark 3.4 already has a fallback mechanism for this. A PR (https://github.com/apache/incubator-gluten/pull/5107) has been filed to extend the fallback to Spark 3.2 and 3.3.
@JkSelf Actually it crashes on Spark 3.4 too. Please note that it happens without the native writer enabled.
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00007fa4a369e2c5, pid=915090, tid=915112
#
# JRE version: OpenJDK Runtime Environment (11.0.22+7) (build 11.0.22+7-post-Ubuntu-0ubuntu222.04.1)
# Java VM: OpenJDK 64-Bit Server VM (11.0.22+7-post-Ubuntu-0ubuntu222.04.1, mixed mode, tiered, g1 gc, linux-amd64)
# Problematic frame:
# C [libvelox.so+0x229e2c5] (anonymous namespace)::makeRowVector(std::vector<std::shared_ptr<facebook::velox::BaseVector>, std::allocator<std::shared_ptr<facebook::velox::BaseVector> > > const&)+0xa5
#
# Core dump will be written. Default location: Core dumps may be processed with "/usr/share/apport/apport -p%p -s%s -c%c -d%d -P%P -u%u -g%g -- %E" (or dumping to /ssd/chungmin/repos/spark/core.915090)
#
# An error report file with more information is saved as:
# /ssd/chungmin/repos/spark/hs_err_pid915090.log
#
# If you would like to submit a bug report, please visit:
# https://bugs.launchpad.net/ubuntu/+source/openjdk-lts
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#
Spark 3.4.2, Gluten commit 58a459bf487120208a774d7959f7c7db417f490b
@clee704 Sorry for the delayed response. It seems this issue has been fixed here. Could you help verify it in your environment?
Confirmed it's fixed in the latest commit. Thanks!