gazelle_plugin
[V1.4.0] gazelle plugin crash
Describe the bug
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGILL (0x4) at pc=0x00007f2687f73fa4, pid=27656, tid=0x00007f269e6f9700
#
# JRE version: Java(TM) SE Runtime Environment (8.0_201-b09) (build 1.8.0_201-b09)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.201-b09 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C [libspark_columnar_jni.so+0x3a5fa4] exprs::IfNode::InitAsDefaultInstance()+0x24
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /tmp/hadoop-manojk/nm-local-dir/usercache/manojk/appcache/application_1658472520631_0009/container_1658472520631_0009_01_000003/hs_err_pid27656.log
#
# If you would like to submit a bug report, please visit:
# http://bugreport.java.com/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#
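The crash log itself points at the first debugging step: core dumps were disabled, so only the hs_err report was written. A minimal sketch for capturing a native core on the next run (where to set the limit depends on how YARN launches the container):

# Allow core files before the JVM starts, as the log suggests
ulimit -c unlimited
# The error report named in the log is still the quickest artifact to inspect
less hs_err_pid27656.log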
FYI @PHILO-HE
Hi @Manoj-red-hat, I guess it is related to protobuf. You can try to build the project again with -Dbuild_protobuf=ON to make sure a consistent protobuf version is used. If the issue still exists, please let me know the SQL statement for reproducing it.
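For example (the hadoop-3.2/spark-3.2 profiles here are assumptions; adjust them to your Spark/Hadoop versions):

mvn clean package -Phadoop-3.2,spark-3.2 -DskipTests -Dcheckstyle.skip -Dbuild_protobuf=ON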
I am trying the above with https://github.com/oap-project/gazelle_plugin/releases/download/v1.4.0/gazelle-plugin-1.4.0-spark-3.2.1.jar, and it is still crashing.
please let me know the SQL statement for reproducing it.
Even a simple SELECT statement is crashing.
As per your suggestion, I am now trying to build the jar in my local environment; I will update you further.
@weiting-chen, please help follow up. I guess we can provide a conda env package to help users quickly deploy Gazelle for evaluation.
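Something along these lines (the channel and package names are assumptions modeled on other OAP components; check the OAP installation guide for the exact names):

# Hypothetical conda-based deployment of Gazelle for evaluation
conda create -n oapenv -c conda-forge -c intel -y oap=1.4.0
conda activate oapenv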
@Manoj-red-hat Could you help check whether the issue happens "during" the query or "after" the query? I would like to identify whether this is a known issue or not. Also, let me know which OS version you are running.
The release at https://github.com/oap-project/gazelle_plugin/releases/download/v1.4.0/gazelle-plugin-1.4.0-spark-3.2.1.jar is built on Ubuntu 20.04; there may be dependency library issues when using another OS such as CentOS. It is better to compile it yourself to make sure all the dependent libraries can be found on your system.
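A quick way to check that on a given machine, using standard tools (the wildcard extract is an assumption, since the library's exact path inside the jar may vary):

# Pull the native library named in the crash log out of the release jar
unzip -o gazelle-plugin-1.4.0-spark-3.2.1.jar '*libspark_columnar_jni.so'
# List any shared-library dependencies that cannot be resolved on this system
find . -name libspark_columnar_jni.so -exec ldd {} \; | grep 'not found'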
@PHILO-HE @weiting-chen
I tried the conda env, but it also did not work out.
Machine info: OS: Ubuntu 18.04.6 LTS x86_64; CPU: Intel i5-8400 (6) @ 4.000GHz
Finally, I cloned the repo and used the command below:
mvn -Phadoop-3.2,spark-3.2 package -DskipTests -Dcheckstyle.skip -Dmaven.test.skip
Now I am able to use the Gazelle plugin.
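For reference, enabling the plugin takes roughly the following Spark configuration (class and config names as I understand them from the project README; verify them against your release, and point the classpaths at the locally built jar):

# spark-defaults.conf sketch; the jar path and off-heap size are placeholders
spark.driver.extraClassPath /path/to/gazelle-plugin-1.4.0-spark-3.2.1.jar
spark.executor.extraClassPath /path/to/gazelle-plugin-1.4.0-spark-3.2.1.jar
spark.plugins com.intel.oap.GazellePlugin
spark.shuffle.manager org.apache.spark.shuffle.sort.ColumnarShuffleManager
spark.memory.offHeap.enabled true
spark.memory.offHeap.size 20g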
I am doing some debugging on TPC-H Q1.
Just wondering why we do the filtering in the conditional filter operator when it could be handled by the Parquet Arrow dataset reader; that would save us the projection time. It is only possible if the input batches come from the native Parquet reader, but if we can do that, it should give us a speedup.
Anyway, to demonstrate the above, I am creating a standalone Q1 C++ Gazelle test case.
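A minimal sketch of that idea in plain Arrow C++ (this is not Gazelle's operator code; the file layout, the projected column list, and the Q1 date parameter 1998-09-02, i.e. 10471 days since the Unix epoch, are assumptions):

#include <arrow/api.h>
#include <arrow/dataset/api.h>
#include <arrow/filesystem/api.h>

namespace ds = arrow::dataset;
namespace cp = arrow::compute;

// Scan lineitem with the TPC-H Q1 predicate pushed into the dataset reader,
// so batches are filtered while decoding Parquet instead of in a separate
// downstream filter/projection operator.
arrow::Result<std::shared_ptr<arrow::Table>> ScanLineitem(const std::string& path) {
  auto fs = std::make_shared<arrow::fs::LocalFileSystem>();
  ds::FileSystemFactoryOptions options;
  ARROW_ASSIGN_OR_RAISE(
      auto factory,
      ds::FileSystemDatasetFactory::Make(
          fs, {path}, std::make_shared<ds::ParquetFileFormat>(), options));
  ARROW_ASSIGN_OR_RAISE(auto dataset, factory->Finish());

  ARROW_ASSIGN_OR_RAISE(auto builder, dataset->NewScan());
  // l_shipdate <= date '1998-09-02'; the pushed-down filter also prunes
  // whole row groups via Parquet statistics before any decoding happens.
  ARROW_RETURN_NOT_OK(builder->Filter(cp::less_equal(
      cp::field_ref("l_shipdate"),
      cp::literal(std::make_shared<arrow::Date32Scalar>(10471)))));
  // Materialize only the columns Q1 aggregates over.
  ARROW_RETURN_NOT_OK(builder->Project(
      {"l_returnflag", "l_linestatus", "l_quantity", "l_extendedprice",
       "l_discount", "l_tax"}));
  ARROW_ASSIGN_OR_RAISE(auto scanner, builder->Finish());
  return scanner->ToTable();
}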
@Manoj-red-hat Could you help check whether the issue happens "during" the query or "after" the query? I would like to identify whether this is a known issue or not. Also, let me know which OS version you are running.
During the query
Machine info: OS: Ubuntu 18.04.6 LTS x86_64; CPU: Intel i5-8400 (6) @ 4.000GHz
Any issues when you are running TPC-H Q1 with Gazelle? I assume you have run it successfully, right?
@weiting-chen, sorry for the late reply; I got stuck in some other work. Anyway, thanks for your support, and it is indeed a great project. Now I am exploring Gazelle on the TPC-DS side.
Any issues when you are running TPC-H Q1 with Gazelle? I assume you have run it successfully, right?
Everything looks fine on TPC-H.
Now I have switched to TPC-DS and started validating that benchmark. There are a few problems there, which I am reporting in separate issues, like a wrong result in Q4. I will keep reporting if I face more.