Include GLIBC version with provided jars

Open lfdversluis opened this issue 2 months ago • 7 comments

What is the problem the feature request solves?

I have been keeping an eye on this project for a while, and now that Scala 2.13 is supported and some bugs have been fixed, I gave it a go. I tried the JARs provided online, but when running a query against Spark, I get:

java.lang.UnsatisfiedLinkError: /tmp/libcomet-17465649045345702051.so: /lib64/libm.so.6: version `GLIBC_2.27' not found (required by /tmp/libcomet-17465649045345702051.so)

We are running on RHEL 7, which has an older GLIBC. Perhaps the requirements listed at https://datafusion.apache.org/comet/user-guide/0.10/installation.html could include the minimum GLIBC version? For now, I think I'll have to build it from source.

Thanks for the nice work so far!

Describe the potential solution

No response

Additional context

No response

lfdversluis avatar Sep 30 '25 14:09 lfdversluis

I just compiled Comet in a Docker container running CentOS 7, yet I still get:

Comet extension is disabled because of error when loading native lib. Falling back to Spark java.lang.UnsatisfiedLinkError: /tmp/libcomet-3482603494165917815.so: /lib64/libm.so.6: version `GLIBC_2.27' not found (required by /tmp/libcomet-3482603494165917815.so)

Is something pulling in a different libc? The container definitely has the old glibc:

ldd --version
ldd (GNU libc) 2.17
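
To see which GLIBC symbol versions a built library actually asks for, its dynamic symbol table can be inspected. The sketch below is an illustration, not part of the Comet build: the helper names `max_glibc_version` and `version_ge` are hypothetical, it assumes GNU `grep`, `sed` and `sort -V` are available, and the library path in the usage comment is a placeholder.

```shell
#!/bin/sh
# Hypothetical helpers for illustration (not part of Comet).

# Read `objdump -T` style text on stdin and print the highest
# GLIBC_x.y symbol version mentioned in it (without the prefix).
max_glibc_version() {
  grep -o 'GLIBC_[0-9][0-9.]*' | sed 's/^GLIBC_//' | sort -V | tail -n 1
}

# Succeed when version $1 is greater than or equal to version $2.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Usage sketch (placeholder path):
#   required="$(objdump -T /path/to/libcomet.so | max_glibc_version)"
#   host="$(getconf GNU_LIBC_VERSION | awk '{print $2}')"
#   version_ge "$host" "$required" || echo "host glibc $host is older than $required"
```

If the reported requirement is still 2.27 for a library built inside the CentOS 7 container, that would suggest the library was not linked against the container's own glibc 2.17.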

lfdversluis avatar Oct 01 '25 13:10 lfdversluis

Building on an old OS should work. This is also the solution explained at https://kobzol.github.io/rust/ci/2021/05/07/building-rust-binaries-in-ci-that-work-with-older-glibc.html

martin-g avatar Oct 01 '25 13:10 martin-g

Thanks for filing the issue @lfdversluis, and thanks for the helpful link, @martin-g. I filed https://github.com/apache/datafusion-comet/issues/2511 for building the official releases with an older version of GLIBC. I will aim to do this work for the 0.11.0 release.

andygrove avatar Oct 01 '25 14:10 andygrove

@andygrove This has been discussed before: https://github.com/apache/datafusion-comet/issues/1499 https://github.com/apache/datafusion-comet/pull/932#issuecomment-2349614678

parthchandra avatar Oct 01 '25 21:10 parthchandra

@parthchandra Thanks for those two links, I missed #932, sorry! I kind of agree with not supporting EOL OSes such as RHEL / CentOS 7.

@martin-g thanks for that pointer! Yesterday I compiled the Rust code against the system libs (I hope; I don't have experience with Rust) and the jar still didn't run (same error). I'll give that one a go.

lfdversluis avatar Oct 02 '25 08:10 lfdversluis

This time Comet did run (yay) and crashed the JVM 😅

 [...]
Couldn't locate log file from either COMET_CONF_DIR or comet.log.file.path. Using default log configuration which emits to stdout
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGILL (0x4) at pc=0x00007fd166dff790, pid=12, tid=1171
#
# JRE version: OpenJDK Runtime Environment Temurin-17.0.15+6 (17.0.15+6) (build 17.0.15+6)
# Java VM: OpenJDK 64-Bit Server VM Temurin-17.0.15+6 (17.0.15+6, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64)
# Problematic frame:
# C  [libcomet-384294430278044736.so+0x38d4790]  _$LT$log4rs..encode..pattern..parser..Parser$u20$as$u20$core..iter..traits..iterator..Iterator$GT$::next::he2f3f88f45a2cff6+0xca0
#
# No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again

Neither COMET_CONF_DIR nor comet.log.file.path is documented at https://datafusion.apache.org/comet/user-guide/0.10/configs.html. I'll look into how to disable logging for now. Pointers are welcome :)

lfdversluis avatar Oct 02 '25 09:10 lfdversluis

You can set COMET_CONF_DIR to a directory containing a log4rs.yaml file. The default one is located here: https://github.com/apache/datafusion-comet/tree/main/conf
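
As a rough sketch of that setup, the snippet below creates a directory with a minimal `log4rs.yaml` and points `COMET_CONF_DIR` at it. The YAML is a generic console-only log4rs configuration written for illustration, not a copy of the default file from the repository, and the temporary directory is a placeholder for wherever the config should live.

```shell
#!/bin/sh
# Placeholder location; any directory readable by the Spark JVM would do.
conf_dir="$(mktemp -d)"

# Minimal console-only log4rs config (an assumption for illustration,
# not the project's default log4rs.yaml).
cat > "$conf_dir/log4rs.yaml" <<'EOF'
appenders:
  stdout:
    kind: console
root:
  level: info
  appenders:
    - stdout
EOF

# Make the directory visible to processes started from this shell.
export COMET_CONF_DIR="$conf_dir"
echo "COMET_CONF_DIR=$COMET_CONF_DIR"
```

The variable has to be present in the environment of the JVM that loads the native library. On a cluster, Spark's `spark.executorEnv.COMET_CONF_DIR` setting is one way to pass it to executors, assuming the directory exists on every node.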

parthchandra avatar Oct 02 '25 16:10 parthchandra