`libgfortran.so` not linked and causing compilation error
Thank you very much for the great work with this package.
I'm trying to build this package on a CentOS 7 machine with gcc v11. I have an OpenBLAS installation compiled from source and a corresponding shared library whose directory is passed in `OWL_CFLAGS`, which I have changed and provided below (with some of the directories replaced with `...` for space).
However, I'm receiving the following error:
#=== ERROR while compiling owl.1.1 ============================================#
# context 2.1.3 | linux/x86_64 | ocaml-option-flambda.1 ocaml-option-fp.1 ocaml-option-nnp.1 ocaml-variants.4.14.0+options | file:///.../opam-repository#25bba47d
# path .../.opam-switch/build/owl.1.1
# command .../bin/dune build -p owl -j 79
# exit-code 1
# env-file .../log/owl-16623-12e153.env
# output-file .../log/owl-16623-12e153.out
### output ###
# [...]
# -> stdout:
# -> stderr:
# | .../bin/ld: warning: libgfortran.so.5, needed by .../libopenblas.so, not found (try using -rpath or -rpath-link)
# | .../libopenblas.so: undefined reference to `_gfortran_etime@GFORTRAN_8'
# | .../libopenblas.so: undefined reference to `_gfortran_concat_string@GFORTRAN_8'
# | collect2: error: ld returned 1 exit status
# Fatal error: exception Failure("Unable to link against openblas.")
# Raised at Stdlib.failwith in file "stdlib.ml", line 29, characters 17-33
# Called from Dune__exe__Configure.(fun) in file "src/owl/config/configure.ml", line 223, characters 8-51
# Called from Configurator__V1.main in file "otherlibs/configurator/src/v1.ml", line 734, characters 4-7
# Re-raised at Configurator__V1.main in file "otherlibs/configurator/src/v1.ml", line 742, characters 11-42
# Called from Dune__exe__Configure in file "src/owl/config/configure.ml", line 186, characters 2-1023
Here are the `OWL_CFLAGS`:
OWL_CFLAGS='-g -O3 -Ofast -mfpmath=sse -funroll-loops -ffast-math -DSFMT_MEXP=19937 -msse2 -fno-strict-aliasing -Wno-tautological-constant-out-of-range-compare -pipe -Wl,-z,relro -gdwarf-4 -gstrict-dwarf -march=core-avx2 -mtune=skylake -nostdinc -fdebug-prefix-map=.../supercaml -isystem/.../include -fPIC -L/.../OpenBLAS/0.3.20/.../lib -Wl,-rpath=/.../OpenBLAS/0.3.20/.../lib -L/h.../libgcc/11.x/.../lib -Wl,-rpath=/.../libgcc/11.x/.../lib -L/.../glibc/2.34/.../lib -Wl,-rpath-link,/.../glibc/2.34/.../lib -Wl,--hash-style,gnu -Wl,--dynamic-linker=/usr/local/.../lib/ld-linux-x86-64.so.2'
FWIW, I know that the directory containing `libgfortran.so.5` is referenced by the `-L` and `-Wl,-rpath=` flags containing `libgcc` in my `OWL_CFLAGS`. I believe I'm using the same `gcc` and `libgfortran` that were used to build OpenBLAS.
Admittedly, I'm not an expert in properly linking libraries with `gcc`; however, it seems that this might be remedied by passing `-lgfortran` as a library, the same way `-lopenblas` is passed to `gcc`.
If there's any chance you happen to see an obvious problem in this setup, I'd really appreciate it. Otherwise, perhaps it'd be worth considering adding another environment variable that would add libraries to the `gcc` call the way `-lopenblas` is added. (I tried adding `-lgfortran` as the final flag in `OWL_CFLAGS`, but it didn't solve the error; however, I believe that might just be because of its position in the `gcc` call.)
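To make the ordering concern concrete, here is a minimal sketch of where the `-l` flags would ideally end up. It is not owl's actual build code: `link_cmd`, `test.c`, `test_link`, and the `/opt/example/lib` path are all hypothetical placeholders, and the function only prints the command instead of running it.

```shell
# Hypothetical helper: print (not run) a gcc command with the -l flags last,
# after the search-path flags and the source file that references the libraries.
link_cmd() {
  src=$1
  shift
  # "$*" joins the remaining arguments (the -l flags) with single spaces.
  echo "gcc $OWL_CFLAGS $src -o test_link $*"
}

OWL_CFLAGS='-L/opt/example/lib'   # placeholder path, not from the setup above
link_cmd test.c -lopenblas -lgfortran
# prints: gcc -L/opt/example/lib test.c -o test_link -lopenblas -lgfortran
```

The point is only that `-lgfortran` follows `-lopenblas`, which follows the source file, whereas appending it to `OWL_CFLAGS` places it before both.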
I'd try to change this myself and see if it worked, but the build system I'm working on is strange and I can't easily do so. I also think there's a reasonable chance that I'm doing something obviously wrong with the above setup.
FYI, we also have a static library for OpenBLAS (using `-Bstatic` on the relevant `-Wl` flag in `OWL_CFLAGS`), and I've tried linking to that instead of the shared library; however, that resulted in `conf-openblas` (whose `opam` file I modified to use `$OWL_CFLAGS` instead of `$CFLAGS` in the `build` section) failing with different errors (`conf-openblas` would still try to use the shared library object, but the failure would be related to another library).
I suspect this issue is related to the location of the flags in the call to `gcc`, the lack of `-lgfortran` as a lib, and/or the inability to use `-l:libopenblas.a` instead of `-lopenblas`. Perhaps we could add an `OWL_LDLIBS` environment variable which would default to `-lopenblas` or perhaps `-lm -lopenblas`?
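A minimal sketch of the defaulting behaviour I have in mind, assuming the variable is read from the environment. `OWL_LDLIBS` is only the proposed name and does not exist in owl today, and `owl_ldlibs` is a hypothetical helper.

```shell
# Hypothetical helper: resolve the library flags for the link step.
# OWL_LDLIBS is the proposed variable (an assumption, not an existing owl option).
owl_ldlibs() {
  # ${VAR:-default} expands to the default when VAR is unset or empty;
  # here the default itself happens to start with a dash: -lopenblas.
  printf '%s\n' "${OWL_LDLIBS:--lopenblas}"
}

unset OWL_LDLIBS
owl_ldlibs                               # prints: -lopenblas

OWL_LDLIBS='-lopenblas -lgfortran'
owl_ldlibs                               # prints: -lopenblas -lgfortran
```

With something like this, a setup such as mine could set the variable once instead of patching the build.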
I can fork this repo and try it out, unless you see something obviously wrong with the previous implementation.
For completeness, here is my `PKG_CONFIG_PATH`. Note that `OpenBLAS` is present.
PKG_CONFIG_PATH=/.../pcre/8.43/.../lib/pkgconfig:/.../libev/.../lib/pkgconfig:/.../OpenBLAS/0.3.20/.../lib/pkgconfig:/.../gmp/6.1.2/.../lib/pkgconfig:/.../libjpeg/.../lib/pkgconfig:/.../libpng/1.6.37/.../lib/pkgconfig:/.../re2/20190601/.../lib/pkgconfig:/.../sqlite/3.36/.../lib/pkgconfig:/.../zlib/1.2.8/.../lib/pkgconfig:/.../readline/8.0/.../lib/pkgconfig:/.../ncurses/6.1/.../lib/pkgconfig
If I run `pkg-config` in the build environment, I get the following.
+ pkg-config --cflags openblas
-I/.../OpenBLAS/0.3.20/.../include
+ pkg-config --libs openblas
-L/.../OpenBLAS/0.3.20/.../lib -lopenblas
So I believe OpenBLAS and `pkg-config` are interacting appropriately.
FWIW, I removed OpenBLAS from `PKG_CONFIG_PATH` to see if it was interfering with the other `OWL_CFLAGS`; it failed with the same error.
It's probably worthwhile to add an `OWL_LDFLAGS` too, so we don't have to jam everything into something named `LDLIBS`. Perhaps just have `OWL_` versions of `CPP_FLAGS`, `LDFLAGS`, and `LDLIBS`, each placed in the right location.
Actually, the error might just be coming from this test, where `OWL_CFLAGS` are ignored. Perhaps this should be moved after `cflags` are set? Presumably we want to run the test with the same parameters that will be used for the build.
I'm thinking this error might be coming up because the build script has a `set -e` in it. I believe `C.c_test` should just report whether the compilation succeeded or failed, not fail itself (right?). Perhaps the `set -e` is killing the process.
Disregard that; it's failing from this test, which kills the build.
This is solved with a patch akin to #636