julia_setup fails when Julia is linked to MKL?

Open chriselrod opened this issue 8 years ago • 14 comments

julia <- julia_setup()
Julia version 0.7.0-DEV.1856 found.
Julia initiation...
WARNING: Error during initialization of module LinAlg:
ErrorException("could not load library "libmkl_rt"
libmkl_rt.so: cannot open shared object file: No such file or directory")
Finish Julia initiation.
Loading setup script for JuliaCall...
UndefVarError(:include) Error in .julia$cmd(paste0("include(\"", system.file("julia/setup.jl",  : 
  Error happens when you try to execute command include("/home/celrod/R/x86_64-pc-linux-gnu-library/3.4/JuliaCall/julia/setup.jl") in Julia.
In addition: Warning messages:
1: In utils::compareVersion(x, y) : NAs introduced by coercion
2: In utils::compareVersion(x, y) : NAs introduced by coercion

A friend got the same result when julia_setup found a JuliaPro v0.6.0 install linked to MKL. On my home computer (Julia built with OpenBLAS), it worked.

I also have an OpenBLAS install on my school computer. Since the error says could not load library "libmkl_rt", it seems highly likely that MKL is related. I'll pass that install's path to julia_setup and confirm it works.

Also! Saying julia_setup is time consuming seems like an understatement. Is that likely to change? Launching a Julia REPL and recompiling both Gadfly and Plots.jl seems faster. Have you profiled the function? What is it doing?

chriselrod • Oct 02 '17 12:10

Thank you for the feedback!

Where is libmkl_rt.so? Does adding its directory to LD_LIBRARY_PATH before starting R help?

The first call to julia_setup() can be quite time-consuming, because it has to install Julia package dependencies such as RCall.jl if they are not already installed, and these dependencies need to be precompiled the first time. Subsequent startups should be faster.

julia_setup() does just these things:

1. Locate Julia and check its version. Then compile a few C functions against the R and Julia headers to embed Julia in R. Currently the compilation is done at every startup because Julia is evolving quite rapidly and its C API also changes, and I want JuliaCall to handle different Julia versions, e.g. Julia v0.5 and Julia v0.6. Once Julia reaches a stable C API, the compilation could be done only once, the first time.

2. Install the Julia dependencies if needed and precompile them. Part of the work here is duplicated in the next step due to #14577 for Julia v0.6.0, https://github.com/JuliaLang/julia/issues/14577. We could save some work once the corresponding commit is included in the next release of Julia.

3. Start the embedded Julia in R, load the Julia dependencies, and define some Julia functions for future use. There is not much I can do about this step.

So basically, I expect the julia_setup() time to drop a lot with future versions of Julia, but there is not much room for improvement currently.
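
A rough way to see how much of the startup time is first-run cost is to time the call in one fresh R session and then again in a second fresh session. The sketch below is only an illustration: elapsed_seconds and time_julia_setup are hypothetical helper names, not part of JuliaCall, and the only JuliaCall function it relies on is julia_setup().

```r
# Hypothetical helper: wall-clock seconds spent evaluating an expression.
# The expression is evaluated lazily inside system.time(), so it is timed.
elapsed_seconds <- function(expr) {
  system.time(expr)[["elapsed"]]
}

# Hypothetical wrapper: time a single julia_setup() call.
# It is only defined here, not called, because it needs JuliaCall and a
# working Julia installation.
time_julia_setup <- function() {
  elapsed_seconds(JuliaCall::julia_setup())
}
```

Calling time_julia_setup() once per fresh R session and comparing the numbers should give an idea of how much of the delay comes from installing and precompiling dependencies as opposed to the per-startup compilation and loading.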

Non-Contradiction • Oct 02 '17 15:10

Hmm. For each of 2017 and 2018 I have 32-bit and 64-bit libmkl_rt.so libraries. Trying the two 64-bit libraries one at a time, or all four together, each yields:

> library(JuliaCall)
> julia <- julia_setup()
Julia version 0.7.0-DEV.1856 found.
Julia initiation...
Finish Julia initiation.
Loading setup script for JuliaCall...
UndefVarError(:include) Error in .julia$cmd(paste0("include(\"", system.file("julia/setup.jl",  : 
  Error happens when you try to execute command include("/home/celrod/R/x86_64-pc-linux-gnu-library/3.4/JuliaCall/julia/setup.jl") in Julia.
In addition: Warning messages:
1: In utils::compareVersion(x, y) : NAs introduced by coercion
2: In utils::compareVersion(x, y) : NAs introduced by coercion

A different error. This also happens when I point it to a definitely-wrong directory with yet another libmkl_rt.so. If, however, LD_LIBRARY_PATH doesn't contain any directory with that library, I get the could not load library "libmkl_rt" error.

When I tried giving it a path pointing to the OpenBLAS version (in a fresh session), I got:

> julia <- julia_setup()
Two passes with the same argument (-juliaO0) attempted to be registered!

 *** caught segfault ***
address (nil), cause 'unknown'

Traceback:
 1: dyn.load(.julia$dll_file, FALSE, TRUE)
 2: julia_setup()

Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace

Cool, looking forward to seeing how things improve for v0.7.

chriselrod • Oct 02 '17 17:10

I just realized that the output reveals another bug in JuliaCall, and I will fix it very soon. But in my test, it seems that one of JuliaCall's dependencies, RCall.jl, doesn't work on Julia v0.7, so even if I correct the bug, JuliaCall will still not work with Julia v0.7 for now. I'll check RCall.jl again.

Non-Contradiction • Oct 02 '17 17:10

Could you try installing and testing RCall.jl on Julia v0.7?

Non-Contradiction • Oct 02 '17 17:10

I just checked out the latest master of RCall and of DataFrames. Same error:

ERROR: LoadError: LoadError: LoadError: type QuoteNode has no field args

from DataFrames as earlier. EDIT: https://travis-ci.org/JuliaData/DataFrames.jl/jobs/282139533 They got past precompiling DataFrames (although they failed the test), but I didn't. I'll have to look into that later and report back on whether that fixes RCall.

chriselrod • Oct 02 '17 17:10

One of the bugs is that I previously didn't realize a Julia version number could contain a suffix such as "-DEV.xxxx", which is not a valid R version number. After fixing that, the error

UndefVarError(:include) Error in .julia$cmd(paste0("include(\"", system.file("julia/setup.jl",  : 
Error happens when you try to execute command include("/home/celrod/R/x86_64-pc-linux-gnu-library/3.4/JuliaCall/julia/setup.jl") in Julia.

still happens.
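
To illustrate the version-number problem: utils::compareVersion() splits on "." and "-" and coerces each piece to an integer, so a piece like "DEV" becomes NA and triggers the "NAs introduced by coercion" warning seen in the log. The sketch below shows one possible way to strip the suffix before comparing. strip_julia_suffix is a hypothetical helper written for this example, not the actual fix in JuliaCall.

```r
# Hypothetical helper (not JuliaCall's actual code): drop any pre-release
# or build suffix such as "-DEV.1856" or "-alpha.63" from a version string.
strip_julia_suffix <- function(version) {
  sub("[-+].*$", "", version)
}

raw <- "0.7.0-DEV.1856"
clean <- strip_julia_suffix(raw)  # "0.7.0"

# compareVersion() returns 1 if the first version is later,
# 0 if they are equal, and -1 if it is earlier.
utils::compareVersion(clean, "0.6.0")  # 1
```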

I'm looking at the embedding documentation for Julia v0.7, and I have some clues. It seems that there are some important differences in embedding Julia between v0.6 and v0.7.

Non-Contradiction • Oct 02 '17 17:10

Hmm, okay. Maybe you could ask on Discourse? I imagine some people there would know about differences that could explain things.

Just got DataFrames working (tests don't pass, but I can precompile it and make data frames). Now the error upon using RCall is:

ERROR: LoadError: LoadError: UndefVarError: DataArray not defined
Stacktrace:
 [1] include_relative(::Module, ::String) at ./loading.jl:464
 [2] include at ./sysimg.jl:14 [inlined]
 [3] include(::String) at /home/celrod/.julia/v0.7/RCall/src/RCall.jl:2
 [4] include_relative(::Module, ::String) at ./loading.jl:464
 [5] include(::Module, ::String) at ./sysimg.jl:14
 [6] anonymous at ./<missing>:2
while loading /home/celrod/.julia/v0.7/RCall/src/convert/dataframe.jl, in expression starting on line 3

chriselrod • Oct 02 '17 17:10

It seems there are some problems with using the include function in embedded Julia v0.7, as the error message says. Other functions seem to work okay.

Non-Contradiction • Oct 02 '17 17:10

The UndefVarError: DataArray not defined occurs because on Julia v0.6 DataFrames exports DataArray (from DataArrays), but on Julia v0.7 it does not. I might need to create a PR for RCall.

Non-Contradiction • Oct 03 '17 01:10

After making a few small modifications to RCall, JuliaCall works with Julia v0.7, with some deprecation warnings. I still need to check that RCall itself is okay with my modifications before making a PR.

Non-Contradiction • Oct 03 '17 16:10

Just for the record, ~~R will reset LD_LIBRARY_PATH when it is launched. Instead~~, you would need to set R_LD_LIBRARY_PATH.

UPDATE: setting LD_LIBRARY_PATH should also work, but it is better to use R_LD_LIBRARY_PATH if you don't want to mess with your system settings. Moreover, R_LD_LIBRARY_PATH also works on macOS, where R uses it to set DYLD_FALLBACK_LIBRARY_PATH instead of the LD_LIBRARY_PATH used elsewhere.
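
To check from inside R whether the library is actually reachable, something like the following sketch may help. dirs_with_lib is a hypothetical helper written for this example. It assumes a Linux-style setup where the running R session sees the effective search path in LD_LIBRARY_PATH, and it only reports which listed directories contain the file. It does not tell you whether that file is the right build of MKL.

```r
# Hypothetical diagnostic: return the directories on a search-path string
# that actually contain the given shared library file.
dirs_with_lib <- function(lib = "libmkl_rt.so",
                          path = Sys.getenv("LD_LIBRARY_PATH")) {
  dirs <- strsplit(path, .Platform$path.sep, fixed = TRUE)[[1]]
  dirs <- dirs[nzchar(dirs)]               # ignore empty entries
  dirs[file.exists(file.path(dirs, lib))]  # keep only directories holding the file
}

# Directories on the current LD_LIBRARY_PATH that contain libmkl_rt.so;
# an empty result means none of the listed directories has it.
dirs_with_lib()
```

Keep in mind that changing LD_LIBRARY_PATH with Sys.setenv() in an already running session generally does not affect how the dynamic loader resolves libraries, so the variable has to be set before R starts.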

randy3k • Nov 21 '17 22:11

Okay, I'd like to try and help get things working.

Has this issue been resolved? I'm not sure what is causing the error I'm seeing ("address boundary error"). I'm on Julia 1.3, linked with MKL.

> julia <- julia_setup()
Julia version 1.3.0-alpha.63 at location /home/chriselrod/Documents/languages/julia/usr/bin will be used.
Loading setup script for JuliaCall...

signal (11): Segmentation fault
in expression starting at /home/chriselrod/R/x86_64-generic-linux-gnu-library/3.6/JuliaCall/julia/setup.jl:72
unknown function (ip: 0x7f4e17970376)
Rf_ScalarInteger at /usr/lib64/R/lib/libR.so (unknown line)
__init__ at /home/chriselrod/.julia/packages/RCall/iojZI/src/setup.jl:162
unknown function (ip: 0x7f4e40192c21)
jl_apply at /home/chriselrod/Documents/languages/julia/src/julia.h:1630 [inlined]
jl_module_run_initializer at /home/chriselrod/Documents/languages/julia/src/toplevel.c:74
jl_init_restored_modules at /home/chriselrod/Documents/languages/julia/src/dump.c:2469
_include_from_serialized at ./loading.jl:685
_require_from_serialized at ./loading.jl:736
_require at ./loading.jl:1023
require at ./loading.jl:911
require at ./loading.jl:906
jl_apply at /home/chriselrod/Documents/languages/julia/src/julia.h:1630 [inlined]
call_require at /home/chriselrod/Documents/languages/julia/src/toplevel.c:399 [inlined]
eval_import_path at /home/chriselrod/Documents/languages/julia/src/toplevel.c:436
jl_toplevel_eval_flex at /home/chriselrod/Documents/languages/julia/src/toplevel.c:656
jl_eval_module_expr at /home/chriselrod/Documents/languages/julia/src/toplevel.c:181
jl_toplevel_eval_flex at /home/chriselrod/Documents/languages/julia/src/toplevel.c:640
jl_parse_eval_all at /home/chriselrod/Documents/languages/julia/src/ast.c:873
jl_load at /home/chriselrod/Documents/languages/julia/src/toplevel.c:878 [inlined]
jl_load_ at /home/chriselrod/Documents/languages/julia/src/toplevel.c:885
include at ./boot.jl:328 [inlined]
include_relative at ./loading.jl:1094
include at ./Base.jl:31
jl_apply at /home/chriselrod/Documents/languages/julia/src/julia.h:1630 [inlined]
do_call at /home/chriselrod/Documents/languages/julia/src/interpreter.c:328
eval_value at /home/chriselrod/Documents/languages/julia/src/interpreter.c:417
eval_stmt_value at /home/chriselrod/Documents/languages/julia/src/interpreter.c:368 [inlined]
eval_body at /home/chriselrod/Documents/languages/julia/src/interpreter.c:760
jl_interpret_toplevel_thunk_callback at /home/chriselrod/Documents/languages/julia/src/interpreter.c:888
unknown function (ip: 0xfffffffffffffffe)
unknown function (ip: 0x7f4e2d60e40f)
unknown function (ip: (nil))
jl_interpret_toplevel_thunk at /home/chriselrod/Documents/languages/julia/src/interpreter.c:897
jl_toplevel_eval_flex at /home/chriselrod/Documents/languages/julia/src/toplevel.c:814
jl_toplevel_eval_flex at /home/chriselrod/Documents/languages/julia/src/toplevel.c:764
jl_toplevel_eval at /home/chriselrod/Documents/languages/julia/src/toplevel.c:823 [inlined]
jl_toplevel_eval_in at /home/chriselrod/Documents/languages/julia/src/toplevel.c:843
jl_eval_string at /home/chriselrod/Documents/languages/julia/src/jlapi.c:94
juliacall_cmd at /tmp/RtmpjJ9ldQ/R.INSTALL2800651bf90f/JuliaCall/src/JuliaCall.cpp:31
_JuliaCall_juliacall_cmd at /tmp/RtmpjJ9ldQ/R.INSTALL2800651bf90f/JuliaCall/src/RcppExports.cpp:26
unknown function (ip: 0x7f4e5bb67b1c)
unknown function (ip: 0x7f4e5bafc8ec)
Rf_eval at /usr/lib64/R/lib/haswell/avx512_1/libR.so (unknown line)
unknown function (ip: 0x7f4e5bb22c74)
Rf_applyClosure at /usr/lib64/R/lib/haswell/avx512_1/libR.so (unknown line)
unknown function (ip: 0x7f4e5bb07570)
Rf_eval at /usr/lib64/R/lib/haswell/avx512_1/libR.so (unknown line)
unknown function (ip: 0x7f4e5bb22c74)
Rf_applyClosure at /usr/lib64/R/lib/haswell/avx512_1/libR.so (unknown line)
unknown function (ip: 0x7f4e5bb07570)
Rf_eval at /usr/lib64/R/lib/haswell/avx512_1/libR.so (unknown line)
unknown function (ip: 0x7f4e5bb22c74)
Rf_applyClosure at /usr/lib64/R/lib/haswell/avx512_1/libR.so (unknown line)
Rf_eval at /usr/lib64/R/lib/haswell/avx512_1/libR.so (unknown line)
unknown function (ip: 0x7f4e5bb1b0b7)
Rf_eval at /usr/lib64/R/lib/haswell/avx512_1/libR.so (unknown line)
Rf_ReplIteration at /usr/lib64/R/lib/haswell/avx512_1/libR.so (unknown line)
unknown function (ip: 0x7f4e5ba9edd0)
run_Rmainloop at /usr/lib64/R/lib/haswell/avx512_1/libR.so (unknown line)
main at /usr/lib64/R/bin/exec/R (unknown line)
__libc_start_main at /usr/src/debug/glibc-2.29/csu/../csu/libc-start.c:308
_start at /usr/lib64/R/bin/exec/R (unknown line)
Allocations: 3098421 (Pool: 3097927; Big: 494); GC: 2
fish: “R” terminated by signal SIGSEGV (Address boundary error)

chriselrod • Aug 06 '19 23:08

@chriselrod Thanks for the feedback. I also recently saw breakage on Julia 1.3 and am investigating it. To be sure the problem is not with MKL, does it work with Julia 1.2+MKL? And is it okay for using RCall in Julia 1.3?

Non-Contradiction • Aug 07 '19 00:08

> And is it okay for using RCall in Julia 1.3?

No: https://github.com/JuliaInterop/RCall.jl/issues/326
I realized after my last comment that JuliaCall requires RCall, otherwise I would have tried to resolve that issue first.

> To be sure the problem is not with MKL, does it work with Julia 1.2+MKL?

I am at a hotel with painfully slow internet (about 0.02 mbps), so it's hard to test different Julia versions.

But even the official Julia 1.1 binary I already have installed fails on RCall:

(v1.1) pkg> build RCall
  Building Conda → `~/.julia/packages/Conda/kLXeC/deps/build.log`
  Building RCall → `~/.julia/packages/RCall/iojZI/deps/build.log`
┌ Error: Error building `RCall`: 
│ ERROR: LoadError: R cannot be found. Set the "R_HOME" environment variable to re-run Pkg.build("RCall").
│ Stacktrace:
│  [1] error(::String) at ./error.jl:33
│  [2] top-level scope at /home/chriselrod/.julia/packages/RCall/iojZI/deps/build.jl:49
│  [3] include at ./boot.jl:326 [inlined]
│  [4] include_relative(::Module, ::String) at ./loading.jl:1038
│  [5] include(::Module, ::String) at ./sysimg.jl:29
│  [6] include(::String) at ./client.jl:403
│  [7] top-level scope at none:0
│ in expression starting at /home/chriselrod/.julia/packages/RCall/iojZI/deps/build.jl:10
└ @ Pkg.Operations /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/Pkg/src/Operations.jl:1075

julia> versioninfo()
Julia Version 1.1.0
Commit 80516ca202 (2019-01-21 21:24 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Core(TM) i9-7900X CPU @ 3.30GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, skylake)

julia> ENV["R_HOME"]
"/usr/lib64/R"

chriselrod • Aug 07 '19 00:08