RCall.jl

Segmentation fault on closing julia

Open schlichtanders opened this issue 2 years ago • 7 comments

Hello, I just want to report that I run into a segmentation fault when closing Julia again (after using RCall).

[1847] signal (11.1): Segmentation fault
in expression starting at none:0
ijl_eh_restore_state at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/rtutils.c:265
_atexit at ./initdefs.jl:416
jfptr__atexit_46096.clone_1 at /usr/local/julia/lib/julia/sys.so (unknown line)
_jl_invoke at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2758 [inlined]
ijl_apply_generic at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2940
jl_apply at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/julia.h:1880 [inlined]
ijl_atexit_hook at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/init.c:280
jl_repl_entrypoint at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/jlapi.c:718
main at julia (unknown line)
unknown function (ip: 0x7f95e9ce41c9)
__libc_start_main at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
unknown function (ip: 0x4010b8)
Allocations: 32734049 (Pool: 32701761; Big: 32288); GC: 42

I have no minimal example, but I guess it could have something to do with me using an R function inside an async Julia task, which somehow is not correctly finalized or prevents some other part from finalizing.

The above error occurs in a Docker container built on top of julia:1.9. When I run the same code on my local laptop, it does not throw an error but hangs indefinitely on exit.
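For context on why the crash only shows up at shutdown: the `_atexit` and `ijl_atexit_hook` frames in the trace belong to the phase where Julia runs its registered exit hooks. The following is a minimal sketch of that mechanism using only Base Julia; it does not reproduce the segfault, and the printed messages are illustrative. It launches a child Julia process so the hooks actually fire, and shows that hooks run after the main body, in last-in-first-out order.

```julia
# Code run in a child process so that its exit hooks actually fire.
child_code = """
atexit(() -> println("registered first, runs last"))
atexit(() -> println("registered second, runs first"))
println("main body done")
"""

# Base.julia_cmd() gives the command for the currently running Julia binary.
output = read(`$(Base.julia_cmd()) --startup-file=no -e $child_code`, String)
print(output)
```

Any package that registers such a hook (or a finalizer) gets to run code during this phase, which is why a problem there can stay invisible until the session is closed.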

schlichtanders avatar Nov 21 '23 15:11 schlichtanders

A similar segmentation fault was already reported to JuliaLang: https://github.com/JuliaLang/julia/issues/43556

That issue is about task switching and calls to Base.iolock_end() not working well together, but I couldn't find any use of iolock_end. Maybe some related function is nevertheless called, or some similar unexpected task switching happens.

schlichtanders avatar Nov 21 '23 15:11 schlichtanders

I was able to replicate the segmentation fault. It takes a combination of three components:

  • a julia object
  • an R function which might return this Julia object
  • ~~an async task which calls this R function~~ EDIT: this is actually not needed

Everything is fine until the Julia session is closed. Then the same segmentation fault is thrown:

julia> struct SingletonType end

julia> Singleton=SingletonType()
SingletonType()

julia> using RCall

R> library(JuliaCall)

R> r_singleton = julia_eval("Singleton")
┌ Warning: RCall.jl: Julia version 1.9.3 at location /nix/store/n2mf5wwcjasd5wlxinrz36y0g6l0w7q8-julia-bin-1.9.3/bin will be used.
│ Loading setup script for JuliaCall...
└ @ RCall ~/.julia/packages/RCall/gOwEW/src/io.jl:172
┌ Warning: RCall.jl: Finish loading setup script for JuliaCall.
└ @ RCall ~/.julia/packages/RCall/gOwEW/src/io.jl:172

julia> rf = reval("""function(){
               if (runif(1) > 0.9){
                       r_singleton
               } else {
                       rnorm(1)
               }
       }""")
RObject{ClosSxp}
function () 
{
    if (runif(1) > 0.9) {
        r_singleton
    }
    else {
        rnorm(1)
    }
}
julia> rf()
RObject{RealSxp}
[1] 1.055938


julia> 

[2389910] signal (11.1): Segmentation fault
in expression starting at none:0
ijl_eh_restore_state at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/rtutils.c:265
_atexit at ./initdefs.jl:416
jfptr__atexit_46096.clone_1 at /nix/store/n2mf5wwcjasd5wlxinrz36y0g6l0w7q8-julia-bin-1.9.3/lib/julia/sys.so (unknown line)
_jl_invoke at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2758 [inlined]
ijl_apply_generic at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2940
jl_apply at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/julia.h:1880 [inlined]
ijl_atexit_hook at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/init.c:280
jl_repl_entrypoint at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/jlapi.c:718
main at julia (unknown line)
__libc_start_call_main at /nix/store/aw2fw9ag10wr9pf0qk4nk5sxi0q0bn56-glibc-2.37-8/lib/libc.so.6 (unknown line)
__libc_start_main at /nix/store/aw2fw9ag10wr9pf0qk4nk5sxi0q0bn56-glibc-2.37-8/lib/libc.so.6 (unknown line)
unknown function (ip: 0x4010b8)
Allocations: 8519504 (Pool: 8511578; Big: 7926); GC: 13

[2389910] signal (11.1): Segmentation fault
in expression starting at none:0
ijl_eh_restore_state at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/rtutils.c:265
_atexit at ./initdefs.jl:416
jfptr__atexit_46096.clone_1 at /nix/store/n2mf5wwcjasd5wlxinrz36y0g6l0w7q8-julia-bin-1.9.3/lib/julia/sys.so (unknown line)
_jl_invoke at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2758 [inlined]
ijl_apply_generic at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2940
jl_apply at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/julia.h:1880 [inlined]
ijl_atexit_hook at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/init.c:280
jl_repl_entrypoint at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/jlapi.c:718
main at julia (unknown line)
__libc_start_call_main at /nix/store/aw2fw9ag10wr9pf0qk4nk5sxi0q0bn56-glibc-2.37-8/lib/libc.so.6 (unknown line)
__libc_start_main at /nix/store/aw2fw9ag10wr9pf0qk4nk5sxi0q0bn56-glibc-2.37-8/lib/libc.so.6 (unknown line)
unknown function (ip: 0x4010b8)
Allocations: 8519504 (Pool: 8511578; Big: 7926); GC: 13
[1]    2389910 segmentation fault (core dumped)  julia --project

schlichtanders avatar Nov 21 '23 15:11 schlichtanders

Does this also happen when you start R directly and not via RCall? IIRC JuliaCall works by creating a latent Julia session and then opening RCall within that nested session. I don't know what happens when that JuliaCall session is already nested in an RCall session...

palday avatar Nov 21 '23 18:11 palday

I will test soon whether I can circumvent this by starting it via R directly.

I further simplified the failing example: all it takes is getting some Julia value into R. Boom.

julia> using RCall

R> library(JuliaCall)

R> ftype = julia_eval("Function")
┌ Warning: RCall.jl: Julia version 1.9.3 at location /nix/store/n2mf5wwcjasd5wlxinrz36y0g6l0w7q8-julia-bin-1.9.3/bin will be used.
│ Loading setup script for JuliaCall...
└ @ RCall ~/.julia/packages/RCall/gOwEW/src/io.jl:172
┌ Warning: RCall.jl: Finish loading setup script for JuliaCall.
└ @ RCall ~/.julia/packages/RCall/gOwEW/src/io.jl:172

julia> 

[2406346] signal (11.1): Segmentation fault
in expression starting at none:0
ijl_eh_restore_state at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/rtutils.c:265
_atexit at ./initdefs.jl:416
jfptr__atexit_46096.clone_1 at /nix/store/n2mf5wwcjasd5wlxinrz36y0g6l0w7q8-julia-bin-1.9.3/lib/julia/sys.so (unknown line)
_jl_invoke at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2758 [inlined]
ijl_apply_generic at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2940
jl_apply at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/julia.h:1880 [inlined]
ijl_atexit_hook at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/init.c:280
jl_repl_entrypoint at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/jlapi.c:718
main at julia (unknown line)
__libc_start_call_main at /nix/store/aw2fw9ag10wr9pf0qk4nk5sxi0q0bn56-glibc-2.37-8/lib/libc.so.6 (unknown line)
__libc_start_main at /nix/store/aw2fw9ag10wr9pf0qk4nk5sxi0q0bn56-glibc-2.37-8/lib/libc.so.6 (unknown line)
unknown function (ip: 0x4010b8)
Allocations: 7218572 (Pool: 7211588; Big: 6984); GC: 10

[2406346] signal (11.1): Segmentation fault
in expression starting at none:0
ijl_eh_restore_state at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/rtutils.c:265
_atexit at ./initdefs.jl:416
jfptr__atexit_46096.clone_1 at /nix/store/n2mf5wwcjasd5wlxinrz36y0g6l0w7q8-julia-bin-1.9.3/lib/julia/sys.so (unknown line)
_jl_invoke at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2758 [inlined]
ijl_apply_generic at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2940
jl_apply at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/julia.h:1880 [inlined]
ijl_atexit_hook at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/init.c:280
jl_repl_entrypoint at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/jlapi.c:718
main at julia (unknown line)
__libc_start_call_main at /nix/store/aw2fw9ag10wr9pf0qk4nk5sxi0q0bn56-glibc-2.37-8/lib/libc.so.6 (unknown line)
__libc_start_main at /nix/store/aw2fw9ag10wr9pf0qk4nk5sxi0q0bn56-glibc-2.37-8/lib/libc.so.6 (unknown line)
unknown function (ip: 0x4010b8)
Allocations: 7218572 (Pool: 7211588; Big: 6984); GC: 10

schlichtanders avatar Nov 22 '23 08:11 schlichtanders

> Does this also happen when you start R directly and not via RCall? IIRC JuliaCall works by creating a latent Julia session and then opening RCall within that nested session. I don't know what happens when that JuliaCall session is already nested in an RCall session...

A first try failed because I could not find out how to use a specific Julia environment via JuliaCall. When starting Julia first and then using RCall, JuliaCall picks up the same Julia session, but for standalone use I couldn't find any documentation about it.

EDIT: I found it. You need to set the environment variable JULIA_PROJECT="..."
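As a minimal sketch of what that variable does on the Julia side (not a test of JuliaCall itself), the snippet below starts a child Julia process with JULIA_PROJECT pointing at a throwaway directory and asks which project is active. The temporary directory and the empty Project.toml are illustrative assumptions; for the workaround described here, the variable would instead be set in the shell before starting R.

```julia
# Throwaway project directory standing in for a real environment (assumption).
project_dir = mktempdir()
touch(joinpath(project_dir, "Project.toml"))

# Ask the child process which project it considers active.
child_code = "print(Base.active_project())"
cmd = `$(Base.julia_cmd()) --startup-file=no -e $child_code`

# addenv layers JULIA_PROJECT on top of the inherited environment.
active = read(addenv(cmd, "JULIA_PROJECT" => project_dir), String)
println(active)
```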

schlichtanders avatar Nov 22 '23 08:11 schlichtanders

I have now tested the examples, and they seem to work without a segfault if everything is started directly via R. Looks like a good workaround for me.

Still, it is natural to expect that JuliaCall works inside RCall. In the Python world, PythonCall and JuliaCall also work together. It would be great if this segfault could be solved. It only affects the final exit of Julia - everything else already works.

schlichtanders avatar Nov 22 '23 09:11 schlichtanders

I know this has been my mantra lately ... but I'm wondering if JuliaCall needs to check to see whether there's an existing RCall session before creating a new one. (Why do I think it's JuliaCall's responsibility and not RCall's? Because JuliaCall depends on RCall but not vice versa. If there were a straightforward change we could make in RCall to make this easier, I would support it, but big changes in RCall tend to get stuck by very limited maintainer bandwidth.)
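The "check for an existing session before creating a new one" idea can be sketched generically as follows. Everything here is hypothetical: `EmbeddedSession`, `CURRENT_SESSION`, and `attach_or_start!` are illustrative names, not part of RCall's or JuliaCall's actual API, and the sketch says nothing about how either package tracks its runtime internally.

```julia
# Hypothetical stand-in for an embedded runtime; NOT RCall's or JuliaCall's API.
mutable struct EmbeddedSession
    owner::Symbol   # which side created the session
end

# Global slot holding the one live session, if any.
const CURRENT_SESSION = Ref{Union{Nothing,EmbeddedSession}}(nothing)

# Return the existing session when there is one; create a new one only otherwise.
function attach_or_start!(requester::Symbol)
    existing = CURRENT_SESSION[]
    if existing !== nothing
        return existing              # reuse instead of starting a second session
    end
    session = EmbeddedSession(requester)
    CURRENT_SESSION[] = session
    return session
end

outer = attach_or_start!(:RCall)      # first caller creates the session
inner = attach_or_start!(:JuliaCall)  # nested caller gets the same object back
println(inner === outer)
println(inner.owner)
```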

palday avatar Nov 22 '23 16:11 palday