signal (11): Segmentation fault in PyJulia 0.4.1, Python 2/3, Julia 1.1.0

Open casv2 opened this issue 5 years ago • 9 comments

Running this simple python script

from julia import Julia
import numpy as np

ju = Julia()
ju.include("LJ.jl")

for j in range(100000):
    e = ju.LJ.energy(np.random.rand(10,3))
    if j % 50 == 0:
        print(j,e)

where LJ.jl is the following Julia script

module LJ
using LinearAlgebra

function energy(R::Matrix{T}) where {T}
   E = zero(T)
   for i = 1:size(R, 2)-1, j = i+1:size(R,2)
      r= norm(R[:,i] - R[:,j])
      E += r^(-12) - 2 * r^(-6)
   end
   return E
end
end

gives me the following error:

signal (11): Segmentation fault
in expression starting at no file:0
ConvParam at /usr/local/src/conda/python-3.7.3/Modules/_ctypes/callproc.c:685 [inlined]
_ctypes_callproc at /usr/local/src/conda/python-3.7.3/Modules/_ctypes/callproc.c:1132
PyCFuncPtr_call at /usr/local/src/conda/python-3.7.3/Modules/_ctypes/_ctypes.c:3969
_PyObject_FastCallKeywords at /root/miniconda3/lib/libpython3.7m.so.1.0 (unknown line)
call_function.lto_priv.1536 at /root/miniconda3/lib/libpython3.7m.so.1.0 (unknown line)
_PyEval_EvalFrameDefault at /root/miniconda3/lib/libpython3.7m.so.1.0 (unknown line)
function_code_fastcall at /root/miniconda3/lib/libpython3.7m.so.1.0 (unknown line)
call_function.lto_priv.1536 at /root/miniconda3/lib/libpython3.7m.so.1.0 (unknown line)
_PyEval_EvalFrameDefault at /root/miniconda3/lib/libpython3.7m.so.1.0 (unknown line)
function_code_fastcall at /root/miniconda3/lib/libpython3.7m.so.1.0 (unknown line)
call_function.lto_priv.1536 at /root/miniconda3/lib/libpython3.7m.so.1.0 (unknown line)
_PyEval_EvalFrameDefault at /root/miniconda3/lib/libpython3.7m.so.1.0 (unknown line)
function_code_fastcall at /root/miniconda3/lib/libpython3.7m.so.1.0 (unknown line)
call_function.lto_priv.1536 at /root/miniconda3/lib/libpython3.7m.so.1.0 (unknown line)
_PyEval_EvalFrameDefault at /root/miniconda3/lib/libpython3.7m.so.1.0 (unknown line)
function_code_fastcall at /root/miniconda3/lib/libpython3.7m.so.1.0 (unknown line)
call_function.lto_priv.1536 at /root/miniconda3/lib/libpython3.7m.so.1.0 (unknown line)
_PyEval_EvalFrameDefault at /root/miniconda3/lib/libpython3.7m.so.1.0 (unknown line)
function_code_fastcall at /root/miniconda3/lib/libpython3.7m.so.1.0 (unknown line)
_PyFunction_FastCallDict at /root/miniconda3/lib/libpython3.7m.so.1.0 (unknown line)
_PyObject_Call_Prepend at /root/miniconda3/lib/libpython3.7m.so.1.0 (unknown line)
_PyObject_FastCallDict at /root/miniconda3/lib/libpython3.7m.so.1.0 (unknown line)
object_vacall at /root/miniconda3/lib/libpython3.7m.so.1.0 (unknown line)
PyObject_CallFunctionObjArgs at /root/miniconda3/lib/libpython3.7m.so.1.0 (unknown line)
slot_tp_getattr_hook at /root/miniconda3/lib/libpython3.7m.so.1.0 (unknown line)
builtin_getattr at /root/miniconda3/lib/libpython3.7m.so.1.0 (unknown line)
_PyMethodDef_RawFastCallKeywords at /root/miniconda3/lib/libpython3.7m.so.1.0 (unknown line)
_PyCFunction_FastCallKeywords at /root/miniconda3/lib/libpython3.7m.so.1.0 (unknown line)
call_function.lto_priv.1536 at /root/miniconda3/lib/libpython3.7m.so.1.0 (unknown line)
_PyEval_EvalFrameDefault at /root/miniconda3/lib/libpython3.7m.so.1.0 (unknown line)
function_code_fastcall at /root/miniconda3/lib/libpython3.7m.so.1.0 (unknown line)
_PyFunction_FastCallDict at /root/miniconda3/lib/libpython3.7m.so.1.0 (unknown line)
_PyObject_Call_Prepend at /root/miniconda3/lib/libpython3.7m.so.1.0 (unknown line)
_PyObject_FastCallDict at /root/miniconda3/lib/libpython3.7m.so.1.0 (unknown line)
object_vacall at /root/miniconda3/lib/libpython3.7m.so.1.0 (unknown line)
PyObject_CallFunctionObjArgs at /root/miniconda3/lib/libpython3.7m.so.1.0 (unknown line)
slot_tp_getattr_hook at /root/miniconda3/lib/libpython3.7m.so.1.0 (unknown line)
_PyEval_EvalFrameDefault at /root/miniconda3/lib/libpython3.7m.so.1.0 (unknown line)
_PyEval_EvalCodeWithName at /root/miniconda3/lib/libpython3.7m.so.1.0 (unknown line)
PyEval_EvalCodeEx at /root/miniconda3/lib/libpython3.7m.so.1.0 (unknown line)
PyEval_EvalCode at /root/miniconda3/lib/libpython3.7m.so.1.0 (unknown line)
builtin_exec at /root/miniconda3/lib/libpython3.7m.so.1.0 (unknown line)
_PyMethodDef_RawFastCallKeywords at /root/miniconda3/lib/libpython3.7m.so.1.0 (unknown line)
_PyCFunction_FastCallKeywords at /root/miniconda3/lib/libpython3.7m.so.1.0 (unknown line)
call_function.lto_priv.1536 at /root/miniconda3/lib/libpython3.7m.so.1.0 (unknown line)
_PyEval_EvalFrameDefault at /root/miniconda3/lib/libpython3.7m.so.1.0 (unknown line)
_PyEval_EvalCodeWithName at /root/miniconda3/lib/libpython3.7m.so.1.0 (unknown line)
_PyFunction_FastCallKeywords at /root/miniconda3/lib/libpython3.7m.so.1.0 (unknown line)
call_function.lto_priv.1536 at /root/miniconda3/lib/libpython3.7m.so.1.0 (unknown line)
_PyEval_EvalFrameDefault at /root/miniconda3/lib/libpython3.7m.so.1.0 (unknown line)
_PyEval_EvalCodeWithName at /root/miniconda3/lib/libpython3.7m.so.1.0 (unknown line)
_PyFunction_FastCallKeywords at /root/miniconda3/lib/libpython3.7m.so.1.0 (unknown line)
call_function.lto_priv.1536 at /root/miniconda3/lib/libpython3.7m.so.1.0 (unknown line)
_PyEval_EvalFrameDefault at /root/miniconda3/lib/libpython3.7m.so.1.0 (unknown line)
_PyEval_EvalCodeWithName at /root/miniconda3/lib/libpython3.7m.so.1.0 (unknown line)
_PyFunction_FastCallKeywords at /root/miniconda3/lib/libpython3.7m.so.1.0 (unknown line)
call_function.lto_priv.1536 at /root/miniconda3/lib/libpython3.7m.so.1.0 (unknown line)
_PyEval_EvalFrameDefault at /root/miniconda3/lib/libpython3.7m.so.1.0 (unknown line)
_PyEval_EvalCodeWithName at /root/miniconda3/lib/libpython3.7m.so.1.0 (unknown line)
_PyFunction_FastCallDict at /root/miniconda3/lib/libpython3.7m.so.1.0 (unknown line)
_PyEval_EvalFrameDefault at /root/miniconda3/lib/libpython3.7m.so.1.0 (unknown line)
_PyEval_EvalCodeWithName at /root/miniconda3/lib/libpython3.7m.so.1.0 (unknown line)
_PyFunction_FastCallDict at /root/miniconda3/lib/libpython3.7m.so.1.0 (unknown line)
macro expansion at /opt/julia/share/site/packages/PyCall/0jMpb/src/exception.jl:81 [inlined]
__pycall! at /opt/julia/share/site/packages/PyCall/0jMpb/src/pyfncall.jl:44
_pycall! at /opt/julia/share/site/packages/PyCall/0jMpb/src/pyfncall.jl:29
#call#89 at /opt/julia/share/site/packages/PyCall/0jMpb/src/pyfncall.jl:11 [inlined]
PyObject at /opt/julia/share/site/packages/PyCall/0jMpb/src/pyfncall.jl:89
jl_fptr_trampoline at /buildworker/worker/package_linux64/build/src/gf.c:1864
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2219
do_call at /buildworker/worker/package_linux64/build/src/interpreter.c:323
eval_value at /buildworker/worker/package_linux64/build/src/interpreter.c:411
eval_body at /buildworker/worker/package_linux64/build/src/interpreter.c:625
jl_interpret_toplevel_thunk_callback at /buildworker/worker/package_linux64/build/src/interpreter.c:885
unknown function (ip: 0xfffffffffffffffe)
unknown function (ip: 0x7f7b815ad9af)
unknown function (ip: 0x2)
jl_interpret_toplevel_thunk at /buildworker/worker/package_linux64/build/src/interpreter.c:894
jl_toplevel_eval_flex at /buildworker/worker/package_linux64/build/src/toplevel.c:764
jl_toplevel_eval_flex at /buildworker/worker/package_linux64/build/src/toplevel.c:713
jl_toplevel_eval_in at /buildworker/worker/package_linux64/build/src/toplevel.c:793
eval at ./boot.jl:328
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2219
exec_options at ./client.jl:243
_start at ./client.jl:436
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2219
unknown function (ip: 0x40191d)
unknown function (ip: 0x401523)
__libc_start_main at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
unknown function (ip: 0x4015c4)
Allocations: 34898355 (Pool: 34895806; Big: 2549); GC: 74
Segmentation fault

I've tested this several times and on a range of different machines (Ubuntu, macOS). The error pops up every time after several hundred thousand iterations. It also occurs with Python 2. The Julia version is 1.1.0.

I've also tried using python-jl and it still gives rise to the segmentation fault.
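
For cross-checking the values that come back from Julia, the same pair sum can be written in plain Python. This is only a sketch: lj_energy is a hypothetical helper that is not part of PyJulia, and it takes a list of rows and treats the columns as particles, mirroring how LJ.energy loops over size(R, 2).

```python
import math
from itertools import combinations


def lj_energy(R):
    """Pure-Python mirror of LJ.energy; R is a list of rows, particles are columns."""
    columns = list(zip(*R))  # one tuple of coordinates per column
    energy = 0.0
    for a, b in combinations(columns, 2):  # every pair i < j
        r = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
        energy += r ** -12 - 2 * r ** -6
    return energy


# Two particles at unit distance: 1 - 2 = -1
print(lj_energy([[0.0, 1.0], [0.0, 0.0], [0.0, 0.0]]))
```

Assuming the NumPy array keeps its shape when it is passed to Julia, a (10, 3) input means the loop runs over 3 columns of length 10, not over 10 particles in 3 dimensions.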

casv2 avatar Jun 25 '19 10:06 casv2

I can reproduce the error. But I don't have time right now to look deeply into this.

FYI,

from julia import Main

energy = Main.eval("""
using LinearAlgebra

function energy(R::Matrix{T}) where {T}
   E = zero(T)
   for i = 1:size(R, 2)-1, j = i+1:size(R,2)
      r= norm(R[:,i] - R[:,j])
      E += r^(-12) - 2 * r^(-6)
   end
   return E
end
""")


import numpy as np

for j in range(100000):
    e = energy(np.random.rand(10,3))
    if j % 50 == 0:
        print(j,e)

runs fine. It could be that avoiding the dynamic lookup ju.LJ.energy helps.

tkf avatar Jun 25 '19 21:06 tkf

It could be that avoiding the dynamic lookup ju.LJ.energy helps

Is there a way to avoid this regardless of where/how energy is defined? In practice I'll have a module M, possibly quite complex, defined in Julia, from which I'll want to call several functions. But as a workaround it should be feasible to define an interface, e.g.

from julia import Main

energy_py = Main.eval("""
function energy_py(R::Matrix{T}) where {T}
   return LJ.energy(R)
end
""")

cortner avatar Jun 26 '19 10:06 cortner

Main.eval is not critical here. How about

from julia import Main
Main.include("path/to/script.jl")  # defines YourModule
your_function = Main.YourModule.your_function

If you can make your module importable in Julia,

from julia.YourModule import your_function

should also work.

tkf avatar Jun 26 '19 19:06 tkf

What is the status of this bug? I narrowed this down to a simple example, but I'm trying to use a large module with lots of dependencies, which makes this workaround not very practical.

casv2 avatar Jul 14 '19 22:07 casv2

What do you mean it's not practical? You mean you don't want to write a lot of imports like this?

from julia.YourModule import (
    your_function_1,
    your_function_2,
    your_function_3,
    ...,
    your_function_100,
)

In this case, you can just create a shim object using cached_property. Or, just use functools.lru_cache()(julia.YourModule.__getattr__).

Note that you should be using from julia.YourModule import your_function for performance-sensitive code anyway, to avoid putting pressure on the GCs. Also, calling tons of Julia functions is not ideal for performance either. Perhaps you should reduce the API surface by writing a single Julia function that does the work.

tkf avatar Jul 14 '19 22:07 tkf

Got it. My code now works, and this way of calling Julia seems to be much faster and doesn't segfault. I guess we can close this now?

Thank you for the support, much appreciated!

casv2 avatar Jul 15 '19 15:07 casv2

Let's keep the issue. It's still bad to segfault when it shouldn't.

tkf avatar Jul 15 '19 23:07 tkf

Hi, I am unsure whether to open a new bug report or not on this, but this issue is still open and I am currently experiencing it despite following the advice above. I have not yet had the time to produce a minimal example, but I will try to briefly outline the setup. I am using Python 3.7.3, Julia 1.2.0 and PyJulia 0.4.1. In accordance with the tips above I use julia.eval to define the functions I need to use, once at the start of the Python program. However it frequently happens that the interpreter segfaults the first time one of these functions is called with the same error message as above, and I have to check the console and restart the program when this happens.

This seems to happen much more frequently when I try to combine PyJulia with multiprocessing - here I found that explicitly reloading the entire Julia code in each child process reduces the probability of a segfault, but they still occur rather frequently.

I can try to deliver more information as needed, it would be fantastic if somebody could have a second look at this issue!

kaandocal avatar Mar 05 '20 17:03 kaandocal

Commenting to say this is still broken when calling a function via

from julia.api import Julia
jl = Julia(compiled_modules=False)
from julia import Main
Main.include("fastsum.jl")
from julia.Main import greenfunction

where the function itself is

using Tullio, LoopVectorization
function greenfunction(mu, wns, sigwns, energy, dosnorm)
    new_array = Array{ComplexF64}(undef, length(wns))
    @tullio threads=false new_array[i] = 1 / (mu + wns[i] * 1im - energy[j] - sigwns[i]) * dosnorm[j]
    dosnorm = Nothing
    energy  = Nothing
    sigwns  = Nothing
    wns     = Nothing
    
    return new_array
end

Clearing some of the variables seems to increase the number of processes I can run from around 10 to 80, but it still eventually segfaults, with errors like those in #362.

andrewkhardy avatar Jul 22 '21 14:07 andrewkhardy