cctbx_project
Segmentation faults with numpy 1.21 package
The recently released 1.21 version of numpy will cause segmentation faults with the cctbx-base conda package. This affects Python versions 3.7 through 3.9. Please use version 1.20 until this issue is resolved. Python 3.6 is using version 1.19.
Can you add a numpy<1.21 constraint to the conda-forge package and yank the unconstrained release?
This currently is breaking builds all over the place.
I can publish a new build that adds the constraint, but we don't have to remove the old one. I'm trying to determine the underlying issue.
https://github.com/conda-forge/cctbx-base-feedstock/pull/26
Can always update it again once the problem is discovered/resolved
The new build that does not update to numpy 1.21 should be available. You may need to wait for the CDN to update for the package to be widely available.
Do we know what the origin of this issue is, and is there any prospect for a fix?
We have seen segmentation faults with numpy before. It should be related to an initialization. Not everything that uses numpy causes a segmentation fault. I will narrow down which additional parts need an initialization.
Did we find out why this happened?
Have not had time to look further into this yet.
Just adding to the story: a simple segfault reproducer with a numpy 1.21 build is:
from cctbx import crystal
crystal.symmetry("79,79,38,90,90,90", "P43212")
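Because a segfault kills the interpreter that triggers it, it can help to run the reproducer in a child process and inspect its exit status. The following is a minimal sketch, not part of cctbx: the helper names run_isolated and describe are hypothetical, and the signal-based reporting assumes a POSIX system, where a child killed by a signal reports a negative return code.

```python
import signal
import subprocess
import sys

# The two-line reproducer from this thread, run in a separate interpreter.
REPRODUCER = """
from cctbx import crystal
crystal.symmetry("79,79,38,90,90,90", "P43212")
"""


def run_isolated(code):
    """Run `code` in a fresh Python process and return its exit status."""
    return subprocess.run([sys.executable, "-c", code]).returncode


def describe(returncode):
    """Turn a subprocess return code into a short human-readable summary."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX, a negative value means the child was killed by a signal.
        return "killed by " + signal.Signals(-returncode).name
    return "exited with status %d" % returncode


if __name__ == "__main__":
    print(describe(run_isolated(REPRODUCER)))
```

With an affected numpy build the summary would be expected to name SIGSEGV; if cctbx is not installed at all, the child simply exits with a nonzero status because the import fails.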
Just flagging this boost discussion: https://github.com/boostorg/python/issues/376 which I assume is about the same issue.
Here's an excerpt from a stack trace when triggering the segfault via Derek's reproducer:
stack trace
Program received signal SIGSEGV, Segmentation fault.
PyDict_GetItemWithError () at /tmp/build/80754af9/python_1627392990942/work/Objects/dictobject.c:1371
1371 /tmp/build/80754af9/python_1627392990942/work/Objects/dictobject.c: No such file or directory.
(gdb) bt
#0 PyDict_GetItemWithError () at /tmp/build/80754af9/python_1627392990942/work/Objects/dictobject.c:1371
#1 0x00007f331e7f2cef in PyArray_GetCastingImpl ()
from /dev/shm/dwpaley/test/conda_base/lib/python3.7/site-packages/numpy/core/_multiarray_umath.cpython-37m-x86_64-linux-gnu.so
#2 0x00007f331e7f31f8 in PyArray_GetCastSafety ()
from /dev/shm/dwpaley/test/conda_base/lib/python3.7/site-packages/numpy/core/_multiarray_umath.cpython-37m-x86_64-linux-gnu.so
#3 0x00007f331e89284b in PyArray_EquivTypes.part.6 ()
from /dev/shm/dwpaley/test/conda_base/lib/python3.7/site-packages/numpy/core/_multiarray_umath.cpython-37m-x86_64-linux-gnu.so
#4 0x00007f331eb9d472 in boost::python::numpy::equivalent (a=..., b=...) at /dev/shm/dwpaley/test/modules/boost/libs/python/src/numpy/dtype.cpp:125
#5 0x00007f331eb9dfed in boost::python::numpy::(anonymous namespace)::array_scalar_converter<int>::convertible (obj=0x7f3321983130)
at /dev/shm/dwpaley/test/modules/boost/libs/python/src/numpy/dtype.cpp:162
#6 0x00007f3320df97e8 in boost::python::converter::rvalue_from_python_stage1 (source=0x7f3321983130, converters=...)
at /dev/shm/dwpaley/test/modules/boost/libs/python/src/converter/from_python.cpp:54
#7 0x00007f3320f0d402 in boost::python::converter::arg_rvalue_from_python<int>::arg_rvalue_from_python (this=0x7ffd7f289e30, obj=0x7f3321983130)
at /dev/shm/dwpaley/test/modules/boost/boost/python/converter/arg_from_python.hpp:297
#8 0x00007f3320f0b47b in boost::python::arg_from_python<int>::arg_from_python (this=0x7ffd7f289e30, source=0x7f3321983130)
at /dev/shm/dwpaley/test/modules/boost/boost/python/arg_from_python.hpp:70
#9 0x00007f331a0f166c in boost::python::detail::caller_arity<2u>::impl<void (*)(_object*, int), boost::python::default_call_policies, boost::mpl::vector3<void, _object*, int> >::operator() (this=0x557576f5fef8, args_=0x7f33198337d0)
at /dev/shm/dwpaley/test/modules/boost/boost/preprocessor/iteration/detail/local.hpp:37
#10 0x00007f331a0f087b in boost::python::objects::caller_py_function_impl<boost::python::detail::caller<void (*)(_object*, int), boost::python::default_call_policies, boost::mpl::vector3<void, _object*, int> > >::operator() (this=0x557576f5fef0, args=0x7f33198337d0, kw=0x0)
at /dev/shm/dwpaley/test/modules/boost/boost/python/object/py_function.hpp:38
#11 0x00007f3320e083cb in boost::python::objects::py_function::operator() (this=0x557576f5ff20, args=0x7f33198337d0, kw=0x0)
at /dev/shm/dwpaley/test/modules/boost/boost/python/object/py_function.hpp:147
#12 0x00007f3320e06159 in boost::python::objects::function::call (this=0x557576f5ff10, args=0x7f33198337d0, keywords=0x0)
at /dev/shm/dwpaley/test/modules/boost/libs/python/src/object/function.cpp:221
#13 0x00007f3320e076df in boost::python::objects::(anonymous namespace)::bind_return::operator() (this=0x7ffd7f28a120)
at /dev/shm/dwpaley/test/modules/boost/libs/python/src/object/function.cpp:581
#14 0x00007f3320e080c4 in boost::detail::function::void_function_ref_invoker0<boost::python::objects::(anonymous namespace)::bind_return, void>::invoke (
function_obj_ptr=...) at /dev/shm/dwpaley/test/modules/boost/boost/function/function_template.hpp:193
#15 0x00007f3320e1f76e in boost::function0<void>::operator() (this=0x7ffd7f28a0d0)
at /dev/shm/dwpaley/test/modules/boost/boost/function/function_template.hpp:763
#16 0x00007f3320e1ef4c in boost::python::handle_exception_impl (f=...) at /dev/shm/dwpaley/test/modules/boost/libs/python/src/errors.cpp:25
#17 0x00007f3320e07d50 in boost::python::handle_exception<boost::python::objects::(anonymous namespace)::bind_return> (f=...)
at /dev/shm/dwpaley/test/modules/boost/boost/python/errors.hpp:29
#18 0x00007f3320e077ba in boost::python::objects::function_call (func=0x557576f5ff10, args=0x7f33198337d0, kw=0x0)
at /dev/shm/dwpaley/test/modules/boost/libs/python/src/object/function.cpp:622
#19 0x0000557574a1c13f in _PyObject_FastCallDict () at /tmp/build/80754af9/python_1627392990942/work/Objects/call.c:125
#20 0x0000557574a31041 in _PyObject_Call_Prepend (kwargs=0x0, args=0x7f3328d12790, obj=<optimized out>, callable=0x557576f5ff10)
at /tmp/build/80754af9/python_1627392990942/work/Objects/call.c:906
Great! I was not able to reproduce Derek's crash, but I was able to find another simple way of causing the segfault. There is also an earlier discussion here.
https://github.com/epics-base/pvaPy/issues/63
For me, using Derek's reproducer, I can avoid crashing with a couple of different changes in sgtbx/boost_python/symbols.cpp.
As written now, we have:
struct space_group_symbols_wrappers
{
  typedef space_group_symbols w_t;
  static void
  wrap()
  {
    using namespace boost::python;
    typedef return_value_policy<copy_const_reference> ccr;
    class_<w_t>("space_group_symbols", no_init)
      .def(init<std::string const&, optional<std::string const&> >((
        arg("symbol"),
        arg("table_id")="")))
      .def(init<int, optional<std::string const&, std::string const&> >((
        arg("space_group_number"),
        arg("extension")="",
        arg("table_id")="")))
      .def("number", &w_t::number)
      [...];
  }};
If I comment out the second constructor, or if I remove optional<...> and the default args from the second constructor so that it looks like this:
    class_<w_t>("space_group_symbols", no_init)
      .def(init<std::string const&, optional<std::string const&> >((
        arg("symbol"),
        arg("table_id")="")))
      .def(init<int, std::string const&, std::string const& >((
        arg("space_group_number"),
        arg("extension"),
        arg("table_id"))))
then no crash. So clearly it's something about the overload resolution for the space_group_symbols class as kinda suggested by the boost issue I linked before.
Pretty weird that numpy would have anything to do with it! I'm also curious how widespread this is: both the pattern of mixing overloaded constructors with boost optional arguments, and whether they all cause segfaults now.
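One possible reading of why the optional<...> arguments matter can be sketched with a toy model of overload resolution. This is not Boost.Python's implementation: the helper names (make_overload, resolve, is_int, is_str) and the overload labels are hypothetical. The model assumes that overloads are tried in reverse registration order, that an overload is skipped without any type checks when the argument count does not fit, and that otherwise each argument's "convertible" check runs until one fails. Under those assumptions, a one-argument string call still runs the int check of the second constructor when that constructor accepts one to three arguments, but never reaches it when all three arguments are required.

```python
def is_int(obj):
    """Stand-in for an int 'convertible' check (the one that crashed)."""
    return isinstance(obj, int) and not isinstance(obj, bool)


def is_str(obj):
    """Stand-in for a std::string 'convertible' check."""
    return isinstance(obj, str)


def make_overload(name, required, optional=()):
    """Describe an overload by its required and optional argument checks."""
    return {"name": name, "required": list(required), "optional": list(optional)}


def resolve(overloads, args, log):
    """Return the name of the first matching overload, logging every check run."""
    # Assumption: later registrations are tried first.
    for overload in reversed(overloads):
        checks = overload["required"] + overload["optional"]
        if not len(overload["required"]) <= len(args) <= len(checks):
            continue  # arity mismatch: no convertibility check runs
        matched = True
        for check, arg in zip(checks, args):
            log.append((overload["name"], check.__name__))
            if not check(arg):
                matched = False
                break
        if matched:
            return overload["name"]
    return None


# Mirrors the two registrations discussed above (labels are illustrative).
with_optional = [
    make_overload("from_symbol", [is_str], [is_str]),
    make_overload("from_number", [is_int], [is_str, is_str]),
]
without_optional = [
    make_overload("from_symbol", [is_str], [is_str]),
    make_overload("from_number", [is_int, is_str, is_str]),
]

log_a, log_b = [], []
chosen_a = resolve(with_optional, ("P43212",), log_a)
chosen_b = resolve(without_optional, ("P43212",), log_b)
print(chosen_a, log_a)
print(chosen_b, log_b)
```

In both cases the string constructor is chosen, but only the first registration style ever runs the int check on a string argument, which is where the stack trace shows the numpy call.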
I added a comment on the Boost issue I mentioned above (https://github.com/boostorg/python/issues/376) but not sure if it gets us any closer to a fix. The issue started with changes to numpy type casting here: https://github.com/numpy/numpy/pull/17401
It appears to be a numpy bug, and I describe a possible fix in the Boost issue (https://github.com/boostorg/python/issues/376). The problem involves dereferencing a null pointer when checking convertibility of types (like the boost_python ones) that haven't implemented the new numpy casting implementation.
I'll open a numpy PR which I assume will take a while to get into a release. It's possible to build a custom numpy from sources and we can discuss if necessary, but it seems like our stuff is stable for now with the pin to 1.20...
This appears to be fixed as of numpy 1.21.5, which is now on conda-forge :)
Yeah, I've been following the discussion. But it looks like there should still be an update to Boost.Python. Thanks for getting the ball rolling!
I should be able to add Python 3.10 checks to Azure Pipelines now that there is a way forward.
And I can remove the numpy version limit in the conda package for the next release.
Unpinning this since there is a fix in numpy 1.21.5 and later. The nightly package builds should find any future incompatibilities (conda-forge pinnings during build and latest packages in the tests).
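For environments that cannot rely on the package pin, a small guard can flag the numpy versions discussed in this thread. This is a minimal sketch with hypothetical helper names (parse_release, is_affected); it assumes every release from 1.21.0 up to, but not including, 1.21.5 is affected, which the thread implies but does not verify for each point release.

```python
# Assumption: the affected window is 1.21.0 <= version < 1.21.5.
FIRST_AFFECTED = (1, 21, 0)
FIRST_FIXED = (1, 21, 5)


def parse_release(version):
    """Parse 'major.minor.patch' into a tuple of ints, ignoring suffixes like 'rc1'."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_affected(version):
    """Return True if `version` falls inside the assumed affected window."""
    return FIRST_AFFECTED <= parse_release(version) < FIRST_FIXED
```

In practice the argument would be numpy.__version__, checked before importing any cctbx extension modules.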