Python wheels impose vague and undocumented restrictions on the use of C++ features

Open olupton opened this issue 1 year ago • 1 comments

Overview

While working on https://github.com/neuronsimulator/nrn/pull/1922 I started seeing linker errors:

nrnmain.cpp:(.text.startup+0x98): undefined reference to `nrnmpi_load[abi:cxx11](int)'

in the CI jobs that test the Linux Python wheels (which are also built as part of the CI). The relevant change in that PR was that I changed the return type of that function from char* to std::string.

In general, replacing C-style strings and manual memory management with standard types such as std::string is a Good Thing ™️, so it's unfortunate that this causes problems with use cases that we consider important (pip install neuron on Linux...). If supporting these use cases forbids the use of certain C++ features or types in certain places, this should be made explicit and documented.

Detailed description

The problem is that the Python wheels are built using an image based on manylinux2014, which is in turn based on CentOS 7, and the toolchain in that image only supports the pre-C++11 ABI for std::string. This means that the old ABI is used for code in libnrniv.so, which is shipped inside the wheels.

The typical workflow of running nrnivmodl after pip install neuron on a "user" machine builds C++ sources using the compiler toolchain that is installed on that "user" machine, which must support at least C++17 and will default to using a C++11-compatible std::string implementation. This causes link errors if any code compiled on the user machine tries to refer to symbols inside libnrniv.so that use std::string (for example, a function that returns std::string).
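To make the mismatch concrete, here is a minimal sketch, assuming libstdc++ (the `_GLIBCXX_USE_CXX11_ABI` macro is specific to it). `string_abi` and `load_message` are hypothetical names used only for illustration; `load_message` stands in for a function like `nrnmpi_load` after its return type was changed to `std::string`.

```cpp
#include <string>

// Reports which std::string ABI this translation unit is compiled against.
// _GLIBCXX_USE_CXX11_ABI is defined by libstdc++ headers; other standard
// libraries (for example libc++) do not define it.
constexpr const char* string_abi() {
#if defined(_GLIBCXX_USE_CXX11_ABI) && _GLIBCXX_USE_CXX11_ABI
    return "libstdc++, new (cxx11) std::string ABI";
#elif defined(_GLIBCXX_USE_CXX11_ABI)
    return "libstdc++, old (pre-C++11) std::string ABI";
#else
    return "not libstdc++, or a libstdc++ without the dual ABI";
#endif
}

// Hypothetical stand-in for an exported function returning std::string.
// With the new ABI, GCC attaches the [abi:cxx11] tag to the symbol name, so
// the caller looks for a differently named symbol than the one an old-ABI
// build of the library exports. That is what the linker error reports.
std::string load_message(int /*rank*/) {
    return "ok";
}
```

Compiling this in the wheel-building image and on a recent distribution, then comparing the value of `string_abi()` and the symbol name that `nm -C` shows for `load_message`, should reveal the difference described above.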

Trying to force the use of the old ABI on user machines inside nrnivmodl would probably be fragile, and would not avoid issues with linking to other software on the user machine that uses the new ABI. Dropping support for manylinux2014 would(?) cause issues with pip install neuron using vanilla Python 3.7.x.

These issues can be avoided by only using a restricted subset of C++ in functions/variables that may be called from code compiled on the user machine, but there is no(?) well-defined list of these functions/variables or enforcement mechanism beyond hoping that the wheel-based CI pipelines catch any infractions.
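One possible shape for such a restricted subset is sketched below. It is not something NEURON currently does, and all names (`example_message`, `example_message_str`) are hypothetical. The idea is that only C-compatible types cross the library boundary, while the `std::string` convenience wrapper is an inline function in a header, so it is always compiled with the user's own compiler and ABI.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// --- Would live inside the shared library (built with the old ABI). ---
// Only C-compatible types appear in the signature, so the symbol does not
// depend on the std::string ABI. Copies at most buf_size - 1 characters plus
// a terminating NUL into buf and returns the full message length.
extern "C" std::size_t example_message(int rank, char* buf, std::size_t buf_size) {
    // std::string is used freely here, but never crosses the boundary.
    const std::string msg = "loaded rank " + std::to_string(rank);
    if (buf != nullptr && buf_size > 0) {
        const std::size_t n = std::min(msg.size(), buf_size - 1);
        std::memcpy(buf, msg.data(), n);
        buf[n] = '\0';
    }
    return msg.size();
}

// --- Would live in a public header. ---
// Inline, so it is compiled on the user machine with whichever std::string
// ABI the user's toolchain defaults to.
inline std::string example_message_str(int rank) {
    const std::size_t n = example_message(rank, nullptr, 0);  // query length
    std::vector<char> buf(n + 1);
    example_message(rank, buf.data(), buf.size());
    return std::string(buf.data(), n);
}
```

The cost of this pattern is an extra copy and a more awkward exported signature; the benefit is that the set of exported, ABI-sensitive symbols becomes easy to audit, because anything declared `extern "C"` cannot mention `std::string` in its interface.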

Further information

Some other links that may be of interest:

  • https://github.com/neuronsimulator/nrn/issues/511

olupton, Aug 22 '22 15:08

This came up again in https://github.com/neuronsimulator/nrn/pull/1929, where an intermediate commit had a non-inline version of https://github.com/neuronsimulator/nrn/blob/4ac3ed9e46a29af193c807c50c101f88537a83e1/src/nrniv/backtrace_utils.h#L13-L24 built into libnrniv.so that caused linker errors downstream when using the Python wheels and nrnivmodl.

olupton, Sep 20 '22 12:09