Bundling libpython.so with the installation package

Open rvijayc opened this issue 3 months ago • 4 comments

I am using scikit-build-core to create a cmake-based wheel which includes executable files that depend on libpython.so.

I am using the following approach to set the RPATH relative to origin:

  set(rpath_entries 
    "$ORIGIN/../lib" 
    "$ORIGIN/../lib/python${Python3_VERSION_MAJOR}.${Python3_VERSION_MINOR}/site-packages"
    "$ORIGIN/../lib/python${Python3_VERSION_MAJOR}.${Python3_VERSION_MINOR}/site-packages/lib"
  )

I noticed that this approach isn't sufficient in virtual environments that don't contain libpython.so themselves and instead rely on the system-installed Python paths. For example:

(.venv) $ ldd /prj/.../project/.venv/bin/python3
        linux-vdso.so.1 (0x00007fffcacea000)
        libpython3.10.so.1.0 => /pkg/.../software/python/sles12/3.10.9/lib/libpython3.10.so.1.0 (0x00007fb9fbe00000)
        libcrypt.so.1 => /usr/lib64/libcrypt.so.1 (0x00007fb9fc524000)
        ...

What is the best way to handle this?

I tried to bundle libpython.so directly with install(FILES ...), but the resulting wheel does not include libpython.so. I am guessing it gets filtered out by scikit-build-core.

find_package(Python3 COMPONENTS Interpreter Development REQUIRED)

# Install libpython into the wheel
install(FILES ${Python3_LIBRARIES}
        DESTINATION lib
        COMPONENT Runtime)

I did the following workaround (suggested by Claude AI) which works:

function(bundle_python)
    find_package(Python3 COMPONENTS Interpreter Development REQUIRED)
    
    get_filename_component(PYTHON_LIB_NAME ${Python3_LIBRARIES} NAME)
    set(COPIED_LIB_PATH ${CMAKE_CURRENT_BINARY_DIR}/${PYTHON_LIB_NAME})
    
    # Create a custom command that produces the copied file
    add_custom_command(
        OUTPUT ${COPIED_LIB_PATH}
        COMMAND ${CMAKE_COMMAND} -E copy_if_different
                ${Python3_LIBRARIES} ${COPIED_LIB_PATH}
        DEPENDS ${Python3_LIBRARIES}
        COMMENT "Copying libpython for packaging"
    )
    
    # Create a target that depends on the copied file
    add_custom_target(bundle_python_lib ALL
        DEPENDS ${COPIED_LIB_PATH}
    )
    
    # Install the copied file
    install(FILES ${COPIED_LIB_PATH}
            DESTINATION lib
            COMPONENT Runtime)
endfunction()

# force python library installation.
bundle_python()

I am guessing that scikit-build-core needs a target for a particular file to be included in the wheel, and a raw install(FILES ...) doesn't seem to work.

I am wondering if there is a "canonical" way of handling this? Or, is this workaround adequate?

Thanks in advance for any help!

rvijayc avatar Aug 06 '25 22:08 rvijayc

So, the solution I ended up using was to create a wrapper Python script that LD_PRELOADs libpython.so before calling the native executable. Users would simply call the wrapper script instead of directly running the executable.
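For anyone landing here, a minimal sketch of what such a wrapper could look like. This is not the exact script from my project: the function names are made up for illustration, and it assumes a Linux interpreter built with a shared libpython whose file sits at LIBDIR/INSTSONAME (on some distributions it lives in a multiarch subdirectory instead, in which case this simply falls back to no preload).

```python
import os
import sysconfig


def find_libpython():
    """Return the path of this interpreter's shared libpython, or None."""
    libdir = sysconfig.get_config_var("LIBDIR")
    # INSTSONAME is e.g. "libpython3.10.so.1.0" on shared builds; on static
    # builds it may name a ".a" archive, which cannot be preloaded.
    soname = sysconfig.get_config_var("INSTSONAME")
    if not libdir or not soname or ".so" not in soname:
        return None
    candidate = os.path.join(libdir, soname)
    return candidate if os.path.exists(candidate) else None


def preload_env(libpython, base_env):
    """Return a copy of base_env with libpython prepended to LD_PRELOAD."""
    env = dict(base_env)
    existing = env.get("LD_PRELOAD", "")
    # ld.so accepts a colon-separated (or space-separated) list here.
    env["LD_PRELOAD"] = f"{libpython}:{existing}" if existing else libpython
    return env


def launch(exe, args):
    """Replace the current process with exe, preloading libpython if found."""
    libpython = find_libpython()
    env = preload_env(libpython, os.environ) if libpython else dict(os.environ)
    os.execve(exe, [exe, *args], env)
```

A console-script entry point would then call something like launch(path_to_native_binary, sys.argv[1:]), where the binary path is wherever the wheel installs the executable.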

rvijayc avatar Aug 09 '25 18:08 rvijayc

I missed this one. Hmm, executables that link directly to libpython would indeed be hard to handle. I'm curious what your application is like that it needs to link to libpython, and in what environment the library is not already on the system path.

The scriptlet approach might be the only way to do this. The relative path from site-packages to the library directory may be more consistent, and you could use that in the RPATH (i.e. install the binary in site-packages with a relative RPATH and have the scriptlet run the binary from there), but I'm not sure that's guaranteed.
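To illustrate why I'm hesitant about the guarantee, here is a rough sketch that computes the $ORIGIN-relative RPATH entry a binary would need to reach a given library directory. The helper name and all paths below are hypothetical, chosen only to contrast a binary installed into the Python installation itself with one installed into a venv.

```python
import posixpath


def origin_rpath(binary_dir, libdir):
    """Return the $ORIGIN-relative RPATH entry leading from binary_dir to libdir."""
    rel = posixpath.relpath(libdir, start=binary_dir)
    return "$ORIGIN" if rel == "." else posixpath.join("$ORIGIN", rel)


# HYPOTHETICAL layout 1: package installed straight into the Python installation
print(origin_rpath("/opt/py/lib/python3.10/site-packages/mypkg/bin", "/opt/py/lib"))
# prints: $ORIGIN/../../../..

# HYPOTHETICAL layout 2: same package in a venv, libpython still in /opt/py/lib
print(origin_rpath("/home/u/.venv/lib/python3.10/site-packages/mypkg/bin", "/opt/py/lib"))
```

The two results differ, so a single relative RPATH baked into the wheel at build time would only work for one of the layouts.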

LecrisUT avatar Aug 09 '25 18:08 LecrisUT

> I missed this one. Hmm, executables that link directly to libpython would indeed be hard to handle. I'm curious what your application is like that it needs to link to libpython, and in what environment the library is not already on the system path.

I'd say that my situation is quite uncommon and probably not worth supporting. We have a C++ wrapper around numpy that we call by embedding the Python interpreter, and this wrapper is used in various standalone x86 executables. Hence, the RPATH needs to point to libpythonXX.YY.so.

I am also working in a corporate environment where Pythons are installed at some standard network path (/pkg/.../software/python/sles12/3.10.9/lib/libpython3.10.so.1.0) which isn't automatically included in the user's library search paths. When I create a virtual environment (python3 -m venv .venv), libpythonXX.so doesn't get copied into the virtual environment, and hence I cannot use paths relative to ORIGIN either.
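A quick way to see this from inside the venv is to compare the venv prefix with the base installation it was created from. The sketch below uses only the standard sys and sysconfig modules; the helper name is made up, and the config variables it reads (LIBDIR, Py_ENABLE_SHARED) are build-time values that may be missing on some platforms.

```python
import sys
import sysconfig


def describe_python_layout():
    """Collect the locations relevant to finding libpython from inside a venv."""
    return {
        # True when running inside a virtual environment
        "in_venv": sys.prefix != sys.base_prefix,
        "prefix": sys.prefix,            # the venv directory (or the install itself)
        "base_prefix": sys.base_prefix,  # the installation the venv was created from
        # Build-time library directory of the base installation; may be None
        "libdir": sysconfig.get_config_var("LIBDIR"),
        # True if the interpreter was built with a shared libpython
        "shared": bool(sysconfig.get_config_var("Py_ENABLE_SHARED")),
    }


for key, value in describe_python_layout().items():
    print(f"{key}: {value}")
```

In my setup I would expect prefix to point at .venv while libdir still points into the network installation, which is why nothing relative to the venv reaches libpython.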

> The scriptlet approach might be the only way to do this. The relative path from site-packages to the library directory may be more consistent, and you could use that in the RPATH (i.e. install the binary in site-packages with a relative RPATH and have the scriptlet run the binary from there), but I'm not sure that's guaranteed.

So, yes. That was my conclusion after searching and chatting with AI bots. The following quote from Claude AI seems to indicate that this may be standard practice:

> Recommended Approach
>
> I strongly recommend Option 1 - convert your executables to Python scripts that import C++ extensions. This is the standard pattern used by most Python packages:
>
> - NumPy: numpy.f2py is a Python script that calls C extensions
> - PyTorch: torch.compile is Python calling C++ backends
> - SciPy: Command-line tools are Python scripts

rvijayc avatar Aug 10 '25 17:08 rvijayc

> That was my conclusion after searching and chatting with AI bots. The following quote from Claude AI seems to indicate that this may be standard practice:

As usual, take what it says with a mountain of salt. This is far from standard practice, e.g. SciPy does not have a CLI.

The reason for using a Python wrapper script is that it has access to the paths of both site-packages and the Python libraries, which you can interrogate and fix up. It is not ideal because it adds startup overhead, and if your application is short-running, that overhead adds up significantly.

> I am also working in a corporate environment where Pythons are installed at some standard network path (/pkg/.../software/python/sles12/3.10.9/lib/libpython3.10.so.1.0) which isn't automatically included in the user's library search paths.

I wonder why they don't provide an Lmod module or equivalent to fix up LD_LIBRARY_PATH et al. for each environment. There are also other solutions, like populating an ld.so.conf.d directory.

LecrisUT avatar Aug 10 '25 18:08 LecrisUT