scikit-build-core
Request for Example: Linking Python Module with Shared Library using scikit-build-core and pybind11/nanobind
Hi scikit-build-core team! I'm working on a project where I need to connect my Python code with a shared library (like my_shared.so), and I'd love to use scikit-build-core for this, along with either pybind11 or nanobind (I prefer nanobind) for the bindings.
I've been looking through the documentation, but a clear example showing this specific workflow would be really helpful. It would be great to see how to set up a project with scikit-build-core, build a simple shared library with C/C++ functions, generate Python bindings, configure the linking process, and finally, use those functions in a Python script.
My goal is to bring a custom C/C++ library into my Python project, and I think scikit-build-core with pybind11 or nanobind is the perfect tool for the job. An example would make it much easier to understand and use this approach.
Having this example would be a huge help to the scikit-build-core community, especially for those who are new to linking with shared libraries. It would simplify things and enable developers to use their existing C/C++ code in Python projects more effectively.
Thank you for considering this! A practical example would be a valuable addition to the documentation, showing the real-world power of scikit-build-core.
Disclaimer: not a scikit-build-core developer, just a happy user
Honestly, the documentation on CMake, scikit-build-core and pybind11/nanobind makes this procedure seem more "magical" than it actually is. Once you actually try it, it's pretty straightforward to apply it everywhere. I'm assuming you are aiming for making your Python wheels redistributable, so in the below I will assume you are using Apple Clang on MacOS, and GCC on GNU/Linux (I don't have too much experience on Windows so I'll only provide limited remarks). It should apply to both C and C++ extensions. I will not cover how to write your Python bindings since that's explained in great detail in both nanobind and pybind11 docs. Instead, I will mainly focus on gluing all of this together with CMake and scikit-build-core.
Building Python extensions with CMake
The CMake part is relatively straightforward, and is composed of the following procedure:
- use `find_package(Python COMPONENTS Interpreter Development.Module)` to let CMake find the Python library and headers
  - (if using the numpy C API) add `NumPy` to the `COMPONENTS` above
- (if using Cython modules) convert your `.pyx` Cython modules to `.cpp` files before doing any of the below
- create a new CMake module library via `add_library(pytarget MODULE <sources>)`, where `<sources>` is a list of C/C++ sources of your extension
- if this Python module uses one (or more) of your C++ libraries, link to it with `target_link_libraries(pytarget PRIVATE <cpp_libs>)`
  - note that, unlike pybind11, nanobind is not a header-only library, so if you are using it, you will need to actually link to it via `target_link_libraries(pytarget PRIVATE nanobind)` (how you build nanobind is up to you, the docs do a great job explaining it)
- since you are making a Python extension, somewhere along the line (be it directly or indirectly with nanobind/pybind11) you are using Python headers, so you need to add `target_include_directories(pytarget PRIVATE "${Python_INCLUDE_DIRS}")`
  - (if using the numpy C API) add `target_include_directories(pytarget PRIVATE "${Python_NumPy_INCLUDE_DIRS}")`
  - (if using pybind11/nanobind) add `target_include_directories(pytarget PRIVATE <dir>)`, where `<dir>` is the location of the pybind11/nanobind headers
- since you must not link with the Python libraries, the linker may complain that there are missing symbols; you can resolve this by adding:
  - `target_link_options(pytarget PRIVATE "-Wl,-undefined,dynamic_lookup")` on MacOS
  - `target_link_options(pytarget PRIVATE "-Wl,--unresolved-symbols=ignore-all")` on GNU/Linux
  - on Windows, I think you do need to link to Python explicitly (at least, there's this line from the docs: "When creating DLLs in Windows, you must pass pythonXY.lib to the linker"), which can be achieved using `target_link_libraries(pytarget PRIVATE "${Python_LIBRARIES}")`
- by default, CMake will name your library file something like `libpytarget.so` (on Linux anyway), but in order to import it as `import pytarget`, you need to change the actual name of the library. This can be done using `set_target_properties(pytarget PROPERTIES OUTPUT_NAME <module> PREFIX "")`. Replace `<module>` with however you want to call your library using Python's `import` statement. Note that you may need to change the suffix by adding `SUFFIX <suffix>`, where `<suffix>` is whatever the `EXTENSION_SUFFIXES` dictate (on Windows it apparently needs to be `.pyd`, while both MacOS and Linux seem to be fine with the default `.so`). For more details on the expected suffix on your platform, see the output of `python -c 'from importlib.machinery import EXTENSION_SUFFIXES; print(EXTENSION_SUFFIXES)'`
- if you are linking your Python module with other C++ libraries, you will need to modify where the module looks for those libraries (assuming they are not in the same directory as your Python module) by modifying the runtime search path (the so-called `rpath`). This can be done with:
  - `set_target_properties(pytarget PROPERTIES INSTALL_RPATH "\$ORIGIN/relative/path/to/lib1:\$ORIGIN/relative/path/to/lib2")` on GNU/Linux
  - `set_target_properties(pytarget PROPERTIES INSTALL_RPATH "@loader_path/relative/path/to/lib1:@loader_path/relative/path/to/lib2")` on MacOS
  - (a note on the above for the curious: `$ORIGIN` and `@loader_path` basically resolve to where the current module (`pytarget`) is located at run-time, i.e. when you load it via `import`)
- `install` your library as usual (though read below for caveats)
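To make the procedure more concrete, here is a minimal sketch of the steps above as a single `CMakeLists.txt`. All target, module, and path names (`pytarget`, `mylib`, `mymodule`, `mypackage`) are placeholders, not anything mandated by scikit-build-core:

```cmake
cmake_minimum_required(VERSION 3.15)
project(example LANGUAGES CXX)

# find the Python headers; Development.Module avoids requiring libpython
find_package(Python COMPONENTS Interpreter Development.Module REQUIRED)

# a hypothetical C++ library we want to expose to Python
add_library(mylib SHARED src/mylib.cpp)

# the Python extension itself
add_library(pytarget MODULE src/bindings.cpp)
target_link_libraries(pytarget PRIVATE mylib)
target_include_directories(pytarget PRIVATE "${Python_INCLUDE_DIRS}")

# do not link to libpython; let the interpreter resolve symbols at load time
if(APPLE)
  target_link_options(pytarget PRIVATE "-Wl,-undefined,dynamic_lookup")
  set(origin "@loader_path")
elseif(UNIX)
  target_link_options(pytarget PRIVATE "-Wl,--unresolved-symbols=ignore-all")
  set(origin "\$ORIGIN")
endif()

# name the file so that `import mymodule` works, and point the rpath
# at the subdirectory where mylib will be installed
set_target_properties(pytarget PROPERTIES
  OUTPUT_NAME mymodule
  PREFIX ""
  INSTALL_RPATH "${origin}/lib")

# relative install paths, so the files land inside the wheel
install(TARGETS pytarget DESTINATION mypackage)
install(TARGETS mylib DESTINATION mypackage/lib)
```

The rpath here assumes the layout produced by the two `install()` calls (the extension next to a `lib/` subdirectory); adjust it if your layout differs.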
scikit-build-core and CMake/pyproject.toml
Most of the config I've used on the Python side is via pyproject.toml, and the scikit-build-core docs do a pretty fine job of explaining the procedure. On the other hand, the CMake part was personally a bit less clear, so here are some potential pitfalls to avoid:
- if you've ever used `install(... ${CMAKE_INSTALL_PREFIX}/some/path)`, stop. scikit-build-core seems to run the config stage of CMake without specifying `CMAKE_INSTALL_PREFIX`, and then runs the install stage via `cmake --install <build_dir> --prefix <some temp dir>/platlib/`. Hence, if you do use `install(... ${CMAKE_INSTALL_PREFIX})`, CMake will insert the default install prefix (usually some variant of `/usr/` or `/usr/local`) at configure time instead, and your files will not end up where they are supposed to (i.e. in a directory that ends up in a wheel). The fix is to use a relative install path instead (`some/path`)
- one problem that can arise (for instance, due to the relatively large complexity of the project) is the fact that a regular CMake install and a Python wheel can have different directory layouts, and if you are shipping your C++ libraries in the project, you will need to put them somewhere in the wheel. Since Python only loads modules from directories that have an `__init__.py`, you can basically put them anywhere, as long as it's a subdirectory of your package name. Make sure to modify `INSTALL_RPATH` accordingly for all of your libraries and binaries though! To keep things simple, it's usually a good idea to ship the non-Python parts in a single directory structure as usual, and then add something like `if(SKBUILD) set(CMAKE_INSTALL_RPATH "<flag>/../lib") endif()`, where `<flag>` is either `\$ORIGIN` (GNU/Linux) or `@loader_path` (MacOS). This will automatically set rpaths for any libraries/executables that haven't had an rpath explicitly set already (as we explicitly did above with the Python extension)
- the discovery of pure Python modules that you are shipping could be handled by scikit-build-core, but I found it simpler to set `wheel.packages = []` and just handle this within CMake. As a concrete example, we install all of our Python files and extensions under `<package>`, and all of the rest goes to `<package>/.data` (so we have `<package>/.data/lib`, `<package>/.data/bin`, etc.)
- the default setting `install.strip = true` shipped a broken MacOS wheel for us (not really a fault of scikit-build-core, but nonetheless it was a very annoying problem to diagnose), so if you are getting missing symbols at load-time, try using `install.strip = false` instead
- if you are shipping any wrapper scripts, make sure to install them in `SKBUILD_SCRIPTS_DIR`
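For reference, the `pyproject.toml` side of the settings mentioned above might look something like this (the project name is a placeholder, and the dependency list assumes a nanobind project):

```toml
[build-system]
requires = ["scikit-build-core", "nanobind"]
build-backend = "scikit_build_core.build"

[project]
name = "mypackage"
version = "0.1.0"

[tool.scikit-build]
# let the CMake install() rules decide the wheel layout
# instead of automatic package discovery
wheel.packages = []
# stripping broke our MacOS wheels, so disable it
install.strip = false
```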
Oh, many apologies that we didn't catch this issue. Great writeup on this @JCGoran
Minor notes:

- `find_package(Python COMPONENTS Interpreter)` is often unnecessary and may interfere with cross-compilation. Do not add it unless you really need to.
- About using `Python_INCLUDE_DIRS`: please don't. Use one of `python_add_library`/`pybind11_add_module`/`nanobind_add_module`. Similarly for `Python_LIBRARIES`. Instead, beware of the differences between the `MODULE` and `SHARED` options in those functions (or their alternatives):
  - `MODULE` is something that you can consume with `import ...` in the python code
  - `SHARED`/`STATIC` is meant to be consumed by a `MODULE`, and the python variants of these are there if you need to link explicitly to the python libraries to function. If you are only interested in the defines, such as the python ABI define, then it's unfortunate that you would need to do that manually.
- The naming of the library is also handled by the above
- About the `RPATH` note, consider whether you want to do that manually or via wheel repairs like `auditwheel`. Each approach has its own benefits
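As a sketch of what this suggestion looks like in practice, the manual include-directory, naming, and no-libpython linking steps collapse into the module helper (shown here for nanobind; `mymodule` and `mylib` are placeholder names):

```cmake
# find the interpreter and module-building components only (no libpython)
find_package(Python COMPONENTS Interpreter Development.Module REQUIRED)
find_package(nanobind CONFIG REQUIRED)

# nanobind_add_module handles the Python headers, the MODULE library type,
# the output name/suffix, and the dynamic-lookup link flags for you
nanobind_add_module(mymodule src/bindings.cpp)

# linking your own C++ libraries works as usual
target_link_libraries(mymodule PRIVATE mylib)
```

`python_add_library(mymodule MODULE ... WITH_SOABI)` from CMake's `FindPython` plays the same role if you are not using a binding framework.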
> the default setting `install.strip = true` shipped a broken MacOS wheel for us

Can you open an issue to investigate this? AFAIU strip is about debug symbols, not runtime symbols.
I think you misunderstood my question. My issue isn't about connecting scikit-build-core with nanobind or pybind11.
Let's say I've downloaded the `.h` and `.so` files from ONNX Runtime v1.21.0, or I already have some `add_library(... SHARED ...)` targets in my project that I want to use.
There are many non-trivial linking challenges when writing the install() rules and importing the resulting Python modules — especially because this is not as straightforward as with static libraries.
A complete and solid example of this workflow is badly needed in this area.
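Not an official answer, but one common CMake pattern for consuming a prebuilt shared library like this is an `IMPORTED` target. A sketch, assuming the headers and `.so` were unpacked under a hypothetical `third_party/onnxruntime/` directory and that `pytarget`/`mypackage` are your extension target and package name:

```cmake
# wrap the prebuilt library in an IMPORTED target so it can be
# linked like any other CMake target
add_library(onnxruntime SHARED IMPORTED)
set_target_properties(onnxruntime PROPERTIES
  IMPORTED_LOCATION "${CMAKE_CURRENT_SOURCE_DIR}/third_party/onnxruntime/lib/libonnxruntime.so"
  INTERFACE_INCLUDE_DIRECTORIES "${CMAKE_CURRENT_SOURCE_DIR}/third_party/onnxruntime/include")

# link the Python extension against it as usual
target_link_libraries(pytarget PRIVATE onnxruntime)

# install(TARGETS) does not apply to IMPORTED targets, so copy the
# .so into the wheel with install(FILES) and point the rpath at it
install(FILES "${CMAKE_CURRENT_SOURCE_DIR}/third_party/onnxruntime/lib/libonnxruntime.so"
        DESTINATION mypackage/lib)
set_target_properties(pytarget PROPERTIES INSTALL_RPATH "\$ORIGIN/lib")
```

On MacOS the rpath flag would be `@loader_path/lib` instead, and on Linux a wheel-repair tool like `auditwheel` can vendor the library and rewrite the rpath for you instead of doing it by hand.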