Add python packaging support
Summary
This PR adds python packaging support, allowing the python OpenVDB bindings to be installed with pip. It also includes github actions to automatically publish releases to the Python Package Index (PyPI). See #161 for original discussion.
Changes
- Added `pyproject.toml`, which specifies build-time dependencies for the python package.
- Added `setup.py`, which makes `pyopenvdb` installable by `pip`, and specifies the python side of the build recipe.
- Added a new github action to build a source distribution of the `pyopenvdb` package and upload it to PyPI. The build can be triggered either by publishing a new OpenVDB release, or manually by navigating on GitHub to `Actions` -> `Build and Release Python Bindings to PyPI` -> `Run Workflow` and providing a branch and ref to build.
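For reference, the two triggers described above boil down to a small snippet in the workflow file; this is only a sketch of what the trigger section might look like, not the actual workflow in this PR:

```yaml
# Illustrative trigger section for the publishing workflow
on:
  release:
    types: [published]   # runs automatically when a new OpenVDB release is published
  workflow_dispatch: {}  # allows manual runs from the Actions tab with a chosen branch/ref
```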
Additional Info
I explored a couple of options for generating binary python package distributions (wheels), but eventually opted only to generate source distributions (sdists). Here's what I tried:
1. `cibuildwheel`, a project by the Python Packaging Authority, allows a user to easily build wheels for python packages for a range of python versions, platforms, and system architectures. It does this by using docker containers to provide build environments. These `manylinux` docker containers can generate wheels for a wide group of linux distributions, but in opting for widespread compatibility they sacrifice being up to date:
   - We can't easily use `manylinux2014` because it's based on CentOS 7, which only has boost 1.53 available in the official repositories, so we'd have to build it from scratch.
   - We can't easily use `manylinux_2_24` because it's based on Debian 9, and uses GCC 6.3, which can't build our dependencies.

   I eventually gave up on cibuildwheel to explore other options.
2. Using `ubuntu:22.04` to build wheels for Python 3.8, 3.9, and 3.10 seems like a good idea, and in fact I think it should in principle work, except we still need to build OpenVDB from source, which takes long enough on the github actions runners that the job times out.
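For context, selecting a manylinux image in cibuildwheel is just a configuration entry; this is a hypothetical sketch using real cibuildwheel option names (`build`, `manylinux-x86_64-image`), not config from this PR:

```toml
# Hypothetical pyproject.toml fragment for cibuildwheel
[tool.cibuildwheel]
build = "cp38-* cp39-* cp310-*"           # target CPython 3.8-3.10
manylinux-x86_64-image = "manylinux2014"  # CentOS 7 based; its boost is too old for OpenVDB
```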
So for this reason I think it is best just to release an sdist to PyPI, which means that when a user installs the package with `pip install pyopenvdb`, pip will trigger a build of the shared library and copy it to the appropriate `site-packages`. sdists are commonly uploaded as "fallback" distributions on PyPI anyway, because they should be compatible with any system that has the appropriate dependencies, and are compiled at install time rather than at the time the package is published.
Update 2023-02-03
I've reworked this PR so that it now works with the new pybind11 bindings.
Changes
- Added a workflow to build and publish an sdist package to either `test.pypi.org` (if a `workflow_dispatch` event is used and the appropriate checkbox is clicked) or, by default, to `pypi.org`, the main python package index. Publishing requires the presence of a github secret with a PyPI token, named `PYPI_API_TOKEN`. If publishing to `test.pypi.org`, the secret that is used should be called `TEST_PYPI_API_TOKEN`. I can meet to help out with this if the process is unclear. When a new release is made, this workflow will automatically build and publish the release to PyPI.
- At the moment, there's no way to dynamically get the version with the build backend we are using; see https://github.com/scikit-build/scikit-build-core/issues/116 for more info. So whoever does the release will have to update the version number in `pyproject.toml` before publishing :/ hopefully this feature gets added to the build backend soon.
- The python bindings will be published under the `openvdb` package name, as discussed before. To install, you'd just do `pip install openvdb` and to use them you'd do `import openvdb`.
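To make the manual version bump concrete, these are the fields a release manager would touch; the version number here is purely hypothetical:

```toml
# Hypothetical pyproject.toml fragment; only the version line needs editing per release
[project]
name = "openvdb"
version = "10.0.1"  # must be bumped by hand until the backend supports dynamic versions
```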
Todo
- [x] CLA
- [x] Sign-off the commits
- [x] Document requirements around using the github action to publish to PyPI (need to set up a PyPI token as a secret on the repository. OpenVDB maintainers should have access to the secrets on my fork, so you can see how to set it up.)
The committers listed above are authorized under a signed CLA.
- :white_check_mark: login: peytondmurray / name: Peyton Murray (aa67f3d0348b90d74ced920778a4e2d5b2464bc2, c2b9a69c0b8de55d74c9fac5dbed41a97e379517, 1523ba124c33bc4dc0af9befa1b175437cba871e, 270ba788bf8e7d769a1c0e6c3e516baa066990de, 9fa446ac250573527b79429243a8265808e0264f, 62ab6597314fa07cddc38316c05ccc26176b3edb, f69e5cd4c2ad6c1324f99e7d9a2c6feae3379766)
Hey @peytondmurray thanks a lot for your work on this, this is looking great! I need to spend a bit more time looking at what you have here but here are some initial comments:
- Interesting that the `cibuildwheel` approach is locked to those older versions, though I see the reasons behind it. Out of interest, what dependencies failed to build with GCC 6.3 with the `manylinux_2_24` container?
- Regarding the CI timeouts for `ubuntu:22.04` that you were experiencing; we should be able to solve this. You can try setting `-D USE_EXPLICIT_INSTANTIATION=OFF` as this isn't really needed for the python bindings. Additionally, our actions support the use of a shared `ccache` to speed up subsequent builds. IIRC each job has a 6hr timeout limit? If we had a build matrix of a wheel per job (if such a workflow makes sense with the wheel approach), combined with the above, I'd very much hope we wouldn't hit timeout limits!
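The wheel-per-job build matrix suggested above could be sketched roughly like this; the job layout and action versions are illustrative, not taken from this repository's CI:

```yaml
# Illustrative only: one wheel build per matrix entry, each in its own job
jobs:
  wheels:
    runs-on: ubuntu-22.04
    strategy:
      matrix:
        python-version: ["3.8", "3.9", "3.10"]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}
      # ...build the wheel for this interpreter, reusing a shared ccache...
```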
I do like the idea of the sdist approach, though the idea of deploying binaries was pretty appealing to reduce the barrier to entry w.r.t. build issues. I'm not sure if this is still enough of a reason not to go the sdist approach. Are there general guidelines for choosing one or the other? Do most pypi packages support both types of installations?
Finally we can absolutely credit Alex for their work! In case I haven't mentioned, the TSC have weekly catch-ups every Tuesday (10:00am to 11:00am (UTC-07:00)). They're minuted, but informal and open to anyone should you or Alex wish to attend and discuss anything here. Reminders are posted here: https://lists.aswf.io/g/openvdb-dev/topics. Either way I will feed back your progress, thanks again!
@Idclip Thanks for the info - I think based on your feedback I'll try once more to build wheels using ubuntu:22.04, knowing that it may lower the barrier to entry for python users.
> Out of interest what dependencies failed to build with GCC 6.3 with the `manylinux_2_24` container?
OpenVDB itself depends on GCC>=6.3.1.
> Are there general guidelines to choose one or the other? Do most pypi packages support both types of installations?
Most large PyPI projects publish both wheels and sdists; the wheels are faster and smaller to install, but the sdists have better compatibility because the compilation step happens at install time. pip has flags for letting users choose whether to install sdists (`--no-binary`) or bdists (`--only-binary`) if they have a preference.
> In case I haven't mentioned, the TSC have weekly catch-ups every Tuesday (10:00am to 11:00am (UTC-07:00)). They're minuted, but informal and open to anyone should you or Alex wish to attend and discuss anything here.
Hey, thanks for the invite! I saw that these meetings were on the calendar, but wasn't sure whether anyone would have questions about all this if I showed up. I'm traveling next week, but I think I might be able to make the meeting anyway. Even if nobody has any questions about this packaging stuff I'll come and say hello.
Okay, so I spent some time fiddling about getting wheels to build on ubuntu:22.04, and I'm getting close but I'm still not exactly sure how long it could take. There are still some difficulties getting cmake to find dependencies :upside_down_face: so if sdists are acceptable for now, I would like to break off support for building wheels into a separate branch to be added in a later PR. For now, I'll mark this as ready for review.
Looks like the DCO workflow has failed, but I've done `git commit -s` on the only commit on this branch. What am I missing?
> Looks like the DCO workflow has failed, but I've done `git commit -s` on the only commit on this branch. What am I missing?
Just glancing through the error message (at the bottom):

```
Expected "pdmurray [email protected]", but got "Peyton Murray [email protected]".
```
Great work you're doing there, pip support is a much awaited relief to the (in the past) tedious installation process of openvdb and especially pyopenvdb with its boost dependencies.
In addressing comments from @Idclip today I found that the build is broken, which must have happened some time since I last touched this project. Once again it looks like FindPython-related issues with cmake:
```
...
-- ----------------------------------------------------
-- ------------ Configuring OpenVDBPython -------------
-- ----------------------------------------------------
-- Could NOT find Python (missing: Development.Module Interpreter NumPy)
CMake Error at CMakeLists.txt:65 (message):
  Could NOT find Python::Module (Required is at least version "")
Call Stack (most recent call first):
  CMakeLists.txt:128 (openvdb_check_python_version)
...
```
Looking deeper, I tried printing the relevant FindPython-related variables with

```cmake
include(CMakePrintHelpers)
cmake_print_variables(Python3_LIBRARY Python3_EXECUTABLE Python3_INCLUDE_DIR)
```

which showed that the correct values of those variables are being set:

```
-- Python3_LIBRARY="/home/pdmurray/.pyenv/versions/mambaforge/lib/libpython3.9.so" ; Python3_EXECUTABLE="/home/pdmurray/.pyenv/versions/mambaforge/bin/python3.9" ; Python3_INCLUDE_DIR="/home/pdmurray/.pyenv/versions/mambaforge/include/python3.9"
```
Anyway, I'll keep working on this to try to figure out what the issue is here.
Okay, after looking more closely at this I've found the issue:
```cmake
# openvdb/openvdb/openvdb/python/CMakeLists.txt
...
# @note explicitly only search for Development.Module from 3.18 as searching
#   Development.Embed can cause issues on linux systems where it doesn't exist
if(${CMAKE_VERSION} VERSION_LESS 3.18)
  set(OPENVDB_PYTHON_REQUIRED_COMPONENTS Development)
else()
  set(OPENVDB_PYTHON_REQUIRED_COMPONENTS Development.Module)
endif()
```
I'm building with cmake 3.23.2 on my local machine, but it fails unless I force it to search for `Development` rather than `Development.Module`. @Idclip How would you like to proceed here? Checking the git log it looks like you might have run across some of these issues in the past; do you have any advice?

**UPDATE** Okay, I just tried building in a fresh VM and it seems that this issue also appears there, so it's not just something weird with my development environment. I'm happy to take this block out and force FindPython to search for `Development` instead of `Development.Module`, but I think it might be a good idea to get some perspective from @Idclip first.
@peytondmurray I'm pretty confident that searching for `Development.Module` only is the right thing to do. I'm building with CMake 3.22.2 and have no issues. The fact that you're getting `Could NOT find Python (missing: Development.Module Interpreter NumPy)` to me implies something is wrong with your Python installation?
Any updates on this PR? It would great to have an easy way to install openvdb python bindings.
@Andrej730 I'm really sorry to have left this in a partially completed state for so long, so let me give a status update with some information:
- I reached out to the owner of the `pyopenvdb` package on PyPI, but have not heard back. If we are going to publish a package on PyPI for this, it feels like it could be useful to choose another name to publish it under; perhaps just `openvdb`?
- It feels like a good time to start this work back up. I know there have been some discussions around using `pybind11` and it would be useful to have CI set up for if/when that work moves forward.
Let me see if I can rebase onto the latest version and rebuild the python package. I'll spend a bit of time updating this PR, as I think a few of the github actions are now a bit out of date.
> - I reached out to the owner of the `pyopenvdb` package on PyPI, but have not heard back. If we are going to publish a package on PyPI for this, it feels like it could be useful to choose another name to publish it under; perhaps just `openvdb`?
If it would be possible to publish it as just `openvdb` that would be the best outcome, since that's how people would usually go looking for the package after seeing `import openvdb` somewhere in code.
Agreed, but it's really up to the maintainers. If I remember correctly the current python bindings are still being built to `pyopenvdb`, but it would be a really simple change to make this work. @Idclip thoughts?
Hi all, after a discussion with the maintainers at today's meeting, we have agreed to publish the python bindings under the openvdb name. However, this effort is currently blocked pending the merge of #1515. Once that PR is merged, this PR will be reworked to add the automation needed to build and publish the python bindings, which will no longer be dependent on Boost.Python :relieved: Thank you for your patience while this gets sorted out!
PR #1515 moved to #1571
Thanks @kmuseth and everyone else for the pybind11 effort, I'll rework this PR to get the build automation in place!
Well, the upshot of this taking so long is that the pybind11/cmake/scikit-build ecosystem has become significantly simpler to make use of since I was last working on this. There's now a scikit-build-core build backend that is used in a great new pybind11 example that made this straightforward. Thanks to all who are working on making those projects so well documented and easy to use!
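To give a sense of how little boilerplate the scikit-build-core backend needs, here is a minimal sketch of the build-system declaration; the exact dependency pins and metadata in this PR may differ:

```toml
# Sketch of a scikit-build-core + pybind11 build configuration
[build-system]
requires = ["scikit-build-core", "pybind11"]
build-backend = "scikit_build_core.build"

[project]
name = "openvdb"    # published package name, per the discussion above
version = "0.0.1"   # hypothetical; static until dynamic versioning is supported
```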
I tested and was able to publish an sdist on test.pypi.org, and was able to install it with pip install. However, I kept getting a segfault whenever I tried to import the bindings. @Idclip can you reproduce? I'm still not sure if this is a problem with the way I've configured the build system or whether this is some other issue out of scope for this PR.
Currently blocked by https://github.com/AcademySoftwareFoundation/openvdb/issues/1583.
Oh, and I forgot to mention that it would be great if the package's metadata included some urls.
@JeanChristopheMorinPerso Thanks for taking a look at this. URLs have been added.
About building precompiled binaries: you can read through the comment history above, but the reason I didn't do this before was effectively because of the dependency on Boost.Python; openvdb required a relatively new version which wasn't available on the CentOS 7 (?) docker images that everyone uses for manylinux builds. This resulted in dependency hell trying to get the right version of boost with the right version of gcc, and everything needed to be compiled, which ran up against CI runner timeout limits, and so on.
The upshot of dropping Boost.Python as a dependency and swapping to pybind11 is that we don't have to worry about this. I'll get to work building this with cibuildwheel, which should make the job pretty simple.
I'm also seeing the issue with the include you saw above. This wasn't happening last time I worked on this, so I assume it's part of the growing pains associated with adding pybind11 support. I don't think it's related to the build system changes introduced here.
Thanks again for the input :rocket:
Great @peytondmurray! I wasn't sure if boost was used for anything other than the Python bindings (previously). It's good to know it was only used for that and that it's no longer required.
Feel free to poke me when you have wheels built by GitHub Actions. I'll happily review the changes and the produced wheels on all platforms (I usually download the wheels and inspect the binaries manually, and then install on all platforms in different scenarios to see if everything holds up together).
@JeanChristopheMorinPerso as expected the pybind11 dependency makes everything way easier to set up with cibuildwheel. Right now I'm blocked on the build failure you saw above, but I'll also see if I can't find a fix for that in a little bit - then we should be able to merge this.
Hi. Has there been any progress on this issue?
I've looked into it a bit, and the problem I see is that you're trying to build the openvdb libs + bindings directly using the `openvdb/openvdb/python/CMakeLists.txt` as the entry point in the GitHub actions script. However, when reaching this line of the build script, `openvdb` as a CMake target is not yet defined. That's why one should always link against namespaced targets in CMake. Indeed, in this case, CMake did not find any target named `openvdb` in scope, so it will simply add `-lopenvdb` to the link flags for the `_openvdb` target and fail at link time. Here we fail sooner because the transitive include directories for the `openvdb` target are not passed down to the client lib `_openvdb`. Again this is not a problem when building the bindings from the root `CMakeLists.txt`, because by the time we reach this line `openvdb` is a well-defined CMake target, and will bring its transitive list of include directories necessary for compilation.
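The failure mode described above comes from how CMake treats names containing `::`; a minimal sketch (target names follow the discussion, not the actual build files):

```cmake
# Bare name: if no target "openvdb" exists in scope, CMake silently falls back
# to emitting -lopenvdb, so the mistake only surfaces at link time (or earlier,
# via missing transitive include directories).
target_link_libraries(_openvdb PRIVATE openvdb)

# Namespaced name: CMake raises a hard error at generate time if no such
# target exists, catching the misconfiguration immediately.
target_link_libraries(_openvdb PRIVATE OpenVDB::openvdb)
```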
So my recommendation would be twofold:

- Use a namespaced `ALIAS` target here, e.g. `OpenVDB::OpenVDB`, and replace all usages of `openvdb` with this new namespaced target.
- Add the following code snippet in this line:

  ```cmake
  if(NOT TARGET openvdb)
    add_subdirectory(../../.. openvdb_build)
  endif()
  ```

  This will allow using `openvdb/openvdb/python/CMakeLists.txt` as the entry point for the build script.
This still doesn't work for me, as it's looking for BLOSC and doesn't find it, but I'm guessing by disabling the corresponding options you could get it to work eventually.
Hope this helps!
@peytondmurray The CI is now complaining because the commit is not signed off. Also I think a maintainer needs to approve the workflow for the build process to be triggered...
@jdumas I meant to post this yesterday, but thanks for looking into this. I committed your suggested changes yesterday, then tried to build the python bindings. It was taking too long on my laptop, so I pushed the changes with the intent to build when I got home, but never got around to taking care of this. I'll test it now, then sign the commit in a few minutes.
I'm getting a link error, but I'm guessing this is something to do with my system and not what we've done here:
I was able to build it on my machine with the following changes to your branch:
```diff
diff --git i/openvdb/openvdb/python/pyproject.toml w/openvdb/openvdb/python/pyproject.toml
index 9876bd04..bd8bc50b 100644
--- i/openvdb/openvdb/python/pyproject.toml
+++ w/openvdb/openvdb/python/pyproject.toml
@@ -40,3 +40,4 @@ cmake.minimum-version = "3.18"
 [tool.scikit-build.cmake.define]
 OPENVDB_BUILD_CORE = "ON"
 USE_NUMPY = "ON"
+USE_BLOSC = "OFF"
diff --git i/openvdb_cmd/CMakeLists.txt w/openvdb_cmd/CMakeLists.txt
index c4351f49..f5cd7cda 100644
--- i/openvdb_cmd/CMakeLists.txt
+++ w/openvdb_cmd/CMakeLists.txt
@@ -42,7 +42,7 @@ if(NOT OPENVDB_BUILD_CORE)
   # both cases
   set(OPENVDB_LIB OpenVDB::openvdb)
 else()
-  set(OPENVDB_LIB openvdb)
+  set(OPENVDB_LIB OpenVDB::OpenVDB)
 endif()
 
 set(OPENVDB_BINARIES_DEPENDENT_LIBS
```
Yep, I think I just missed a few places which needed the new alias (I didn't need to fiddle the BLOSC flag). I was able to successfully build it, so I've signed off now and pushed the new version.
@peytondmurray let me know when you are ready for another code review.
Thinking about this PR again I think we might want to use the alias name `OpenVDB::openvdb` to be consistent with the result of the `FindOpenVDB.cmake` provided in this repository. Although ideally we shouldn't write any `FindFoo.cmake` by hand, and we should instead rely on exported `FooConfig.cmake` generated by CMake, such a change would be beyond the scope of this PR.
It'd be great if we could:
- Update the alias name from `OpenVDB::OpenVDB` to `OpenVDB::openvdb` to be consistent with the current `FindOpenVDB.cmake`.
- Have a maintainer approve the workflow to make sure it works properly.
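For the rename, the alias itself is a one-line definition wherever the core library target is created; this is only a sketch, as the actual file and target setup in the repository may differ:

```cmake
# Hypothetical: namespaced alias so in-tree consumers can link OpenVDB::openvdb,
# matching the imported target name produced by FindOpenVDB.cmake
add_library(OpenVDB::openvdb ALIAS openvdb)
```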
I'd like to help move the needle on this, so please let me know if there are other ways I can help.
@jdumas Thanks for following up about this. Following your suggestion, I've changed the alias to `OpenVDB::openvdb`. Now we need to start running the CI to make sure this works. @JeanChristopheMorinPerso could you please trigger a workflow run to see if cibuildwheel succeeds?