Nanobind build
Description
I pulled out the nanobind build changes from https://github.com/TomographicImaging/CIL/pull/2105 into a separate PR. I haven't exactly figured out how to change Wrappers/Python/CMakeLists.txt to install it, however. Regardless, pulling this out separately is hopefully helpful :)
cc @casperdcl
Some notes:
I can rebase this on top of https://github.com/TomographicImaging/CIL/pull/2145 once it's ready and have it work pretty easily. However, it will probably require a folder name change to satisfy scikit-build-core requirements, which I would rather do by itself in a separate PR so that conflicts with other PRs aren't so big.
Additionally, the C++ code can likely be simplified somewhat drastically. The interface currently accepts numpy arrays directly; to keep things simple, I just grab the underlying pointers from the arrays instead of rewriting the code to use the array types' nicer accessors, since the previous code operated on raw pointers directly.
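Roughly what that looks like (a minimal sketch; `saxpy_impl` and the module name are made up here, not CIL's actual functions):

```cpp
#include <nanobind/nanobind.h>
#include <nanobind/ndarray.h>

namespace nb = nanobind;

// Existing pointer-based kernel somewhere in the C++ core (hypothetical name).
void saxpy_impl(size_t n, float a, const float *x, float *y);

// Thin wrapper: just pull the raw pointers out of the incoming arrays and
// pass them through, instead of rewriting the kernel against ndarray accessors.
void saxpy(float a, nb::ndarray<const float> x, nb::ndarray<float> y) {
    saxpy_impl(x.size(), a, x.data(), y.data());
}

NB_MODULE(cil_example, m) {
    m.def("saxpy", &saxpy);
}
```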
thanks @purepani; I can fix this after #2148 & #2145.
Just a quick note for now - could you use pybind11 in lieu of nanobind? In future we will need some functionality from the former not available in the latter. Ideally also use `pybind11::buffer` instead of numpy-specific `pybind11::array_t`.
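e.g. something like (just a sketch; `process` is a made-up name):

```cpp
#include <pybind11/pybind11.h>

namespace py = pybind11;

// py::buffer accepts anything implementing the buffer protocol
// (numpy arrays, array.array, memoryview, ...), not just numpy.
void process(py::buffer b) {
    py::buffer_info info = b.request();          // shape/strides/format/pointer
    auto *ptr = static_cast<float *>(info.ptr);  // untyped pointer, cast manually
    // ... hand ptr (and info.shape) to the existing pointer-based code ...
}
```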
Yeah I can try doing those
Rebased on https://github.com/TomographicImaging/CIL/pull/2145, got it at least building a wheel on my computer. Should be pretty simple to switch this to pybind11, but I'll work on that later this week.
Also added a commit to move the folder src/Core to src/cil, as scikit-build-core likes the last part of the path to be the same as the library name. As mentioned earlier, this will be better to put in a separate PR to cause the least amount of rebase pain.
I didn't make any attempt to clean up the code yet; it's just been rebased.
Switched to pybind11, though haven't cleaned up commits at all.
I used `pybind11::array_t` for now since it looked a bit easier to switch to. I'm not great at C++ and kinda fumble around in the language, so that might be good for a follow-up PR; additionally, it looks like `pybind11::buffer` might require a wrapper to allow for mutability, which to me feels better suited to a follow-up.
In particular, `pybind11::buffer` only hands back an untyped pointer (`buffer_info::ptr`) and isn't templated, so as far as I can tell there's no way to tell the object whether you want a const or a mutable view; `array_t` offers this via `mutable_data()` vs `data()`, whereas nanobind does it by templating the type as `nb::ndarray<const float>` vs `nb::ndarray<float>`.
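Concretely (illustrative only; `f`/`g`/`h` are made-up names, and the two libraries wouldn't normally live in one file):

```cpp
#include <pybind11/numpy.h>
#include <nanobind/ndarray.h>

namespace py = pybind11;
namespace nb = nanobind;

// pybind11: mutability is chosen per call via the accessor method.
void f(py::array_t<float> a) {
    const float *r = a.data();    // read-only pointer
    float *w = a.mutable_data();  // mutable pointer into the same buffer
}

// nanobind: mutability is encoded in the template parameter itself.
void g(nb::ndarray<float> a)       { float *w = a.data(); }
void h(nb::ndarray<const float> a) { const float *r = a.data(); }
```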
Have no clue why the conda build is failing, but I'll fix that later. Feels quite good to have a net-negative line count in the diff (though that's mostly from #2145, I believe) :smile:
Note that this updates IPP and removes the need for the IPP fallback introduced in the other PR.
What are the reasons you want to use pybind11 in particular? @casperdcl
Nanobind supports the buffer protocol as well (through the ndarray type I was previously using), and also the newer DLPack (which the Array API also supports), so it seems like the more forward-looking option in that sense.
(See https://nanobind.readthedocs.io/en/latest/ndarray.html)
As per #2132 (and #1912 #1126 #572 etc.) we're looking to support arbitrary (non-numpy) arrays, so buffer protocol is safest.
nanobind is a stripped-down (quicker to compile) fork of pybind11 and thus lacks a lot of useful features. Our codebase is never going to grow large enough that compilation time matters.
You are right that some basic checks must be done manually, e.g.

```cpp
pybind11::buffer_info info = buf.request();
if (info.format != pybind11::format_descriptor<float>::format())
    throw std::runtime_error("expected float32");
```
but tbh even I don't bother[^py-err]
[^py-err]: I removed C++ type checking in AMYPAD/NumCu@0a6da93 because it's much neater to do the checks & pre-alloc output in Python
I certainly don't mind doing it either way (since, well, I already did both 😊), but if the main reason is the buffer protocol, then the nanobind code I had should already have covered that, while also supporting the DLPack protocol, a more modern counterpart to the buffer protocol (i.e. it supports GPU-backed arrays). In particular, pybind11 doesn't support DLPack, so this is a strict advantage of nanobind (and a significant one if you ever want to add GPU support).
Certainly nanobind is less featureful in general, so if there are other features that matter a lot, then pybind11 makes more sense. It just sounded like the buffer protocol was one of the important reasons, so I wanted to clarify that nanobind is better than pybind11 in that regard (and arguably with cleaner code, since a single type covers both DLPack and the buffer protocol).
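For illustration, the single nanobind signature covering both protocols (hypothetical function):

```cpp
#include <nanobind/ndarray.h>

namespace nb = nanobind;

// Casts without copying from any object implementing DLPack (torch, cupy,
// jax, ...) or the buffer protocol (numpy, memoryview, ...).
float first_element(nb::ndarray<const float, nb::ndim<1>, nb::device::cpu> a) {
    return a.data()[0];
}
```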
I'm uncertain; have you tried casting a custom (non-numpy/cupy etc.) object (such as the `pybind11::class_<Matrix>` example) to a `nanobind::ndarray`? Last time I checked, it didn't work.
Just tested in https://github.com/AMYPAD/NumCu/pull/9 and it looks like nanobind works but needs:
- `python>=3.8`
- no `-Wpedantic`
Not sure what's up with the `-Wpedantic`; that seems a bit strange.
The >=3.8 requirement is definitely not a big deal for this project, given that it already requires 3.10 (and 3.8 is already out of security support anyway).
I've been trying to remove the requirement to have the dependency list duplicated in recipe/meta.yaml, but I can't get it to work with `pip install .`.
However, I am able to get it to work with `uv pip install .`.
I have no idea why plain `pip install .` doesn't work for this, but I suppose there's no harm in leaving that to a separate PR (or just ignoring the problem until the recipe folder is gone).
@casperdcl How are you getting IPP through conda in #2145? The Intel conda repo doesn't seem to be declared in CI over there. Is it mirrored through the TomographicImaging conda repo?
I'm asking just to update IPP so the FindIPP.cmake file can be removed (I think there was some reason I needed to do that, but I forgot what it was).
I updated the code to use the `uv pip install .` method, getting the dependencies from pyproject.toml, to get the tests passing.
I will go ahead and mark this as ready for review (though the above is free to change).
I went back to nanobind for now, but I still have the pybind11 commit and can add that easily if preferred. Additionally, I pinned numpy to <2.0 since the tests did not pass otherwise (the binary ABI differs between the versions).
Rebased
Not sure why the conda test failed. I'll look at it either once I have more time or when this is closer to getting merged.
Rebased on master since #2145 is merged!
@casperdcl Do you think it's better to move the folder src/Core to src/cil (which is what this PR does), or to rename the compiled library to Core? scikit-build-core requires the last component of the path to match the library name.
Actually, there's one part of this I think I can break off into a separate PR: updating IPP and removing the manual IPP discovery workaround. Doing that would take more time than I have at the moment, though.
test_download_data_empty, test_download_data_input_n, and test_download_data_input_y are all failing due to:

```
AttributeError: <module 'zenodo_get' from '/usr/share/miniconda/envs/test/conda-bld/cil_1749565646173/_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_plac/lib/python3.12/site-packages/zenodo_get/__init__.py'> does not have the attribute 'zenodo_get'
```
I don't have the ability to investigate this at the moment; is there any obvious reason why that attribute doesn't exist? If not, I'll figure it out later. Might be due to conda and the uv pip install conflicting? (I'll try `--no-deps`.)
Ok that was clearly it. Feel free to review!
(I can try removing uv but I'll leave it for now so that a review can be done).
As a note, I believe this statically links libraries, so it's related to https://github.com/TomographicImaging/CIL/issues/2154.
Also, as an alternative to renaming the folder, we could rename the library to Core, but I feel that name is too unclear.
@casperdcl should I be keeping this up to date with the main branch, or will it be a while before this gets looked at more closely?