dolfinx icon indicating copy to clipboard operation
dolfinx copied to clipboard

Replace `nb::any` with -1 and add argument to `nb::ndarray` in preparation of next nanobind release

Open francesco-ballarin opened this issue 1 year ago • 5 comments

  • The first commit addresses https://github.com/wjakob/nanobind/commit/22185f9e40c6938cb87690a1de79f82437f4e954#diff-b7f8a8784d5eff6e6b1b2c2529b831cd86132c49699fe48e0d23c17ad4e2d7de, which is a backward incompatible change in nanobind (nb::any changed meaning)
  • The second commit adresses https://github.com/wjakob/nanobind/commit/937a1df52428f04a0a240cc5827fb3c6769cf5c2 (a default argument was dropped in the nb::ndarray constructor)

This is not meant to be merged yet, because it will need to wait for a new nanobind release that includes that commit. Furthermore, it will need to be preceded by :

  1. a commit/PR that bumps the minimum required nanobind version in pyproject.toml because, as shown by the failure https://github.com/FEniCS/dolfinx/actions/runs/8308122378/job/22738016168 (nanobind current release, dolfinx from this PR) and announced by the wording in the upstream commit, the change is backward incompatible
  2. a commit/PR that bumps the nanobind version in docker images.
  3. dropping workflow changes.

Requires https://github.com/FEniCS/basix/pull/799

francesco-ballarin avatar Mar 16 '24 13:03 francesco-ballarin

Is this merged in nanobind release yet? If not I would propose doing a backport at a later date.

jhale avatar Apr 12 '24 10:04 jhale

It's not in any nanobind release yet, we will have to backport it after 0.8.0.

francesco-ballarin avatar Apr 12 '24 13:04 francesco-ballarin

The fourth commit, to be dropped before merging this PR, upgrades nanobind from source. nanobind installation happens before basix is installed, to ensure that basix and dolfinx wrappers use a matching nanobind version.

Will need to figure out at a later stage what is going on with the Red Hat build: those are the same errors that I used to get when there was a mismatch between basix and dolfinx wrappers, for instance basix wrappers were compiled with nanobind release version and dolfinx wrappers were compiled with nanobind main.

francesco-ballarin avatar Apr 13 '24 07:04 francesco-ballarin

For redhat Basix will be built with build isolation (pip will download nanobind), dolfinx without (system nanobind).

jhale avatar Apr 13 '24 11:04 jhale

I think all of our dolfinx basix builds should be without isolation - upstream Basix should do isolated build testing.

jhale avatar Apr 13 '24 11:04 jhale

@garth-wells do revert aab5ba6 altogether now that nanobind 2.0 is out.

francesco-ballarin avatar May 23 '24 18:05 francesco-ballarin

I think all of our dolfinx basix builds should be without isolation - upstream Basix should do isolated build testing.

Can we backport this to the latest release? Currently I would have to pin nanobind based on the dolfinx tag I want to test with. Context: I have a Github action that installs DOLFINx from source for the nightly and current images. As we have not pinned nanobind on v0.8.x, I get into build issues with the current workflow: https://github.com/jorgensd/actions/blob/dokken/remove-pybind/install-dolfinx/action.yaml with the CI: https://github.com/jorgensd/actions/actions/runs/9264998235/job/25486173015?pr=8 and a part of the traceback:

  *** Building project with Ninja...
  [1/24] Building CXX object CMakeFiles/cpp.dir/dolfinx/wrappers/dolfinx.cpp.o
  [2/24] Building CXX object CMakeFiles/cpp.dir/dolfinx/wrappers/graph.cpp.o
  FAILED: CMakeFiles/cpp.dir/dolfinx/wrappers/graph.cpp.o
  /usr/bin/c++ -DADIOS2_USE_MPI -DBOOST_TIMER_DYN_LINK -DBOOST_TIMER_NO_LIB -DDOLFINX_VERSION=\"0.8.0\" -DHAS_ADIOS2 -DHAS_PETSC -DHAS_PTSCOTCH -DHAS_SLEPC -Dcpp_EXPORTS -I/usr/local/lib/python3.12/dist-packages/petsc4py/include -I/usr/local/lib/python3.12/dist-packages/mpi4py/include -I/usr/include/python3.12 -I/usr/local/lib/python3.12/dist-packages/nanobind/include -isystem /usr/local/lib/python3.12/dist-packages/ffcx/codegeneration -isystem /usr/local/petsc/linux-gnu-complex64-32/include -isystem /usr/local/petsc/include -isystem /usr/local/slepc/linux-gnu-complex64-32/include -isystem /usr/local/slepc/include -O3 -DNDEBUG -std=c++20 -fPIC -fvisibility=hidden -fno-stack-protector -ffunction-sections -fdata-sections -MD -MT CMakeFiles/cpp.dir/dolfinx/wrappers/graph.cpp.o -MF CMakeFiles/cpp.dir/dolfinx/wrappers/graph.cpp.o.d -o CMakeFiles/cpp.dir/dolfinx/wrappers/graph.cpp.o -c /__w/actions/actions/src/dolfinx/python/dolfinx/wrappers/graph.cpp
  /__w/actions/actions/src/dolfinx/python/dolfinx/wrappers/graph.cpp: In instantiation of ‘void dolfinx_wrappers::declare_adjacency_list(nanobind::module_&, std::string) [with T = int; std::string = std::__cxx11::basic_string<char>]’:
  /__w/actions/actions/src/dolfinx/python/dolfinx/wrappers/graph.cpp:115:39:   required from here
  /__w/actions/actions/src/dolfinx/python/dolfinx/wrappers/graph.cpp:85:24: error: no matching function for call to ‘nanobind::ndarray<const int, nanobind::numpy>::ndarray(std::span<const int>::pointer, <brace-enclosed initializer list>)’
     85 |             return nb::ndarray<const T, nb::numpy>(link.data(), {link.size()});
        |                        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

jorgensd avatar May 28 '24 07:05 jorgensd

The simplest course of action would be to backport a pin of nanobind.

garth-wells avatar May 28 '24 08:05 garth-wells

Hi.

this change: https://github.com/FEniCS/dolfinx/pull/3105/files#diff-1a45bd938160f7761769e7b6fbc819d1cc7f84ce37f1a6a23ef83cc4e214abc8

Breaks the compatibility with the version v0.8.0, in which the python binding can't be installed from source with the nanobind v2.0.0.

efirvida avatar May 29 '24 15:05 efirvida

@efirvida That's the other way around. This change guarantees that dolfinx main is installable with nanobind 2.0.

You'll have to downgrade your nanobind version if you are trying to install v0.8.0.

francesco-ballarin avatar May 29 '24 15:05 francesco-ballarin

@francesco-ballarin thanks for you answer, what I mean is that the released package of the v0.8.0 have this issue, so as it is a release I think this should be fixed on the package. without affecting the main branch.

In my case for example I'm using it on a HPC cluste with easybuild package manager. When I try to use the release version it was imposible until I found this thread, and I use an older nanobind. was a quick fix but it take a while until I discovered.

if it's helpful for the project I can share the easy configs.

efirvida avatar May 31 '24 16:05 efirvida

Indeed, this could be backported, or the alternative way discussed in https://github.com/FEniCS/dolfinx/pull/3105#issuecomment-2134637477 could be followed.

francesco-ballarin avatar May 31 '24 16:05 francesco-ballarin