Build C++ part on Windows

Open jhale opened this issue 1 year ago • 4 comments

This builds the C++ part of DOLFINx on Windows against vcpkg using an overlay package for Intel MPI (MSMPI does not support neighbourhood collective operations).

Requires https://github.com/FEniCS/ffcx/pull/694

Update: C++ library + tests build, C++ tests pass (thanks to @chrisrichardson for local debugging), Python interface builds (with loguru calls removed), all Python tests run in serial + parallel, all platform CIs now run.

jhale avatar May 05 '24 14:05 jhale

@minrk What's the status of Intel MPI in Conda? It seems to be the only MPI-3 compliant build of MPI on Windows. MSMPI doesn't have neighbourhood operations, which we use extensively.

jhale avatar May 06 '24 07:05 jhale

The support in MSMPI seems strange, as you can create distributed communicators, but not use alltoall on them:

  • https://learn.microsoft.com/en-us/message-passing-interface/mpi-dist-graph-create-function
  • https://learn.microsoft.com/en-us/message-passing-interface/mpi-collective-functions

jorgensd avatar May 06 '24 07:05 jorgensd

I've never interacted with it personally, so I don't know the details or pitfalls, but mpi4py at least is built with Intel MPI on Windows and Linux. So it should be an option.

minrk avatar May 06 '24 13:05 minrk

https://github.com/microsoft/Microsoft-MPI - This clears it up. MSMPI is based on an old MPICH and is 2.2 compliant, and implements some features of 3.1 (not the critical neighbourhood alltoall we need).

It seems that there is no open-source MPI for Windows that complies with modern standards. We'll have to stick with Intel MPI for the proposed Conda builds; I don't think there is any appetite to support MPI 2.2 and MPI 3.1 simultaneously.
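For readers unfamiliar with the operation being discussed, here is a minimal sketch of what a neighbourhood all-to-all does. It is a pure-Python simulation, not MPI code and not DOLFINx code: the function name `neighbor_alltoall` and the ring example are made up for illustration. In real MPI-3 the communicator would come from something like `MPI_Dist_graph_create_adjacent` and the exchange would be `MPI_Neighbor_alltoall`; the point is only that each rank exchanges data with its declared neighbours rather than with every rank.

```python
# Pure-Python simulation of MPI-3 neighbourhood all-to-all semantics.
# No MPI library is called; ranks are just list indices.

def neighbor_alltoall(sources, destinations, sendbufs):
    """Return recvbufs, where recvbufs[r][i] is the item sent to rank r
    by its i-th source, sources[r][i].

    destinations[r][j] is the j-th rank that r sends to, and
    sendbufs[r][j] is the item r sends to that rank.
    Assumes at most one edge between any ordered pair of ranks.
    """
    recvbufs = []
    for rank, srcs in enumerate(sources):
        received = []
        for src in srcs:
            # Position of `rank` in the sender's destination list.
            slot = destinations[src].index(rank)
            received.append(sendbufs[src][slot])
        recvbufs.append(received)
    return recvbufs


# Four ranks on a ring: each rank talks only to its left and right neighbours.
size = 4
neighbours = [[(r - 1) % size, (r + 1) % size] for r in range(size)]
sendbufs = [[f"{r}->{d}" for d in neighbours[r]] for r in range(size)]
recvbufs = neighbor_alltoall(neighbours, neighbours, sendbufs)
print(recvbufs[0])  # ['3->0', '1->0']
```

Rank 0 never exchanges anything with rank 2 here, which is the property that makes the neighbourhood collectives attractive for mesh-based codes where each process has only a few neighbours.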

jhale avatar May 06 '24 15:05 jhale

@minrk This is probably now ready to test on Windows. I took a quick look and it seems that two important dependencies (HDF5 + Parmetis) may be missing Intel MPI builds on Windows, which would be needed before this can really be tested.

jhale avatar May 22 '24 15:05 jhale

With ffcx working on conda, I'm looking at dolfinx now. An impediment to getting this in conda-forge is that neither parmetis nor hdf5 + mpi is currently packaged on Windows. ptscotch is on Windows, but only built with msmpi, and I seem to recall you needed intel-mpi rather than msmpi.

Current conda-forge dependency status on Windows:

  • adios2 - no mpi
  • kahip - no
  • hdf5 - no mpi
  • parmetis - no
  • ptscotch - msmpi
  • petsc/slepc - lol, no

I can work on some of this, but it would help to know what to prioritize. Looking at this PR, it seems parmetis + hdf5 on impi is sufficient. Is that right? hdf5 with mpi is definitely required?

minrk avatar May 22 '24 17:05 minrk

Thanks for the summary. HDF5 and a partitioner are mandatory: Parmetis or, if Parmetis proves difficult, KaHiP instead. Don't worry about the others (ADIOS2 would be the next thing to include, but no worries for a first release).

Edit: Parmetis builds on Windows + Intel MPI within vcpkg.

jhale avatar May 22 '24 17:05 jhale

Yeah, I've been looking at the vcpkg parmetis recipe. Unfortunately, it requires quite a bit of patching to build on Windows, and is based on an unreleased version of parmetis, not 4.0.3, which was the last release quite a long time ago. But perhaps we could port those patches. I don't know what to do about being based on an unreleased version, though.

KaHIP also doesn't support Windows, and scotch technically doesn't either, but someone is maintaining patches that work for msmpi, at least.

So there are really no parallel partitioners that I know of that support Windows, but since there is already ptscotch with msmpi, I figured that might be the easiest one: https://github.com/conda-forge/scotch-feedstock/pull/83

minrk avatar May 22 '24 18:05 minrk

Ptscotch would also be OK. Note that we don't need Python KaHiP, just the C++ part.

jhale avatar May 22 '24 18:05 jhale

What about METIS and Windows? We're not targeting HPC with Windows, so we could use METIS in DOLFINx if METIS supports Windows.

garth-wells avatar May 23 '24 13:05 garth-wells

metis is indeed packaged on Windows.

It looks like it might not be super hard. ptscotch builds with impi are done (https://github.com/conda-forge/scotch-feedstock/pull/83), and hdf5 may be working soon: https://github.com/conda-forge/hdf5-feedstock/pull/218

minrk avatar May 23 '24 13:05 minrk

Well, I'm impressed and a little surprised! Great work @minrk

jhale avatar May 23 '24 15:05 jhale

conda-forge update: hdf5 and ptscotch now have impi builds, so I’ve started looking at dolfinx.

My first snag was that I couldn't get FindSCOTCH to work on Windows (the test wasn't linking MPI correctly, despite MPI being found just fine). But since scotch 7 itself uses cmake, switching from find_package(SCOTCH) to find_package(SCOTCH CONFIG) worked fine. I'm not sufficiently confident in CMake to make a suggestion for how best to handle the fact that regular find_package(SCOTCH) without custom FindSCOTCH may work and might be preferred, but only with SCOTCH >= 7. It seems reasonable to require cmake scotch on Windows and not try to figure out FindSCOTCH.

Right now, compilation of libdolfinx completes, but linking hdf5 fails with some missing symbols. That is weird, because hdf5.lib is definitely being linked and dumpbin /exports hdf5.lib shows that all the missing symbols are indeed defined.

minrk avatar May 30 '24 12:05 minrk

cpp builds, now figuring out why python can't find MPI_C with the same cmake args that work for cpp.

minrk avatar May 30 '24 20:05 minrk

I wanted to point out the CMake option that disables using Python to hint the location of Basix, which is probably useful for making your libdolfinx build Python-free.

We can try removing FindSCOTCH; config mode is always preferred.

jhale avatar May 31 '24 05:05 jhale

We could try config mode first, then fall back to FindSCOTCH. PETSc doesn't build PTSCOTCH using CMake, so in that case no PTSCOTCH CMake config file is installed.

garth-wells avatar May 31 '24 07:05 garth-wells

So something like putting find_package(SCOTCH CONFIG QUIET) at the beginning of FindSCOTCH.cmake, and proceeding with the rest only if(NOT SCOTCH_FOUND)?

minrk avatar May 31 '24 13:05 minrk

Put it in the top-level DOLFINx CMakeLists.txt. I can have a stab shortly, but not sure I'll be able to test it.

garth-wells avatar May 31 '24 13:05 garth-wells

C++ and Python builds are working in conda. Folks should be able to try them out with:

conda install -c minrk/label/fenics-windows -c conda-forge fenics-dolfinx=0.9.0.dev

I've no idea how it's going to do with finding compilers on a real user system as opposed to the conda-forge CI, but I would be a bit surprised if it works by default without being launched in a session with vcvarsall.bat or similar.
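A quick way for a tester to see whether their session looks like an MSVC developer environment is sketched below. This is only a heuristic and an assumption on my part, not something the conda packages do: it checks that `cl` is discoverable on `PATH` and that the `INCLUDE` and `LIB` variables (which vcvarsall.bat normally sets) are non-empty. The function name `msvc_environment_ready` is hypothetical.

```python
import os
import shutil


def msvc_environment_ready(env=None):
    """Heuristic check that the MSVC compiler environment is active.

    `env` is a mapping of environment variables; defaults to os.environ.
    Returns True only if `cl` is found on PATH and INCLUDE and LIB are set.
    """
    env = os.environ if env is None else env
    has_cl = shutil.which("cl", path=env.get("PATH", "")) is not None
    has_include = bool(env.get("INCLUDE"))
    has_lib = bool(env.get("LIB"))
    return has_cl and has_include and has_lib


if __name__ == "__main__":
    if msvc_environment_ready():
        print("MSVC environment looks active")
    else:
        print("cl not found or INCLUDE/LIB unset; try a developer prompt")
```

A negative result does not prove that compilation will fail, since the build tooling may locate the compiler by other means.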

All C++ tests are passing, and almost all Python tests are passing. All failing python tests appear to be attributable to missing skips for the absence of adios2 and petsc.

I used this branch, which is just this PR plus #3241 and a commit to remove the os.add_dll_directory call, which prevents import when the hardcoded directories do not exist.
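To illustrate the failure mode: `os.add_dll_directory` exists only on Windows and raises an error if the directory is missing, so an unconditional call with a hardcoded path breaks import on machines that lack that path. The sketch below shows one defensive pattern; it is not the fix adopted in this PR or in Basix, and the helper name `add_dll_directories` is hypothetical.

```python
import os


def add_dll_directories(candidates):
    """Register only those DLL search directories that actually exist.

    Returns the list of directories that were registered. On platforms
    without os.add_dll_directory (anything but Windows) nothing is done.
    """
    added = []
    if not hasattr(os, "add_dll_directory"):
        return added
    for path in candidates:
        if os.path.isdir(path):
            # add_dll_directory requires an absolute path.
            os.add_dll_directory(os.path.abspath(path))
            added.append(path)
    return added


# A directory that does not exist is skipped instead of raising.
print(add_dll_directories(["/definitely/not/a/real/dir"]))  # []
```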

FWIW, tests are run with ufl 2024.1 because I haven't packaged 2024.2 dev, and everything passes. That suggests to me that the lower bound on ufl has been bumped prematurely.

minrk avatar Jun 01 '24 11:06 minrk

Forgot to mention, this warning came up:

libffcx_forms_1b9958e91fd54e9c55f60346c28378fb63cad26e.c(722): warning C4305: 'initializing': truncation from 'double' to 'const float'

104,127 times when running the tests. I'm not sure if that's something that should be fixed in ffcx.
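Some background on that warning, as a sketch rather than a diagnosis of the ffcx output: in C an unsuffixed floating-point literal such as `0.1` has type double, so using it to initialise a `const float` narrows the value, which is what MSVC reports as C4305; a literal written with an `f` suffix is already a float. The Python snippet below only demonstrates the rounding involved, using the standard `struct` module to round-trip a value through 32-bit storage. The helper name `to_float32` is made up for illustration.

```python
import struct


def to_float32(x):
    """Round a Python float (a C double) to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]


# 0.1 is not exactly representable, and the float32 and float64
# approximations differ, so narrowing changes the stored value.
print(to_float32(0.1) == 0.1)  # False

# 0.5 is exactly representable in both widths, so nothing is lost.
print(to_float32(0.5) == 0.5)  # True
```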

Compiler warnings summary from running the tests with pytest -vs:

# grep -E -o 'warning C.+' dolfinx-win64.txt | sort | uniq -c
1665 warning C4113: 'ufcx_tabulate_tensor_float32 (__cdecl **)' differs in parameter lists from 'void (__cdecl **)(float *,const float *,const float *,const float *,const int *,const uint8_t *)'
1665 warning C4113: 'ufcx_tabulate_tensor_float64 (__cdecl **)' differs in parameter lists from 'void (__cdecl **)(double *,const double *,const double *,const double *,const int *,const uint8_t *)'
 165 warning C4244: 'initializing': conversion from 'double' to 'float', possible loss of data
104127 warning C4305: 'initializing': truncation from 'double' to 'const float'
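For anyone without grep, sort and uniq to hand (for example in a plain Windows shell), the same tally can be sketched in Python. The regular expression mirrors the `warning C.+` pattern used above; the function names and the extra per-code grouping are my own additions for illustration.

```python
import re
from collections import Counter

WARNING_RE = re.compile(r"warning C.+")       # same pattern as the grep above
CODE_RE = re.compile(r"warning (C\d+)")       # just the warning code


def count_warnings(lines):
    """Count identical warning messages, like `grep -o | sort | uniq -c`."""
    counts = Counter()
    for line in lines:
        match = WARNING_RE.search(line)
        if match:
            counts[match.group(0)] += 1
    return counts


def count_by_code(lines):
    """Count warnings grouped by code only (e.g. C4305), ignoring the text."""
    counts = Counter()
    for line in lines:
        match = CODE_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    # dolfinx-win64.txt is the log file name used in the command above.
    with open("dolfinx-win64.txt", encoding="utf-8", errors="replace") as log:
        for message, n in sorted(count_warnings(log).items()):
            print(f"{n:7d} {message}")
```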

minrk avatar Jun 01 '24 11:06 minrk

Thanks for the detailed feedback! We will look at the points you've raised, tidy this up and come back for another conda build.

In terms of testing on a user system, how can we get your builds onto a Windows desktop with conda set up? We have a Windows desktop for testing.

jhale avatar Jun 01 '24 15:06 jhale

The command I gave above should work for anyone with Conda on Windows:

conda install -c minrk/label/fenics-windows -c conda-forge fenics-dolfinx=0.9.0.dev

minrk avatar Jun 01 '24 17:06 minrk

Once this runs green it can be re-built on conda-forge. The basix branch should now be unnecessary; I removed the DLL stuff in favour of a proper solution.

jhale avatar Jun 03 '24 14:06 jhale