
ABI: Fortran binding on top of core C ABI

Open hzhou opened this issue 1 year ago • 7 comments

Pull Request Description

This is an experimental draft and (very) incomplete.

Author Checklist

  • [ ] Provide Description Particularly focus on why, not what. Reference background, issues, test failures, xfail entries, etc.
  • [ ] Commits Follow Good Practice Commits are self-contained and do not do two things at once. Commit message is of the form: module: short description Commit message explains what's in the commit.
  • [ ] Passes All Tests Whitespace checker. Warnings test. Additional tests via comments.
  • [ ] Contribution Agreement For non-Argonne authors, check the contribution agreement. If necessary, request an explicit comment from your company's PR approval manager.

hzhou avatar Mar 29 '24 16:03 hzhou

@dalcinl This is what I am experimenting with.

hzhou avatar Mar 29 '24 16:03 hzhou

Current ABI draft -

  • All Fortran inter-op C APIs are included in the MPI C ABI, e.g. MPI_INTEGER, MPI_Comm_f2c, MPI_F_STATUS_IGNORE
  • Jeff intended an MPI ABI implementation that does not depend on Fortran
  • Thus the ABI standard needs to fix all relevant Fortran ABIs
  • The key requirement: Fortran INTEGER is equivalent to C int; it also needs to fix all Fortran datatypes with C equivalents
  • The MPI C ABI needs to implement MPI_Comm_f2c etc. with a dictionary mechanism
  • The MPI C ABI does not do any Fortran compiler checks; it relies on assumptions only

My rejection:

  • MPI should not specify that INTEGER is C int if Fortran doesn't say so
  • The MPI Fortran binding should implement proper MPI Fortran inter-op, via compiler feature checks and value conversions where necessary
  • A C-only MPI Fortran inter-op is useless. For example, to inter-op with a Fortran INTEGER, just use MPI_INT

My proposal:

  • Do not change any MPI API
  • MPI C ABI does not contain any Fortran parts
    • mpi_abi.h only contains non-Fortran APIs
    • mpi.h includes mpi_abi.h + mpi_fort.h
    • libmpi_abi.so does not contain any Fortran-related symbols
    • libmpifort.so contains all Fortran-related symbols, including the Fortran interface and the inter-op C APIs
  • Fortran-related APIs are provided in the header via an opt-in option and at link time via libmpifort.so
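The proposed header split could look roughly like this (a sketch only; the opt-in macro name MPI_WANT_FORTRAN is made up here to illustrate the idea, not an actual MPICH macro):

```c
/* mpi.h -- sketch of the proposed layout (names hypothetical) */
#include "mpi_abi.h"        /* C-only ABI: MPI_Comm, MPI_Send, ... */
#ifdef MPI_WANT_FORTRAN     /* opt-in, e.g. defined by `mpicc -fort` */
#include "mpi_fort.h"       /* MPI_INTEGER, MPI_Comm_f2c, MPI_F_STATUS_IGNORE, ... */
#endif
```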

Examples

case 1: t.c is a C-only MPI code

$ mpicc -mpi-abi -o t t.c
# t will link with `libmpi_abi.so`

case 2: t.c is a C code that uses MPI_INTEGER (or MPI_Comm_f2c)

$ mpicc -mpi-abi -o t t.c
ERROR: name undefined

$ mpicc -mpi-abi -fort -o t t.c
# SUCCESS. t will link with `libmpifort.so libmpi_abi.so`

case 3: t.f is a FORTRAN code

$ mpifort -mpi-abi -o t t.f
# SUCCESS. t will link with `libmpifort.so libmpi_abi.so`

hzhou avatar Mar 29 '24 16:03 hzhou

IIUC, your proposal makes the API of the ABI a subset of the full MPI API. Now code that uses the full API will not be able to swap implementations at runtime, just because you cannot constrain MPI_Fint to be int, which is the case that the vast majority of MPI users face in practice, all in the name of purity or because some guys may want to compile code with -i8 to have 64-bit INTEGER. I really hope we can find an alternative, like allowing ABIs that use a different MPI_Fint size.

If Fortran interoperability cannot be achieved exclusively through the C MPI ABI library, then mpi4py will simply stop supporting any kind of Fortran interoperability, and I'll blame the Forum decision for such a short-sighted view of the software ecosystem. The C programming language should be considered the lingua franca for language interoperability: Python/Julia/Rust/Java all have established and robust mechanisms to interoperate well with C, and Fortran interoperates with C via its own mechanism, so it is obvious that C can be the easy bridge between languages.

dalcinl avatar Mar 29 '24 17:03 dalcinl

If Fortran interoperability cannot be achieved exclusively through the C MPI ABI library, then mpi4py will simply stop supporting any kind of Fortran interoperability,

Seems irrational. The libmpi_abi.so portion will be ABI swappable; libmpifort.so will be binding-implementation dependent. Distributions can easily swap one or the other respectively. The latter is not ABI swappable and will cause issues, just as today. You can circumvent it as today via dlopen() -- but only if users need Fortran inter-op. I think there are far fewer reasons to swap the Fortran binding library, so it won't be too big an issue for long.

hzhou avatar Mar 29 '24 19:03 hzhou

Notes so far:

  • Need a set of handle _{to,from}_int conversion functions. It is easier to implement these in the core because the core has access to the internal object fields and controls the lifetime of the objects.
  • User callbacks need a context field for language proxies to track the actual user functions to call. The attribute copy/delete callbacks and the Grequest callbacks have the extra_state parameter that bindings can use, but the user Op function and the error handler function are missing it.
  • Need a corresponding destruction callback to let bindings or users free the context. Otherwise it leaks at the end of the application, which Valgrind etc. will complain about.
  • It would be nice to have a mechanism to let bindings register init/finalize hooks. I think these are essentially the same technical issues that the QMPI proposal needs to deal with.

hzhou avatar Apr 03 '24 22:04 hzhou

  • User callbacks need a context field for language proxies

This rationale itches me a bit... This stuff is not only useful for language bindings. Third-party quality libraries making advanced use of MPI can definitely take advantage of all these features to handle the state and lifetime of things in a proper and easy way.

  • It would be nice to have a mechanism to let bindings register init/finalize hooks.

You should clarify whether these hooks are collective or not on invocation. You most likely want collective. Finalize hooks are somewhat supported via silly attributes set in MPI_COMM_SELF, but my understanding is that these are not collective.
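The MPI_COMM_SELF attribute idiom referred to above is part of standard MPI-2: MPI_Finalize deletes the attributes on MPI_COMM_SELF before doing anything else, so a delete callback doubles as a finalize hook. It is invoked locally, not collectively, which is exactly the limitation being pointed out. A sketch (the helper name register_finalize_hook is made up; this requires an MPI implementation to compile):

```c
#include <mpi.h>

/* Delete callback, invoked at the start of MPI_Finalize. */
static int on_finalize(MPI_Comm comm, int keyval, void *attr, void *extra_state) {
    /* binding/library cleanup goes here */
    return MPI_SUCCESS;
}

void register_finalize_hook(void) {  /* hypothetical helper name */
    int keyval;
    MPI_Comm_create_keyval(MPI_COMM_NULL_COPY_FN, on_finalize, &keyval, NULL);
    MPI_Comm_set_attr(MPI_COMM_SELF, keyval, NULL);
}
```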

dalcinl avatar Apr 04 '24 07:04 dalcinl

Seems irrational.

Yes, it may seem irrational, but only until the day you have to deal with the many issues you may face by introducing binary dependencies (either statically embedded in libraries at link time or dynamically linked via dlopen()).

Look, I'm really serious about avoiding hard dependencies. For example, 99% of mpi4py users also use NumPy. Making mpi4py depend on NumPy would allow me to simplify a lot of stuff and clean up my code. NumPy has well-established API/ABI mechanisms that minimize the dependency problem when upgrading versions, so that's another point in its favor. Yet I refuse to make mpi4py explicitly depend on NumPy. Hard dependencies eventually produce pain points downstream.

Fortran support is baggage MPI still has to deal with. In the mid to long term I'm totally in favor of moving away from old bad practice. But that should not happen at the expense of pain and mess pushed onto users. The current ABI proposal may already generate controversy as it is. What you are trying to do with your Fortran changes adds a lot of mess on top, and IMHO you are jeopardizing the current effort. It is not about the "what", it is about the "when".

dalcinl avatar Apr 04 '24 07:04 dalcinl