
RFC: New compiler wrappers for handling C++ & Fortran modules

Open h-vetinari opened this issue 9 months ago • 14 comments

It's not every day that we need to modify something in our compiler setup, so I wanted to give this topic some more visibility.

Situation

Both Fortran and C++ have a concept of so-called modules, which address (among other things) a desire to avoid recompiling the same code over and over, without going as far as putting the code into an explicit library.

Fortran introduced these in F90 already, whereas C++ only did so in C++20 (and famously, the transition is slow-going, as support by compilers and build orchestrators is still work-in-progress).

One key aspect of modules is that -- in contrast to a full-fledged library -- the modules can only be consumed by the same compiler that produced them. This constraint is missing from our metadata.

In https://github.com/conda-forge/flang-activation-feedstock/issues/14, I came up with the following chain of considerations (for Fortran, but the same applies to C++):

  1. Not every fortran project produces modules
  2. Fortran modules are not used (or usable) at runtime, only at buildtime of a dependent project
  3. Fortran modules depend on using the same compiler for the dependent project as the one that produced the module.
  4. As several recipes in conda-forge use modules, we should reflect this in our metadata, rather than let them fall into this trap.
  5. To avoid the constraint at runtime, it makes sense to put fortran modules for foo into a foo-devel package.
  6. We should attach constraints to such -devel packages to enforce the right compiler.
  7. Recipe authors should opt into such a constraint (because we have no way to detect it automatically).
  8. This constraint should be on activated compilers, of which we generally have only one per language in 99+% of recipes.

This should hopefully be uncontroversial.

How to fix this

The obvious first choice would be something that produces a run-export for packages being built with a certain compiler, say something like

outputs:
  - name: _produces_fortran_modules
    build:
      string: {{ compiler_fortran }}_{{ PKG_BUILDNUM }}
      run_exports:
        - _produces_fortran_modules * {{ compiler_fortran }}*
    requirements:
      run_constrained:
        - gfortran_{{ target_platform }} <0.0a0     # [compiler_fortran != "gfortran"]
        - flang_{{ target_platform }} <0.0a0        # [compiler_fortran != "flang"]
        - ifx_{{ target_platform }} <0.0a0          # [compiler_fortran != "ifx"]

(for a worked out example see https://github.com/conda-forge/staged-recipes/pull/30119)
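
To make the matching concrete: the run-export above selects builds of `_produces_fortran_modules` through a glob on the build string. Below is a minimal Python sketch of that idea, using `fnmatch` as a stand-in for the solver's build-string matching. The helper name `build_matches` and the build strings are illustrative assumptions, not conda's API.

```python
from fnmatch import fnmatchcase


def build_matches(build_string: str, build_glob: str) -> bool:
    """Return True if a build string satisfies a glob such as 'gfortran*'."""
    return fnmatchcase(build_string, build_glob)


# Build strings shaped like `{{ compiler_fortran }}_{{ PKG_BUILDNUM }}` (illustrative values).
available_builds = ["gfortran_0", "flang_0", "ifx_0"]

# A package built with gfortran would run-export `_produces_fortran_modules * gfortran*`,
# so only the gfortran build of the marker package remains installable next to it.
selected = [b for b in available_builds if build_matches(b, "gfortran*")]
print(selected)
```

Only the build whose string starts with the producing compiler's name survives, which is what ties the module-carrying output to that compiler.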

The recipe author of the foo-devel package would have to add this to the host-dependencies of that output, and thus the output would gain the respective constraint. The problem, however, is that there's no good way to make this constraint of foo-devel (in the host: environment, when building a dependent package bar) conflict with the Fortran compiler (in the build: environment).

We could add a strong run-export to the compiler, but that's going too far -- it would also affect the runtime requirements of bar, which have nothing to do with fortran modules anymore, and so shouldn't constrain the compiler. This is the missing host_exports problem.
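
The trade-off can be illustrated with a toy model of run-export propagation. This is a deliberately simplified sketch, assuming only two rules (strong run-exports of build: packages land in both host: and run:, weak run-exports of host: packages land in run:). The function `apply_run_exports` and the export tables are hypothetical and do not reflect conda-build's actual implementation.

```python
def apply_run_exports(build_pkgs, host_pkgs, exports):
    """Toy model: return the (host, run) spec lists after applying run-exports."""
    host = list(host_pkgs)
    run = []
    # Strong run-exports of build-time packages are added to host AND run.
    for pkg in build_pkgs:
        for spec in exports.get(pkg, {}).get("strong", []):
            host.append(spec)
            run.append(spec)
    # Weak run-exports of host packages are added to run only.
    for pkg in host_pkgs:
        for spec in exports.get(pkg, {}).get("weak", []):
            run.append(spec)
    return host, run


# Hypothetical export tables for building `bar` against `foo-devel`.
no_export = {}
strong_export = {"gfortran": {"strong": ["_fortran_modules * gfortran*"]}}

host_plain, run_plain = apply_run_exports(["gfortran"], ["foo-devel"], no_export)
host_strong, run_strong = apply_run_exports(["gfortran"], ["foo-devel"], strong_export)

print(host_plain, run_plain)    # nothing from the compiler reaches host
print(host_strong, run_strong)  # constraint reaches host, but also leaks into run
```

In this model, the only existing lever that puts a compiler-derived spec into host: also puts it into run:, which is exactly the side effect described above.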

So given that limitation, I went back to the start and reflected that adding some _magic_package_with_an_underscore is pretty poor UX for the recipe authors anyway, and that we could solve the problem comprehensively by creating a new compiler key such that the user would then have

outputs:
  - name: foo-devel
    requirements:
      build:
        - {{ stdlib("c") }}
        - {{ compiler("fortran_modules") }}    # different key for selecting the compiler!
      host:
        - [...]

This works around the missing host-exports problem because we now can attach a strong run-export to the compilers that we provide under the fortran_modules: key (in contrast to the general-purpose compilers which must remain usable also for non-module usage). And because recipe authors won't need to add a magic package manually anymore, we can choose a better name, i.e. drop _produces.

This is pretty much the reasoning that @isuruf and I used in https://github.com/conda-forge/flang-activation-feedstock/issues/14 to figure out this solution, but I wanted to get some feedback on this -- from the overall approach down to the bikeshed of naming these things.

Another reason why making this a full-blown compiler sounds like a good idea to me is that our C++ toolchain will have exactly the same problem to solve once C++20 module usage picks up (and it's starting to), and therefore we should have a uniform approach across languages. This is also why I prefer the _modules suffix for clarity, although Isuru suggested fortran_mod (presumably because the fortran modules actually use a .mod extension).

In more detail, I would imagine adding the following pins to the global pinning (compiler suffix TBD, but it makes sense to me that this matches whatever we add to the pinning key):

cxx_modules_compiler:
  - gxx_modules                 # [linux]
  - clangxx_modules             # [osx]
  - vc_modules                  # [win]  -- modules support in vs2019 is poor; c.f. also #2138
cxx_modules_compiler_version:
  # matching cxx_compiler_version
fortran_modules_compiler:
  - gfortran_modules            # [unix]
  - flang_modules               # [win]
fortran_modules_compiler_version:
  # matching fortran_compiler_version

The compiler activation feedstocks (here demonstrated for gfortran) would then gain an extra output along the lines of

outputs:
  [...]
  - name: gfortran_modules_{{ cross_target_platform }}
    build:
      run_exports:
        strong:
          - _fortran_modules * gfortran*
    requirements:
      run:
        - {{ pin_subpackage("gfortran_" ~ cross_target_platform, exact=True) }}

CC @conda-forge/core

PS. Haha, got a nice issue number! Hopefully by 2525, modules will be working smoothly 😆

version-dependence of modules

It's even possible for modules to depend on the compiler version, c.f. the gfortran 15 release notes:

The Fortran module *.mod format generated by GCC 15 is incompatible with the module format generated by GCC 8 - 14, but GCC 15 can for compatibility still read GCC 8 - 14 created module files.

The approach taken with _fortran_modules and <compiler>_modules_<target> is flexible enough to encode some version constraints.
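
As a rough illustration of what such a version constraint has to encode, here is a small Python sketch of the compatibility rule from the release note quoted above. It assumes that GCC 8 through 14 share one module format (as the note implies) and it only covers majors 8 to 15. The function names are hypothetical.

```python
def mod_format(gcc_major: int) -> str:
    """Module format generation written by a given gfortran major version."""
    if 8 <= gcc_major <= 14:
        return "gcc8-14"  # assumption: one shared format across GCC 8 - 14
    if gcc_major == 15:
        return "gcc15"
    raise ValueError(f"not covered by this sketch: GCC {gcc_major}")


def can_read(consumer_major: int, producer_major: int) -> bool:
    """True if the consumer's gfortran can read .mod files written by the producer's."""
    produced = mod_format(producer_major)
    if mod_format(consumer_major) == "gcc15":
        # GCC 15 writes a new format but still reads the GCC 8 - 14 one.
        return produced in {"gcc8-14", "gcc15"}
    return produced == "gcc8-14"


print(can_read(15, 14))  # newer consumer, older modules
print(can_read(14, 15))  # older consumer, newer modules
```

The asymmetry (new compilers read old modules, but not the other way around) is why a lower bound on the consuming side, rather than an exact pin, might be sufficient for this particular case.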

Maybe host_exports is necessary to solve this

(Edit: updated after some discussion below...)

Actually, the exercise with thinking through the version constraints exposed a potentially fatal flaw in my current proposal.

We always come back to:

The big problem is how the foo-devel modules package (where reasonably we'd have to attach the constraint) in host: could be made to conflict with something in build:.

And the only way to produce a conflict between build: and host: is a (strong) run-export. But that defeats the purpose, because we wanted to use regular {{ compiler("fortran") }} without a strong run-export for builds that are consuming modules. And in that case, the conflict just doesn't trigger...

So now I'm wondering if a proper solution to all this requires host_exports (which would provide the right tool, because then even the general-purpose Fortran compiler could host-export _fortran_modules * <compiler>*; i.e. forcing the right kind of modules in host: without producing any dependency change in run:).

h-vetinari avatar May 23 '25 06:05 h-vetinari

Thanks for dealing with this complex problem! Just to understand, a consequence of this is that once fortran and C++ modules start to be adopted, we would need to have some kind of compiler pinning and compiler migration for projects that use modules?

traversaro avatar May 23 '25 07:05 traversaro

once fortran and C++ modules start to be adopted, we would need to have some kind of compiler pinning and compiler migration for projects that use modules?

Good question!

The answer (at least based on the solution proposed above) is that you would only need to exchange {{ compiler("cxx") }} with {{ compiler("cxx_modules") }} in recipes that are producing modules (so no migration necessary). This imbues the resulting outputs with the constraints that ensure that when those modules get consumed by some dependent build, it's done with the right compiler.

This is an aspect that's a bit counter-intuitive. It would be more natural to say "I'm using {{ compiler("cxx_modules") }} because this project consumes modules", but at that point it's too late to add any constraints, so the only way to do it (at least that I see) is by doing this on the producing side. That's why my original name suggestion was _produces_fortran_modules, to try to make that point clear. I guess we could do something like {{ compiler("fortran_produces_modules") }}, if we want to pay for clarity with verbosity.

h-vetinari avatar May 23 '25 08:05 h-vetinari

The answer (at least based on the solution proposed above) is that you would only need to exchange {{ compiler("cxx") }} with {{ compiler("cxx_modules") }} in recipes that are producing modules (so no migration necessary). This imbues the resulting outputs with the constraints that ensure that when those modules get consumed by some dependent build, it's done with the right compiler.

Ack. But let's say that I am in a feedstock that consumes glm and fmt as modules (just two projects at the top of https://arewemodulesyet.org/ that I know we have in conda-forge), so it installs glm-devel and fmt-devel. Don't we need to make sure that the constraints they have on the compiler version are compatible, so that a host environment for the package build can be solved?

traversaro avatar May 23 '25 08:05 traversaro

don't we need to make sure that the constraints they have on the compiler version are compatible, so that a host environment for the package build can be solved?

I expect constraints on the compiler version to be rare; I haven't fully thought about that case yet, to be honest, except that I expect it to be possible in some combination of constraints. The big problem is how the foo-devel modules package (where reasonably we'd have to attach the constraint) in host: could be made to conflict with something in build:.

I think it would be possible if we match the version of _fortran_modules <ver> <compiler>* to the actual compiler version, e.g. have _fortran_modules-15.1.0-gfortran_0, _fortran_modules-20.1.6-flang_0, etc. and then add an additional _fortran_modules >=15 run-export to gfortran_modules_<target> for v15+.

h-vetinari avatar May 23 '25 08:05 h-vetinari

My bad, I saw the line - {{ pin_subpackage("gfortran_" ~ cross_target_platform, exact=True) }} and I thought the version there was the version of the compiler. If that is not the case, that is great!

traversaro avatar May 23 '25 08:05 traversaro

All the pin_subpackage was supposed to communicate is that gfortran_modules_<target> is an extremely thin wrapper around gfortran_<target>, with the only difference being the strong run-export.

h-vetinari avatar May 23 '25 08:05 h-vetinari

Actually, the exercise with thinking through the version constraints exposed a potentially fatal flaw in my current proposal.

We always come back to:

The big problem is how the foo-devel modules package (where reasonably we'd have to attach the constraint) in host: could be made to conflict with something in build:.

And the only way to produce a conflict between build: and host: is a (strong) run-export. But that defeats the purpose, because we wanted to use regular {{ compiler("fortran") }} without a strong run-export for builds that are consuming modules. And in that case, the conflict just doesn't trigger...

So now I'm wondering if a proper solution to all this requires host_exports (which would provide the right tool, because then even the general-purpose Fortran compiler could host-export _fortran_modules * <compiler>*; i.e. forcing the right kind of modules in host: without producing any dependency change in run:).

h-vetinari avatar May 23 '25 08:05 h-vetinari

Actually, the exercise with thinking through the version constraints exposed a potentially fatal flaw in my current proposal.

Aside from adding host_exports, I think the proposal is somewhat salvageable if both module-producing and module-consuming builds are using {{ compiler("fortran_modules") }}, at least with version-locked modules. Because then we could still attach _fortran_modules <version> <compiler>* as a strong run-export, and that would ensure both that the constraint is present on foo-devel, as well as that the constraint is enforced when consumed.

But then the problem is that nothing keeps the bar recipe from just using the "wrong" {{ compiler("fortran") }} (without the run-export, and thus without triggering the constraint if ever the wrong modules end up in host), so this is still not great. Sigh 😑

h-vetinari avatar May 23 '25 09:05 h-vetinari

I brought this up in the last core call https://github.com/conda-forge/conda-forge.github.io/blob/dcaaacc84ad4dec8ac5f9916a78301ef777a6381/community/minutes/2025-05-28.md?plain=1#L60-L62

@isuruf had some questions/comments about whether this would really need host_exports. Most of my considerations on this are in the OP - please let me know what aspects are unclear.

h-vetinari avatar Jun 06 '25 08:06 h-vetinari

Ping @isuruf. I think we'll need a CEP for host-exports, but before I start down that road, I want to make sure there are no objections from your side. Counting https://github.com/conda-forge/flang-activation-feedstock/issues/14, this discussion has been going on for many months already. Could you please help find a way forward?

h-vetinari avatar Jun 17 '25 01:06 h-vetinari

But that defeats the purpose, because we wanted to use regular {{ compiler("fortran") }} without a strong run-export for builds that are consuming modules. And in that case, the conflict just doesn't trigger...

Why do we need the regular compiler? We need a special one so that -devel package gets the _fortran_modules as well.

Can you document here what a recipe that produces modules and a recipe that consumes modules should look like?

isuruf avatar Jun 17 '25 01:06 isuruf

Can you document here what a recipe that produces modules and a recipe that consumes modules should look like?

Sure. The producing side should be

outputs:
  - name: foo-devel
    requirements:
      build:
        - {{ stdlib("c") }}
        # modules-compiler for imbuing foo-devel with constraint on compiler ABI through strong run-export
        - {{ compiler("fortran_modules") }}
      host:
        - [...]

and the consuming side should use

  - name: i-consume-fortran-modules
    requirements:
      build:
        - {{ stdlib("c") }}
        # regular compiler because we don't want strong run-export on compiler ABI just for consuming foo-devel
        - {{ compiler("fortran") }}
      host:
        - foo-devel

The problem then is that - because we don't want to have a strong run-export on the regular compiler - there's nothing that causes {{ compiler("fortran") }} and the constraints on foo-devel to conflict between the build & host environments (if the compiler ABI isn't aligned). The only way that I can see this working is if {{ compiler("fortran") }} host-exports _fortran_abi * flang* (or so), which can then be made to conflict with whatever constraint we attach to foo-devel.
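
A small Python sketch of the conflict check may help here. It treats the marker package as having one build per compiler and asks which builds satisfy every build-string glob present in host:. The host-export is hypothetical (no such mechanism exists yet), and `satisfiable` and the candidate builds are illustrative names, not conda's solver.

```python
from fnmatch import fnmatchcase

# Illustrative builds of the marker package, one per producing compiler.
CANDIDATE_BUILDS = ["gfortran_0", "flang_0"]


def satisfiable(build_globs):
    """Builds satisfying every glob; an empty list means the host env cannot be solved."""
    return [b for b in CANDIDATE_BUILDS if all(fnmatchcase(b, g) for g in build_globs)]


# Constraint carried by foo-devel, whose modules were produced with gfortran.
from_foo_devel = "gfortran*"

# Today: a regular flang compiler in build: contributes nothing to host:,
# so the mismatch goes unnoticed and the environment still solves.
today = satisfiable([from_foo_devel])

# With a hypothetical host-export from the flang compiler, both globs must hold at once.
mismatched = satisfiable([from_foo_devel, "flang*"])

# With a matching gfortran compiler, the host environment stays solvable.
matched = satisfiable([from_foo_devel, "gfortran*"])

print(today, mismatched, matched)
```

The empty result in the mismatched case is the desired outcome: the solver refuses to build against modules from a different compiler, while run: is left untouched.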

h-vetinari avatar Jun 17 '25 01:06 h-vetinari

Sounds good. I would not call it host-exports, just another variant of run-exports, as run-exports: strong also adds to host in addition to run.

isuruf avatar Jun 17 '25 17:06 isuruf

Took me a while, but I've now managed to come up with a design and write something up for that; see here.

h-vetinari avatar Aug 14 '25 04:08 h-vetinari