Allow defining custom MPI launchers for systems

Open giordano opened this issue 1 year ago • 7 comments

There are some systems I have access to where it'd be beneficial to use custom MPI launchers, different from the standard ones. Latest case I ran into is https://www.dur.ac.uk/icc/cosma/support/rockport/, where mpirun should be called as mpirun $RP_OPENMPI_ARGS, where $RP_OPENMPI_ARGS is an environment variable set by a module, which has to be loaded when running the tests (for example by setting systems.partitions.modules).

I may have time to work on this, but I need some guidance, especially in terms of what the API should be. In general, I want this to be system-specific (that is, customised in the system configuration); I don't want to entangle tests with system-specific details.

giordano avatar Jul 11 '22 09:07 giordano

One issue I just ran into while looking into this is that the list of allowed launchers is currently hardcoded in the config schema: https://github.com/reframe-hpc/reframe/blob/431ca172b2074c8f570e27f89e8f377b8572c424/reframe/schemas/config.json#L253-L260

giordano avatar Jul 11 '22 09:07 giordano
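
To illustrate why the hardcoded list gets in the way, here is a minimal sketch of the difference between an `enum`-restricted launcher field and a relaxed one. The two schema fragments and the `accepts` helper are hypothetical simplifications, not ReFrame's actual schema or validation code, and the launcher names in the `enum` are just a small illustrative subset.

```python
# Hypothetical, simplified schema fragments; the real schema is larger and
# the names listed in the enum are only an illustrative subset.
STRICT_LAUNCHER = {"type": "string", "enum": ["local", "mpirun", "srun"]}
RELAXED_LAUNCHER = {"type": "string"}


def accepts(schema, value):
    """Check a value against the two JSON Schema keywords used above."""
    if schema.get("type") == "string" and not isinstance(value, str):
        return False
    if "enum" in schema and value not in schema["enum"]:
        return False
    return True


for name in ("mpirun", "custom_launcher"):
    # The strict fragment rejects any name missing from its enum;
    # the relaxed fragment accepts any string.
    print(name, accepts(STRICT_LAUNCHER, name), accepts(RELAXED_LAUNCHER, name))
```

With the relaxed fragment, whether a launcher name is valid is no longer decided by the schema, so that check has to happen elsewhere (for example, when the launcher is looked up).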

Actually, I just realised that if the schema check weren't enforced (i.e., if "enum": [...] were removed from the schema), I could simply add

import reframe.core.launchers.mpi as mpi

@mpi.register_launcher('custom_launcher')
class MyLauncher(mpi.MpirunLauncher):
    def command(self, job):
        return [...]

to the config file and this would work out-of-the-box, without further changes needed.

giordano avatar Jul 11 '22 10:07 giordano
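
For the Rockport case from the original report, the `command` override might look roughly like the sketch below. `Job`, `MpirunLauncher` and `run_command` here are self-contained stand-ins that only mimic the shape of the snippet above; they are not ReFrame's own classes, `RockportLauncher` is a made-up name, and the assumption that the returned tokens are joined into one job-script line (so the shell expands `$RP_OPENMPI_ARGS` after the module is loaded) should be checked against the ReFrame version in use.

```python
class Job:
    """Stand-in for a job descriptor; only the task count is modelled."""

    def __init__(self, num_tasks):
        self.num_tasks = num_tasks


class MpirunLauncher:
    """Stand-in for a stock mpirun launcher (not ReFrame's class)."""

    def command(self, job):
        return ['mpirun', '-np', str(job.num_tasks)]

    def run_command(self, job):
        # Assumption: the tokens end up as one line of the job script.
        return ' '.join(self.command(job))


class RockportLauncher(MpirunLauncher):
    """Hypothetical launcher that adds the module-provided arguments."""

    def command(self, job):
        base = super().command(job)
        # Keep the variable unexpanded so the shell resolves it at run time,
        # after the module that defines it has been loaded.
        return base[:1] + ['$RP_OPENMPI_ARGS'] + base[1:]


print(RockportLauncher().run_command(Job(num_tasks=4)))
# prints: mpirun $RP_OPENMPI_ARGS -np 4
```

Passing the variable name as a literal token, rather than reading it from the environment in Python, keeps the configuration independent of whether the module is loaded on the machine where the configuration file is parsed.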

Hi @giordano, indeed the schema is probably too restrictive in that aspect and we should relax it. Beyond that, it is really easy to add a custom launcher with the snippet you have just shown. The only downside is that you have to modify your reframe installation with your custom launcher. Would you be ok if we relaxed the schema and added some docs on how to add a new parallel launcher?

vkarak avatar Jul 12 '22 06:07 vkarak

> indeed the schema is probably too restrictive in that aspect and we should relax it. Beyond that, it is really easy to add a custom launcher with the snippet you have just shown. The only downside is that you have to modify your reframe installation with your custom launcher.

Adding it to the system configuration file (in Python) is not ok? That seems to work and would let me not touch local installations of reframe (I want to use vanilla ReFrame, not to fork it).

> Would you be ok if we relaxed the schema and added some docs on how to add a new parallel launcher?

Yes, that'd be fine with me.

giordano avatar Jul 12 '22 09:07 giordano

> Adding it to the system configuration file (in Python) is not ok? That seems to work and would let me not touch local installations of reframe (I want to use vanilla ReFrame, not to fork it).

Aha, so you did that in your config file and it worked! That's awesome, I've never tried that actually! 😂 Let me try it and we could have a guide on how to add custom launchers.

vkarak avatar Jul 12 '22 09:07 vkarak

Yes, I also wasn't expecting it to work but it seems it does, and that's perfect for my use case. The important thing seems to be to register the launcher before using it. An unregistered launcher still throws a useful error message:

reframe: failed to initialize runtime: no such launcher: 'unregistered_launcher'

giordano avatar Jul 12 '22 09:07 giordano
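
The register-before-use behaviour can be pictured as a name-to-class registry filled in by a decorator. The sketch below is a self-contained illustration, assuming a plain dictionary registry; `_LAUNCHERS`, `get_launcher` and the `LookupError` are hypothetical and do not reflect how ReFrame implements this internally.

```python
# Hypothetical registry mapping launcher names to launcher classes.
_LAUNCHERS = {}


def register_launcher(name):
    """Class decorator that records a launcher class under the given name."""
    def decorator(cls):
        _LAUNCHERS[name] = cls
        return cls
    return decorator


def get_launcher(name):
    """Look up a launcher by name, failing clearly if it was never registered."""
    try:
        return _LAUNCHERS[name]
    except KeyError:
        raise LookupError(f"no such launcher: '{name}'") from None


@register_launcher('custom_launcher')
class MyLauncher:
    def command(self, job):
        return ['mpirun']


# Registration happened when the class body above was executed,
# so this lookup succeeds.
print(get_launcher('custom_launcher').__name__)

try:
    get_launcher('unregistered_launcher')
except LookupError as err:
    print(err)
```

Because the decorator runs at class-definition time, a launcher defined at the top of a Python configuration file is already in the registry by the time the partitions that name it are resolved.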

I have also been able to define a new launcher in the configuration. So we will relax the configuration schema and add a small tutorial on how to define a new custom launcher.

vkarak avatar Aug 05 '22 15:08 vkarak