Confusing MPI `FullEnvironment` logic

Open dmcdougall opened this issue 7 years ago • 5 comments

We have this constructor which the user can call if they don't want to use MPI, but if the user doesn't pass --disable-mpi to configure (i.e., QUESO_HAVE_MPI is defined), how should this constructor behave?

We call it here, in our tests, and it's not entirely clear what it's supposed to do.

dmcdougall avatar May 26 '17 22:05 dmcdougall

The only valid interpretation I can think of is "QUESO behaves as if it was configured in serial"; how equipped we are to do that I don't know. Using MPI_COMM_SELF might work as a hack for now?

roystgnr avatar May 26 '17 22:05 roystgnr

I think we should have a dummy communicator class. Right now we're trying to accommodate both behaviours in a single class. Here's why. There are three things that could be true or false:

  1. QUESO_HAVE_MPI
  2. The user executed their application with mpi(run|exec)
  3. The user called MPI_Init()

We're trying to understand what should happen if the user uses the serial FullEnvironment constructor.

Scenario A

If 1, 2 and 3 are all true then the dummy communicator will basically just use MPI_COMM_SELF. Only process 0 on MPI_COMM_WORLD should write output. We'll have to make sure to enforce that; otherwise we'll have a race condition on the output file. This should be fairly simple since we have access to all the ranks.

Scenario B

If 1 and 2 are true but 3 is false then we don't have access to the ranks like we do in scenario A. So here QUESO should either call MPI_Init() itself or throw an exception. If QUESO calls MPI_Init() then we have access to the ranks and can fall back to scenario A, remembering that QUESO should then also call MPI_Finalize().

Scenario C

If 1 is true, 2 is false and 3 is true then there is only one process and we can fall back to scenario A.

Scenario D

If 1 is true but 2 and 3 are both false then QUESO has no way of knowing there is only one process. So our choices are to call MPI_Init() or throw an exception. If QUESO calls MPI_Init() then it should also call MPI_Finalize().

Scenario E

If 1 is false but 2 and 3 are both true then QUESO should use the dummy communicator (MPI_COMM_SELF). How do we prevent race conditions on the output? We could require the user to manage this by programmatically setting the location of the output themselves.

Scenario F

If 1 is false, 2 is true, and 3 is false this is nonsense, but QUESO will use the dummy communicator. The user can't even set their output locations programmatically.

Scenario G

If 1 and 2 are both false but 3 is true then there is only one process, there are no race conditions on the output files, QUESO uses the dummy communicator and everything is fine.

Scenario H

If 1, 2, and 3 are all false then we can fall back to scenario G.

Does this all make sense? I think the point of this comment is to establish that we need a dummy communicator class, because right now we only determine 'serialness' if QUESO_HAVE_MPI is false.
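The eight scenarios above can be condensed into a small decision function. This is only a sketch of the table as written in this comment: the enum and function names are hypothetical and do not exist in QUESO. It makes one thing visible: only conditions 1 and 3 decide which branch is taken, while condition 2 only matters for who writes output.

```cpp
// Hypothetical names for illustration; none of these exist in QUESO.
enum class SerialCtorAction
{
  UseDummyCommunicator, // proceed with the dummy (serial) communicator
  InitOrThrow           // QUESO must call MPI_Init() itself, or throw
};

// have_mpi:         condition 1, QUESO_HAVE_MPI is defined
// (unnamed bool):   condition 2, launched via mpi(run|exec); it does not
//                   change the branch, only the output-file concerns
// user_called_init: condition 3, the user already called MPI_Init()
constexpr SerialCtorAction
serial_ctor_action(bool have_mpi, bool /*launched_by_mpi*/, bool user_called_init)
{
  return (have_mpi && !user_called_init)
           ? SerialCtorAction::InitOrThrow           // scenarios B and D
           : SerialCtorAction::UseDummyCommunicator; // scenarios A, C, E, F, G, H
}
```

For example, `serial_ctor_action(true, true, false)` corresponds to scenario B and yields `InitOrThrow`.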

dmcdougall avatar Jun 07 '17 00:06 dmcdougall

The way you've written things, "2 is false" is pretty iffy - the MPI standard allows implementations to both provide and require a special startup command. IIRC I once worked with an MPI stack where mpirun -np 1 ./mycode worked fine but ./mycode was a guaranteed segfault in MPI_Init().

So I'll talk as if "2 is false" means "the code was invoked on one processor" and "2 is true" means "the code was invoked on two or more processors".

Scenario A: In this case every processor is effectively processor 0, and should write output etc., and that had better be what the user wanted, because that's exactly what they asked for. It's not unreasonable that users might have a parallel code but want to invoke QUESO on individual processors in serial. However, MPI_COMM_SELF is not useful in this case, because MPI operations which can only follow MPI_Init still can only follow MPI_Init even if they're operating over MPI_COMM_SELF!

Scenario B: In this case, libMesh calls MPI_Init() to make things easier on the users; throwing an exception is also a reasonable thing to do though.

Scenario C, D: Correct.

Scenario E, F, G, H: If 1 is false, then QUESO can't even safely include mpi.h, so MPI_COMM_SELF is out of the question.

Okay, it looks like my MPI_COMM_SELF advice earlier turned out to be a red herring. We need a serial "dummy communicator" that works the way libMesh's does: without relying on MPI at all. I'd consider seriously just swiping parallel.h (and its two header dependencies) from libMesh and making them "package-independent" with the same handful of changes we used for GetPot.
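To make "without relying on MPI at all" concrete, here is a minimal sketch of what such a serial communicator could look like. It is not libMesh's parallel.h and not an existing QUESO class; the class name and the set of methods are assumptions chosen for illustration. The point is that it never includes mpi.h, so it is usable whether or not QUESO_HAVE_MPI is defined.

```cpp
#include <cassert>
#include <vector>

// Hypothetical sketch; not QUESO's or libMesh's actual class.
// Deliberately includes no MPI header.
class SerialCommunicator
{
public:
  int rank() const { return 0; } // the only process is always rank 0
  int size() const { return 1; } // exactly one process

  void barrier() const {} // nothing to synchronise with

  // Broadcasting from the only rank leaves the data unchanged.
  template <typename T>
  void broadcast(T & /*data*/, int root = 0) const
  {
    assert(root == 0); // rank 0 is the only valid root
    (void)root;        // avoid an unused warning when NDEBUG is set
  }

  // A sum over one process is the local value itself.
  template <typename T>
  void sum(T & /*value*/) const {}

  // Gathering from one process yields a one-element vector.
  template <typename T>
  void allgather(const T & local, std::vector<T> & gathered) const
  {
    gathered.assign(1, local);
  }
};
```

Every collective degenerates to a no-op or a copy, which is why code written against this interface behaves as if QUESO had been configured in serial.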

roystgnr avatar Jun 07 '17 17:06 roystgnr

Regarding item 2, I had assumed ./mycode was interchangeable with mpirun -np 1 ./mycode so thanks for clarifying that.

Scenario B: Ok, cool. Honestly I'd prefer it if the environment called MPI_Init() and MPI_Finalize() for the user, but I don't really care either way, as long as it is documented and the user knows what is going on.

Scenario E, F, G, H: Oh, of course! It seems while I was writing up the tree of possible outcomes, I totally overlooked that simple fact. Thanks for pointing that out.

Dummy communicator it is. Regarding swiping libmesh code, I'm looking at these files:

  1. parallel.h
  2. parallel_implementation.h
  3. parallel_communicator_specializations

Am I missing any other files?

dmcdougall avatar Jun 07 '17 18:06 dmcdougall

With every MPI stack I've seen in at least the last 5 years, those are interchangeable, but even the MPI 3.0 standard doesn't guarantee it.

Yeah, I prefer doing the MPI_Init() calls too. There's no downside vs. throwing an exception. Compared with proceeding in serial, the only downside arises if the user wants to instantiate another library that calls MPI_Init() without checking MPI_Initialized() first, and even in that case the easy fix would be to swap the order of initialization.
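A sketch of the "initialize only if nobody else has, and finalize only what you initialized" pattern follows. So that it compiles without mpi.h, the MPI calls are hidden behind a stand-in backend; `FakeMpi` and `ScopedMpiInit` are hypothetical names, and a real backend would forward to MPI_Initialized(), MPI_Init() and MPI_Finalize().

```cpp
#include <cassert>

// Stand-in backend so the sketch builds without MPI (an assumption for
// illustration). Simplification: real MPI keeps reporting "initialized"
// after MPI_Finalize() and cannot be initialized a second time.
struct FakeMpi
{
  static bool & flag() { static bool f = false; return f; }
  static bool initialized() { return flag(); }
  static void init()     { assert(!flag()); flag() = true; }
  static void finalize() { assert(flag());  flag() = false; }
};

// Calls init() only if the backend is not yet initialized, and calls
// finalize() only if this object was the one that called init().
template <class Mpi>
class ScopedMpiInit
{
public:
  ScopedMpiInit() : owns_(!Mpi::initialized())
  {
    if (owns_)
      Mpi::init();
  }

  ~ScopedMpiInit()
  {
    if (owns_)
      Mpi::finalize();
  }

  // Copying would risk a double finalize, so forbid it.
  ScopedMpiInit(const ScopedMpiInit &) = delete;
  ScopedMpiInit & operator=(const ScopedMpiInit &) = delete;

  bool owns() const { return owns_; }

private:
  bool owns_; // true if this object performed the initialization
};
```

If the user has already called MPI_Init(), the guard does nothing and leaves finalization to the user. A second library that calls MPI_Init() unconditionally would have to be constructed before the guard, which is the "swap the order of initialization" fix.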

Those are the three files, yes.

roystgnr avatar Jun 07 '17 19:06 roystgnr