
Add LGRs on a distributed grid (first steps)

Open · aritorto opened this pull request 1 year ago • 8 comments

This PR

  • refactors some refinement-related methods to handle the case where the grid has been distributed before refinement,
  • tests a few cases where LGRs are added after the grid has been distributed (see the sketch below),
  • allows global refinement of a distributed grid only once (i.e., for now it can be invoked a single time, with argument equal to 1),
  • does not require any updates to the reference manual.

aritorto avatar Aug 07 '24 12:08 aritorto
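For orientation, here is a minimal sketch of the workflow this PR targets: distribute first, then refine. It assumes the CpGrid methods createCartesian, loadBalance, addLgrsUpdateLeafView and globalRefine; the grid and LGR sizes are made up, and exact argument details may differ.

```cpp
// Hedged sketch: add an LGR (or global refinement) after load balancing a CpGrid.
#include <dune/common/parallel/mpihelper.hh>
#include <opm/grid/CpGrid.hpp>

#include <array>
#include <string>
#include <vector>

int main(int argc, char** argv)
{
    Dune::MPIHelper::instance(argc, argv);              // initialize MPI once

    Dune::CpGrid grid;
    grid.createCartesian({4, 3, 3}, {1.0, 1.0, 1.0});   // small Cartesian test grid

    grid.loadBalance();                                  // distribute the grid first

    // Then add one LGR on the distributed grid: refine the block of cells with
    // IJK in [1,3) x [0,2) x [0,2), splitting each parent cell into 2x2x2 children.
    const std::vector<std::array<int, 3>> cellsPerDim = {{2, 2, 2}};  // children per parent
    const std::vector<std::array<int, 3>> startIJK    = {{1, 0, 0}};  // block start (IJK)
    const std::vector<std::array<int, 3>> endIJK      = {{3, 2, 2}};  // block end (exclusive)
    const std::vector<std::string>        lgrNames    = {"LGR1"};
    grid.addLgrsUpdateLeafView(cellsPerDim, startIJK, endIJK, lgrNames);

    // Alternatively, global refinement of the distributed grid
    // (per this PR: callable only once, and only with argument 1):
    // grid.globalRefine(1);

    return 0;
}
```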

jenkins build this please

aritorto avatar Aug 07 '24 12:08 aritorto

jenkins build this please

aritorto avatar Aug 07 '24 13:08 aritorto

addLgrsDistGrid.log

I get different output. In any case, I set the PR as draft for now. (Screenshot from 2024-08-08 attached.)

aritorto avatar Aug 08 '24 05:08 aritorto

jenkins build this please

aritorto avatar Aug 08 '24 07:08 aritorto

jenkins build this please

aritorto avatar Aug 08 '24 13:08 aritorto

jenkins build this please

aritorto avatar Aug 12 '24 05:08 aritorto

Last commit inspired by a suggestion from @blattms on the draft PR.

aritorto avatar Aug 16 '24 08:08 aritorto

jenkins build this please

aritorto avatar Aug 19 '24 14:08 aritorto

I'm marking this PR ready for review, even though I still need to work a bit more on the test file, which currently does not fail when you run "one test case per run" (this explains the large commented-out pieces in the test). It seems it could be improved by selecting to which process each cell should be sent. In the meantime, I'll run jenkins too.

aritorto avatar Aug 22 '24 09:08 aritorto

jenkins build this please

aritorto avatar Aug 22 '24 09:08 aritorto

jenkins build this please

aritorto avatar Aug 22 '24 12:08 aritorto

Something might need to be fixed: running all the tests defined in addLgrsOnDistrubutedGrid_test.cpp together fails, but running each of them separately does not.

aritorto avatar Aug 23 '24 10:08 aritorto

jenkins build this please

aritorto avatar Aug 23 '24 10:08 aritorto

Did not look at the details, but it sounds like a fixture initializing MPI et al.

akva2 avatar Aug 23 '24 10:08 akva2

Thanks @akva2. I'm a bit new to MPI, so I'm not 100% sure what you meant. In any case, I also noticed that, when the requirements for refinement are not fulfilled, I should "throw in all the processes" (terminate), not only in one of them. I'll fix that soon. Does this happen to be related to your comment?

aritorto avatar Aug 23 '24 11:08 aritorto

Not really. What I mean is: if you have a test fixture which calls MPI_Init / MPI_Finalize (directly or indirectly), you are in trouble, because those can only be called once per program invocation. So it would work fine running one test, but since the fixture is constructed/destructed once per test, you'd be in trouble running multiple tests.

akva2 avatar Aug 23 '24 11:08 akva2

Thanks a lot for the explanation. So these last commits, where I separated the test cases (one per file), might make sense in the end, if I got it correctly. I'll think about it! Thanks again @akva2

aritorto avatar Aug 23 '24 11:08 aritorto

you can see an example of the typical way of solving this in https://github.com/OPM/opm-simulators/blob/master/tests/test_broadcast.cpp

in particular note the use of BOOST_TEST_NO_MAIN and implementation of a main method.

akva2 avatar Aug 23 '24 11:08 akva2
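For reference, the general shape of the pattern pointed to above (a hand-written main so MPI is initialized exactly once for all Boost test cases) looks roughly like this; the module name and test case below are placeholders, not the actual test file, and the Boost.Test init API used matches the linked example under the usual OPM build setup.

```cpp
// Sketch of the pattern referenced above (cf. tests/test_broadcast.cpp in
// opm-simulators): disable Boost.Test's generated main and initialize MPI
// exactly once in a hand-written main, so running several test cases in one
// executable does not re-initialize/finalize MPI per fixture.
#define BOOST_TEST_MODULE AddLgrsOnDistributedGridTests  // placeholder module name
#define BOOST_TEST_NO_MAIN

#include <boost/test/unit_test.hpp>
#include <dune/common/parallel/mpihelper.hh>

BOOST_AUTO_TEST_CASE(someDistributedRefinementCase)      // placeholder test case
{
    BOOST_CHECK(true);
}

bool init_unit_test_func()
{
    return true;
}

int main(int argc, char** argv)
{
    // MPIHelper::instance calls MPI_Init once and MPI_Finalize at exit.
    Dune::MPIHelper::instance(argc, argv);
    return boost::unit_test::unit_test_main(&init_unit_test_func, argc, argv);
}
```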

@akva2 We actually found out what the problem was:

It seems like Zoltan has some kind of internal state (maybe a random number generator) and you cannot rely on the partitioning being the same the second time you call it (or if you previously computed another partitioning in addition).

I guess this might even be a feature: If you compute a partitioning and later realize that it was rubbish, you could start over and hope to get a better partitioning by chance. :sweat_smile:

Even for our small example, computing another partitioning beforehand has quite an impact:

Below I print the owned cell indices per rank.

With an additional previous partitioning call to Zoltan:
rank 0: 0 1 4 5 8 9 16 20 21
rank 1: 12 13 17 24 25 28 29 32 33
rank 2: 2 3 6 7 10 11 18 22 23
rank 3: 14 15 19 26 27 30 31 34 35

Without the previous partitioning:
rank 0: 14 15 18 19 26 27 30 31 35
rank 1: 2 3 6 7 10 11 22 23 34
rank 2: 12 13 16 17 24 25 28 29 33
rank 3: 0 1 4 5 8 9 20 21 32

Hence we need to force our partitions in this case to make this stable.

blattms avatar Aug 26 '24 07:08 blattms
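A minimal sketch of what "forcing the partition" could look like in a test, assuming a loadBalance overload that accepts an explicit cell-to-rank vector (one destination rank per global cell); the helper name and the block rule are made up, and the exact overload may differ.

```cpp
// Hedged sketch: make the test partitioning deterministic by handing
// loadBalance an explicit cell-to-rank assignment instead of letting
// Zoltan decide.
#include <dune/common/parallel/mpihelper.hh>
#include <opm/grid/CpGrid.hpp>

#include <algorithm>
#include <vector>

void distributeDeterministically(Dune::CpGrid& grid)
{
    // Assumes every rank still holds the full, undistributed grid
    // (as when createCartesian was called on all ranks).
    const int numCells = grid.size(0);
    const int numRanks = grid.comm().size();

    // Simple fixed rule: contiguous blocks of cells per rank. Any fixed rule
    // works, as long as every process builds the same vector.
    const int chunk = (numCells + numRanks - 1) / numRanks;
    std::vector<int> parts(numCells);
    for (int cell = 0; cell < numCells; ++cell)
        parts[cell] = std::min(numRanks - 1, cell / chunk);

    grid.loadBalance(parts);   // force this partition on every run
}
```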

Ah, that old fun. There's a PHG_RANDOMIZE_INPUT parameter that might help. It could also be LB_APPROACH not set to PARTITION.

akva2 avatar Aug 26 '24 07:08 akva2
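For context, these are plain Zoltan parameters; a minimal sketch using the raw Zoltan C API is below. It does not claim to show where opm-grid configures its Zoltan handle, and it assumes MPI_Init and Zoltan_Initialize have already been called.

```cpp
// Hedged sketch: set the two Zoltan parameters mentioned above.
#include <mpi.h>
#include <zoltan.h>

void configureZoltan(MPI_Comm comm)
{
    struct Zoltan_Struct* zz = Zoltan_Create(comm);

    // Disable randomized ordering of the input, which can make the
    // hypergraph (PHG) partitioner non-deterministic between calls.
    Zoltan_Set_Param(zz, "PHG_RANDOMIZE_INPUT", "0");

    // Compute a partition from scratch instead of repartitioning or
    // refining a previous one.
    Zoltan_Set_Param(zz, "LB_APPROACH", "PARTITION");

    // ... register query functions and call Zoltan_LB_Partition here ...

    Zoltan_Destroy(&zz);
}
```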

Cool, thanks. The result may still differ between different versions, etc.

blattms avatar Aug 26 '24 07:08 blattms

Thanks @akva2 and @blattms for your feedback. The PR is ready for review, after a jenkins check

aritorto avatar Aug 26 '24 11:08 aritorto

jenkins build this please

aritorto avatar Aug 26 '24 11:08 aritorto

jenkins build this serial please

blattms avatar Aug 27 '24 08:08 blattms

jenkins build this serial please

aritorto avatar Aug 27 '24 10:08 aritorto

jenkins build this serial please

aritorto avatar Aug 28 '24 08:08 aritorto

jenkins build this serial please

aritorto avatar Aug 28 '24 14:08 aritorto

jenkins build this serial please

aritorto avatar Aug 29 '24 11:08 aritorto

jenkins build this serial please

aritorto avatar Aug 29 '24 13:08 aritorto

Thanks a lot for all the work. I am ready to merge this. Should we maybe reduce the massive number of commits by rewriting history first?

blattms avatar Aug 30 '24 06:08 blattms