Kernels icon indicating copy to clipboard operation
Kernels copied to clipboard

Fenix new

Open rfvander opened this issue 7 years ago • 18 comments

Fenix_new adds Fenix versions of Stencil, Transpose, AMR, and Sparse to the repo, using split main programs to make sure the compiler can optimize the performance sensitive part of the kernels. The implementations have been tested with OpenMPIv17.1

rfvander avatar Jul 25 '17 17:07 rfvander

It is not important but OpenMPIv17.1 doesn't exist. Do you mean Open-MPI 1.7.1?

It sounds like I need to add Open-MPI to Travis.

jeffhammond avatar Jul 25 '17 17:07 jeffhammond

Apologies. Indeed, OpenMPI 1.7.1 is the version required at present. It is obtained using Mercurial. I will send instructions. The Fenix library is also required. I will send instructions for that as well.

rfvander avatar Jul 25 '17 17:07 rfvander

Please fix issues with time_step/timestep.o. Thanks.

/home/travis/build/ParRes/Kernels/PRK-deps/bin/mpicc -std=c99 -O3 -DMPI -DRESTRICT_KEYWORD=0 -DVERBOSE=0  -DDOUBLE=1   -DRADIUS=2  -DSTAR=1 -DLOOPGEN=0   -I../../include -c amr.c
amr.c: In function ‘main’:
amr.c:475:75: warning: format ‘%d’ expects argument of type ‘int’, but argument 3 has type ‘long int’ [-Wformat=]
       printf("ERROR: rank %d's BG work tile smaller than stencil radius: %d\n",
                                                                           ^
amr.c:700:75: warning: format ‘%d’ expects argument of type ‘int’, but argument 4 has type ‘long int’ [-Wformat=]
       printf("ERROR: rank %d's work tile %d smaller than stencil radius: %d\n",
                                                                           ^
amr.c:815:6: warning: implicit declaration of function ‘time_step’ [-Wimplicit-function-declaration]
      time_step(Num_procs, Num_procs_bg, Num_procs_bgx, Num_procs_bgy,
      ^~~~~~~~~
/home/travis/build/ParRes/Kernels/PRK-deps/bin/mpicc -std=c99 -O3 -DMPI -DRESTRICT_KEYWORD=0 -DVERBOSE=0  -DDOUBLE=1   -DRADIUS=2  -DSTAR=1 -DLOOPGEN=0   -I../../include -c ../../common/MPI_bail_out.c
/home/travis/build/ParRes/Kernels/PRK-deps/bin/mpicc -std=c99 -O3 -DMPI -DRESTRICT_KEYWORD=0 -DVERBOSE=0  -DDOUBLE=1   -DRADIUS=2  -DSTAR=1 -DLOOPGEN=0   -I../../include -c ../../common/wtime.c
make[1]: *** No rule to make target `timestep.c', needed by `timestep.o'.  Stop.
make[1]: Leaving directory `/home/travis/build/ParRes/Kernels/MPI1/AMR'
make: *** [allmpi1] Error 2

jeffhammond avatar Jul 27 '17 17:07 jeffhammond

Thanks for getting cracking on my pull request. I'll figure out what went wrong and fix it.

rfvander avatar Jul 27 '17 17:07 rfvander

Can you check to see if there is a conflict in make.common? I have this in the version that should have been committed and push to branch Fenix_new:

...
timestep.o: timestep.c
        $(CCOMPILER) $(CFLAGS) $(TUNEFLAGS) $(INCLUDEPATHSPLUS) -c $<
 
wtime.o:$(COMMON)/wtime.c
        $(CCOMPILER) $(CFLAGS) $(TUNEFLAGS) $(INCLUDEPATHSPLUS) -c $<
 
random_draw.o:$(COMMON)/random_draw.c
        $(CCOMPILER) $(CFLAGS) $(TUNEFLAGS) $(INCLUDEPATHSPLUS) -c $<
 
MPI_bail_out.o:$(COMMON)/MPI_bail_out.c
        $(CCOMPILER) $(CFLAGS) $(TUNEFLAGS) $(INCLUDEPATHSPLUS) -c $<
...

rfvander avatar Jul 27 '17 17:07 rfvander

Just push to your branch and GitHub will determine if there is a conflict.

jeffhammond avatar Jul 27 '17 17:07 jeffhammond

That was a typo on my part. I had already pushed my branch.

rfvander avatar Jul 27 '17 18:07 rfvander

@rfvander The problem is that timestep.c lives in PRK/FENIX/AMR/ not PRK/MPI1/AMR/. Also, it depends on fenix.h.

You need to implement MPI1's AMR without Fenix-related dependencies.

jrhammon-mac01:AMR jrhammon$ make amr
mpicc -std=c99 -g -O3 -mtune=native -ffast-math -DMPI -DRESTRICT_KEYWORD=0 -DVERBOSE=0  -DDOUBLE=1   -DRADIUS=2  -DSTAR=1 -DLOOPGEN=0   -I../../include -c timestep.c
In file included from timestep.c(34):
../../include/par-res-kern_fenix.h(34): catastrophic error: cannot open source file "fenix.h"
  #include <fenix.h>

jeffhammond avatar Aug 17 '17 22:08 jeffhammond

Fixed in fa2ce89

jeffhammond avatar Aug 17 '17 22:08 jeffhammond

I'm not sure I understand. I had pulled the body out of the main iteration loop only for the Fenix implementations of the PRKs, and in that case Fenix-related dependencies would always be resolved. I had not done this for MPI1, so there should not be a problem. Of course, if you want to reuse that timestep.c file for MPI1, which is legit, we should make sure there are no Fenix dependencies. None is needed for timestep.c, because no Fenix calls are made inside. If I put fenix.h in timestep.c, it can be removed.

rfvander avatar Aug 17 '17 23:08 rfvander

Try to build allmpi1 without the change in my patch. I don't see how it can succeed.

jeffhammond avatar Aug 17 '17 23:08 jeffhammond

MPI1/AMR/amr.c separated time_step into a separate function as well...

jeffhammond avatar Aug 17 '17 23:08 jeffhammond

That is actually a matter of garbage collection. File timestep.c in that directory is not used to build the kernel. It must be left over from my work on Fenix, and should be removed.

[rfvander@esgmonster esg-prk-devel]$ make allmpi1 PRK_FLAGS=-std:c99
cd MPI1/Synch_global;        make global    "DEFAULT_OPT_FLAGS   = -std:c99"
make[1]: Entering directory `/home/rfvander/esg-prk-devel/MPI1/Synch_global'
mpiicc -std:c99 -DMPI -DVERBOSE=0    -I../../include -c global.c
mpiicc -std:c99 -DMPI -DVERBOSE=0    -I../../include -c ../../common/MPI_bail_out.c
mpiicc -std:c99 -DMPI -DVERBOSE=0    -I../../include -c ../../common/wtime.c
mpiicc -o global   -std:c99 -DMPI global.o MPI_bail_out.o wtime.o  
make[1]: Leaving directory `/home/rfvander/esg-prk-devel/MPI1/Synch_global'
cd MPI1/Synch_p2p;           make p2p       "DEFAULT_OPT_FLAGS   = -std:c99"
make[1]: Entering directory `/home/rfvander/esg-prk-devel/MPI1/Synch_p2p'
mpiicc -std:c99 -DMPI -DVERBOSE=0  -DRESTRICT_KEYWORD=0  -I../../include -c p2p.c
mpiicc -std:c99 -DMPI -DVERBOSE=0  -DRESTRICT_KEYWORD=0  -I../../include -c ../../common/MPI_bail_out.c
mpiicc -std:c99 -DMPI -DVERBOSE=0  -DRESTRICT_KEYWORD=0  -I../../include -c ../../common/wtime.c
mpiicc -o p2p   -std:c99 -DMPI p2p.o MPI_bail_out.o wtime.o  
make[1]: Leaving directory `/home/rfvander/esg-prk-devel/MPI1/Synch_p2p'
cd MPI1/Sparse;              make sparse    "DEFAULT_OPT_FLAGS   = -std:c99"
make[1]: Entering directory `/home/rfvander/esg-prk-devel/MPI1/Sparse'
mpiicc -std:c99  -DMPI -DVERBOSE=0  -DSCRAMBLE=1 -DTESTDENSE=0 -DRESTRICT_KEYWORD=0  -I../../include -c sparse.c
mpiicc -std:c99  -DMPI -DVERBOSE=0  -DSCRAMBLE=1 -DTESTDENSE=0 -DRESTRICT_KEYWORD=0  -I../../include -c ../../common/MPI_bail_out.c
mpiicc -std:c99  -DMPI -DVERBOSE=0  -DSCRAMBLE=1 -DTESTDENSE=0 -DRESTRICT_KEYWORD=0  -I../../include -c ../../common/wtime.c
mpiicc -o sparse   -std:c99  -DMPI sparse.o MPI_bail_out.o wtime.o  
make[1]: Leaving directory `/home/rfvander/esg-prk-devel/MPI1/Sparse'
cd MPI1/Transpose;           make transpose "DEFAULT_OPT_FLAGS   = -std:c99"
make[1]: Entering directory `/home/rfvander/esg-prk-devel/MPI1/Transpose'
mpiicc -std:c99 -std=c99 -DMPI -DVERBOSE=0  -DRESTRICT_KEYWORD=0 -DSYNCHRONOUS=0  -I../../include -c transpose.c
mpiicc -std:c99 -std=c99 -DMPI -DVERBOSE=0  -DRESTRICT_KEYWORD=0 -DSYNCHRONOUS=0  -I../../include -c ../../common/MPI_bail_out.c
mpiicc -std:c99 -std=c99 -DMPI -DVERBOSE=0  -DRESTRICT_KEYWORD=0 -DSYNCHRONOUS=0  -I../../include -c ../../common/wtime.c
mpiicc -o transpose   -std:c99 -std=c99 -DMPI transpose.o MPI_bail_out.o wtime.o  
make[1]: Leaving directory `/home/rfvander/esg-prk-devel/MPI1/Transpose'
cd MPI1/Stencil;             make stencil   "DEFAULT_OPT_FLAGS   = -std:c99"
make[1]: Entering directory `/home/rfvander/esg-prk-devel/MPI1/Stencil'
mpiicc -std:c99 -DMPI -DRESTRICT_KEYWORD=0 -DVERBOSE=0 -DLOOPGEN=0 -DDOUBLE=1   -DRADIUS=2 -DSTAR=1   -I../../include -c stencil.c
mpiicc -std:c99 -DMPI -DRESTRICT_KEYWORD=0 -DVERBOSE=0 -DLOOPGEN=0 -DDOUBLE=1   -DRADIUS=2 -DSTAR=1   -I../../include -c ../../common/MPI_bail_out.c
mpiicc -std:c99 -DMPI -DRESTRICT_KEYWORD=0 -DVERBOSE=0 -DLOOPGEN=0 -DDOUBLE=1   -DRADIUS=2 -DSTAR=1   -I../../include -c ../../common/wtime.c
mpiicc -o stencil   -std:c99 -DMPI stencil.o MPI_bail_out.o wtime.o  -lm
make[1]: Leaving directory `/home/rfvander/esg-prk-devel/MPI1/Stencil'
cd MPI1/DGEMM;               make dgemm     "DEFAULT_OPT_FLAGS   = -std:c99"
make[1]: Entering directory `/home/rfvander/esg-prk-devel/MPI1/DGEMM'
mpiicc -std:c99  -DMPI -DBOFFSET=12 -DVERBOSE=0   -I../../include -c dgemm.c
mpiicc -std:c99  -DMPI -DBOFFSET=12 -DVERBOSE=0   -I../../include -c ../../common/MPI_bail_out.c
mpiicc -std:c99  -DMPI -DBOFFSET=12 -DVERBOSE=0   -I../../include -c ../../common/wtime.c
mpiicc -o dgemm   -std:c99  -DMPI dgemm.o MPI_bail_out.o wtime.o   -lm
make[1]: Leaving directory `/home/rfvander/esg-prk-devel/MPI1/DGEMM'
cd MPI1/Nstream;             make nstream   "DEFAULT_OPT_FLAGS   = -std:c99"
make[1]: Entering directory `/home/rfvander/esg-prk-devel/MPI1/Nstream'
mpiicc -std:c99 -DMPI -DVERBOSE=0   -DRESTRICT_KEYWORD=0  -I../../include -c nstream.c
mpiicc -std:c99 -DMPI -DVERBOSE=0   -DRESTRICT_KEYWORD=0  -I../../include -c ../../common/MPI_bail_out.c
mpiicc -std:c99 -DMPI -DVERBOSE=0   -DRESTRICT_KEYWORD=0  -I../../include -c ../../common/wtime.c
mpiicc -o nstream   -std:c99 -DMPI nstream.o MPI_bail_out.o wtime.o  
make[1]: Leaving directory `/home/rfvander/esg-prk-devel/MPI1/Nstream'
cd MPI1/Reduce;              make reduce    "DEFAULT_OPT_FLAGS   = -std:c99"
make[1]: Entering directory `/home/rfvander/esg-prk-devel/MPI1/Reduce'
mpiicc -std:c99 -DMPI -DVERBOSE=0  -DRESTRICT_KEYWORD=0  -I../../include -c reduce.c
mpiicc -std:c99 -DMPI -DVERBOSE=0  -DRESTRICT_KEYWORD=0  -I../../include -c ../../common/MPI_bail_out.c
mpiicc -std:c99 -DMPI -DVERBOSE=0  -DRESTRICT_KEYWORD=0  -I../../include -c ../../common/wtime.c
mpiicc -o reduce   -std:c99 -DMPI reduce.o MPI_bail_out.o wtime.o  
make[1]: Leaving directory `/home/rfvander/esg-prk-devel/MPI1/Reduce'
cd MPI1/Random;              make random    "DEFAULT_OPT_FLAGS   = -std:c99"
make[1]: Entering directory `/home/rfvander/esg-prk-devel/MPI1/Random'
mpiicc -std:c99 -DMPI -DRESTRICT_KEYWORD=0 -DLONG_IS_64BITS=0 -DVERBOSE=0      -DLOOKAHEAD=1024  -I../../include -c random.c
mpiicc -std:c99 -DMPI -DRESTRICT_KEYWORD=0 -DLONG_IS_64BITS=0 -DVERBOSE=0      -DLOOKAHEAD=1024  -I../../include -c ../../common/MPI_bail_out.c
mpiicc -std:c99 -DMPI -DRESTRICT_KEYWORD=0 -DLONG_IS_64BITS=0 -DVERBOSE=0      -DLOOKAHEAD=1024  -I../../include -c ../../common/wtime.c
mpiicc -o random   -std:c99 -DMPI random.o MPI_bail_out.o wtime.o  
make[1]: Leaving directory `/home/rfvander/esg-prk-devel/MPI1/Random'
cd MPI1/Branch;              make branch    "DEFAULT_OPT_FLAGS   = -std:c99"  \
                                                       "MATRIX_RANK         = 5"        \
                                                       "NUMBER_OF_FUNCTIONS = 40"
make[1]: Entering directory `/home/rfvander/esg-prk-devel/MPI1/Branch'
mpiicc -std:c99  -DMPI -DRESTRICT_KEYWORD=0 -DVERBOSE=0   -I../../include -c branch.c
mpiicc -std:c99  -DMPI -DRESTRICT_KEYWORD=0 -DVERBOSE=0   -I../../include -c ../../common/MPI_bail_out.c
mpiicc -std:c99  -DMPI -DRESTRICT_KEYWORD=0 -DVERBOSE=0   -I../../include -c ../../common/wtime.c
mpiicc -std:c99  -DMPI -DRESTRICT_KEYWORD=0 -DVERBOSE=0   -I../../include -c func.c
mpiicc -o branch   -std:c99  -DMPI branch.o MPI_bail_out.o wtime.o func.o  
make[1]: Leaving directory `/home/rfvander/esg-prk-devel/MPI1/Branch'
cd MPI1/PIC-static;          make pic       "DEFAULT_OPT_FLAGS   = -std:c99"
make[1]: Entering directory `/home/rfvander/esg-prk-devel/MPI1/PIC-static'
mpiicc -std:c99  -DMPI -DVERBOSE=0   -DRESTRICT_KEYWORD=0  -I../../include -c pic.c
mpiicc -std:c99  -DMPI -DVERBOSE=0   -DRESTRICT_KEYWORD=0  -I../../include -c ../../common/MPI_bail_out.c
mpiicc -std:c99  -DMPI -DVERBOSE=0   -DRESTRICT_KEYWORD=0  -I../../include -c ../../common/wtime.c
mpiicc -std:c99  -DMPI -DVERBOSE=0   -DRESTRICT_KEYWORD=0  -I../../include -c ../../common/random_draw.c
mpiicc -o pic   -std:c99  -DMPI pic.o MPI_bail_out.o wtime.o random_draw.o  -lm
make[1]: Leaving directory `/home/rfvander/esg-prk-devel/MPI1/PIC-static'
cd MPI1/AMR;                 make amr       "DEFAULT_OPT_FLAGS   = -std:c99"
make[1]: Entering directory `/home/rfvander/esg-prk-devel/MPI1/AMR'
mpiicc -std:c99 -DMPI -DRESTRICT_KEYWORD=0 -DVERBOSE=0  -DDOUBLE=1   -DRADIUS=2  -DSTAR=1 -DLOOPGEN=0   -I../../include -c amr.c
mpiicc -std:c99 -DMPI -DRESTRICT_KEYWORD=0 -DVERBOSE=0  -DDOUBLE=1   -DRADIUS=2  -DSTAR=1 -DLOOPGEN=0   -I../../include -c ../../common/MPI_bail_out.c
mpiicc -std:c99 -DMPI -DRESTRICT_KEYWORD=0 -DVERBOSE=0  -DDOUBLE=1   -DRADIUS=2  -DSTAR=1 -DLOOPGEN=0   -I../../include -c ../../common/wtime.c
mpiicc -o amr   -std:c99 -DMPI amr.o MPI_bail_out.o wtime.o  -lm
make[1]: Leaving directory `/home/rfvander/esg-prk-devel/MPI1/AMR'
[rfvander@esgmonster esg-prk-devel]$ 

rfvander avatar Aug 17 '17 23:08 rfvander

Then where does https://github.com/ParRes/Kernels/blob/Fenix_new/MPI1/AMR/amr.c#L815 come from?

jeffhammond avatar Aug 17 '17 23:08 jeffhammond

049f59ab89c7ea4d7530d80a1b3729cfc558736a is when the definition of time_step was removed from that file...

commit 049f59ab89c7ea4d7530d80a1b3729cfc558736a
Author: rfvander <[email protected]>
Date:   Fri Jun 9 17:10:52 2017 -0700

    Making structure of MPI1 and FENIX version of AMR identical. This alo required making certain functions in header files static to avoid duplicates.

jeffhammond avatar Aug 17 '17 23:08 jeffhammond

Ah, I was on branch master. I will check this out later tonight. Sorry.

rfvander avatar Aug 17 '17 23:08 rfvander

It builds clean now but time_step declaration didn't match definition when I tried to fix it. Since it has 4700 arguments, I'll let you handle that issue.

jeffhammond avatar Aug 18 '17 05:08 jeffhammond

OK, I'll take care of it. The reason MPI1/AMR has the split structure is because I wanted to create a simple example for the Fenix version. I should have reverted to the original form, but somehow managed to delete it. But we could have a longer discussion about the desirability of the split. I purposely tried to keep the kernels flat (a single file and, if possible, a single function, except library calls), so that the compiler always has maximum knowledge without the user having to use compiler-specific flags to enable cross function and cross module optimization. This worked fine when the kernels were very simple. But AMR and PIC stretched that and couldn't reasonably be implemented as one function. So I introduced a few functions (smallest possible). Then Fenix came along, which was actually hurt by a flat file/single function. setjmp (in macro Fenix_Init) disables compiler optimizations within its lexical scope, so it had to be isolated. Most logical was to extract the portion that was time-critical, namely the time step. We could decide to isolate the time step in all kernels (this would be quite a lot). I prefer consistency, so all or nothing. Simplest at present is to let MPI1/AMR revert to the original structure for now, and only do the split for Fenix, where we have an overriding reason to do it.

rfvander avatar Aug 18 '17 05:08 rfvander