
MPI source error running make

Open Joseph459459 opened this issue 3 years ago • 4 comments

When I try to run make && make install, I run into the following type of error in DParallelSetUp.f90:

  501 | CALL MPI_ALLGATHER(in1,1,MPI_INT_HIGH, &
      | 1
......
  523 | CALL MPI_ALLGATHER(in1,1,MPI_DOUBLE_PRECISION, &
      | 2
Error: Type mismatch between actual argument at (1) and actual argument at (2) (INTEGER(4)/REAL(8)).

I am new to Fortran - is there no implicit type-casting?

Attempt at solution: I am on an HPC cluster, so I tried to make sure that different MPI versions weren't conflicting. The build itself completed without error. There is only a deprecation warning for CMake, and it also tells me it isn't using my manually specified Fortran compiler. The openmpi version is 4.1.2 and gcc is 10. fftw3 is version 3.3.9 and uses openmpi 4.0.5 and gcc 10.2.

Joseph459459 avatar May 20 '22 00:05 Joseph459459

Hi! In Fortran there's no implicit type casting in method signatures.

Can you attach the command history and output? I suspect the cause is the Fortran compiler that you're being told is different from the one you're trying to manually specify.

(Another issue here is that we are doing 'include mpif.h' (to support some legacy libs/compilers) to import the MPI lib routines, rather than the more modern 'use MPI' which does proper compile-time checking, so it's probably only caught at the link stage? Changing this won't fix the issue you're having though, I don't think...)

mightylorenzo avatar May 21 '22 00:05 mightylorenzo

Hi mightylorenzo,

I have successfully compiled Puffin using gcc version 8.2, openmpi version 2.1, and HDF5 1.8.

However, now I'm encountering the following kind of linking errors caused by HDF5:

[  1%] Linking Fortran executable puffin
CMakeFiles/puffin.dir/H5in.f90.o: In function '__h5in_MOD_readh5fieldfilesingledump':
H5in.f90:(.text+0x1be): undefined reference to 'h5pset_fapl_mpio_f_'
H5in.f90:(.text+0xc27): undefined reference to 'h5pset_dxpl_mpio_f_'
H5in.f90:(.text+0x1115): undefined reference to 'h5pset_dxpl_mpio_f_'

Here is the corresponding cmake output log for the successful compile / unsuccessful linking: cmake-puffin.txt

It would seem that all the correct libraries are recognized by scimake. Is this perhaps an HDF5 version compatibility issue?

EDIT:

I believe the issue is that this HDF5 was not built with parallel (MPI) support. I will try a parallel version and then report back.
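A rough way to check this before rebuilding, as a sketch only: the HDF5 compiler wrappers (h5cc, or h5pcc for parallel builds) print the library's build settings with -showconfig, and those settings include a "Parallel HDF5" line. The helper name is_parallel_hdf5 is hypothetical, and the exact wording of that line may differ between HDF5 versions and build systems, so the pattern below is an assumption.

```shell
# Hypothetical helper: read `h5cc -showconfig` style text on stdin and report
# whether the build was configured with parallel (MPI) support.
# Assumption: the settings contain a line like "Parallel HDF5: yes" (or "ON").
is_parallel_hdf5() {
    if grep -Eiq 'Parallel HDF5:[[:space:]]*(yes|on)'; then
        echo "parallel"
    else
        echo "serial"
    fi
}

# Only query a real wrapper if it is on PATH.
for wrapper in h5pcc h5cc; do
    if command -v "$wrapper" >/dev/null 2>&1; then
        echo "$wrapper: $("$wrapper" -showconfig | is_parallel_hdf5)"
    fi
done
```

If this reports "serial" for the HDF5 that the build picks up, the undefined h5pset_*_mpio_f references above would be expected, since those routines only exist in parallel builds.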

Joseph459459 avatar May 23 '22 15:05 Joseph459459

After using the parallel version of HDF5, the program works.

However, I tried using openmpi 4.1 and gcc 10 again on my university's brand-new cluster, and it failed with the same compile error described in my first comment.

Building and running with openmpi 4.0 and gcc 9 was successful. Could some niche update be causing the aforementioned type mismatch?

Attached is the cmake/scimake log for the failed compile: cmake_log.txt

Joseph459459 avatar May 24 '22 15:05 Joseph459459

It could be, although those constants are supplied by the MPI lib itself. Usually I see those types of errors when scimake/cmake gets confused about which version of each lib to point to. If possible, can you post the whole compile/link command history and output (including the output of running make etc.)?
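One thing that may be worth ruling out in the meantime: as of GCC 10, gfortran reports calls to the same external procedure with different argument types as an error rather than accepting them, which is the situation with MPI routines pulled in via mpif.h. GCC 10 also added the -fallow-argument-mismatch option to turn that error back into a warning. Below is a minimal sketch of picking the flag by compiler version. The helper name extra_fflags is hypothetical, and passing the flag through CMAKE_Fortran_FLAGS is an assumption about how this scimake build picks up compiler flags.

```shell
# Hypothetical helper: print the extra flag gfortran needs for a given major version.
# gfortran 10 and later reject mismatched arguments to implicit-interface procedures.
extra_fflags() {
    if [ "$1" -ge 10 ]; then
        echo "-fallow-argument-mismatch"
    fi
}

# Only query the compiler if it is actually on PATH.
if command -v gfortran >/dev/null 2>&1; then
    version="$(gfortran -dumpversion)"   # e.g. "10" or "10.2.0"
    major="${version%%.*}"
    # Assumption: the build honours the standard CMAKE_Fortran_FLAGS variable.
    echo "suggested: cmake -DCMAKE_Fortran_FLAGS=\"$(extra_fflags "$major")\" <usual options>"
fi
```

With gcc 9 the helper prints nothing, which would be consistent with the gcc 9 build succeeding unchanged.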

mightylorenzo avatar May 26 '22 19:05 mightylorenzo