MPI gather for TimeFunction

Open jkwashbourne opened this issue 2 years ago • 2 comments

MPI data_gather is broken for TimeFunction. MFE below:

# env DEVITO_LOGGING=DEBUG DEVITO_MPI=1 mpirun -n 2 python mfe_gather2.py

import numpy as np
from devito import Grid, TimeFunction, Eq, Operator, MPI

grid = Grid(extent=(1,1,1), shape=(11,11,11), origin=(0.0,0.0,0.0), dtype=np.float32)

ft = TimeFunction(name='ft', grid=grid, time_order=1, space_order=2)

op = Operator([Eq(ft.forward,ft+1)], name="Op")
op.apply(time_m=1, time_M=10)

# works: gathering a single time slice onto rank 0
ft_gather1 = ft.data[1]._gather(rank=0)

print("ft.data.shape;         ", ft.data.shape)
print("ft.data[1].shape;      ", ft.data[1].shape)
print("ft_gather1.data.shape; ", ft_gather1.data.shape)

# does not work: gathering the whole TimeFunction raises an error about dimensionality
ft_gather2 = ft.data_gather(rank=0)
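
Until data_gather handles the time dimension, one possible workaround is to gather slice by slice, reusing the per-slice `_gather` call that works in the example above. This is only a sketch under stated assumptions: `gather_time_slices` is a hypothetical helper name and not part of Devito, the leading axis of `ft.data` is assumed to be the (non-decomposed) time buffer, and `_gather` is a private method, so its behaviour (including what it returns on ranks other than `rank`) may differ between versions.

```python
def gather_time_slices(data, rank=0):
    """Gather a time-dependent array one time slice at a time.

    `data` is assumed to behave like `ft.data` in the example above:
    indexing its leading (time) axis yields an object that exposes the
    private `_gather(rank=...)` method. Hypothetical helper, not Devito API.
    """
    # Assumption: axis 0 is the time buffer and is not split across ranks.
    nslots = data.shape[0]
    # Called on every rank, since the gather is presumably a collective
    # operation; only the target rank is expected to use the results.
    return [data[t]._gather(rank=rank) for t in range(nslots)]


# Usage, under the same MPI launch as the example above:
#   slices = gather_time_slices(ft.data, rank=0)
```

On rank 0 the resulting list could then be combined with `numpy.stack` if a single array is needed.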

jkwashbourne • Mar 29 '22 04:03

Additional comment: isn't a very high-performance version of this gather method absolutely essential for practical use of MPI at scale?

jkwashbourne • Mar 29 '22 04:03

A fix for this issue is now present in https://github.com/devitocodes/devito/tree/fix_mpi_slicing_p3.

It just needs some tidying and tests added (and then to join the current PR backlog!!).

In the same branch I'll also add a high-performance gather implementation.

rhodrin • Jun 24 '22 14:06

I should note that this is now fixed in fix_mpi_slicing_p2. The PR needs some tidying but is mostly ready.

rhodrin • Aug 31 '22 17:08

Fixed by https://github.com/devitocodes/devito/pull/1949

georgebisbas • Sep 19 '22 11:09