
Is it possible to create a local matrix per MPI process

Open Goon83 opened this issue 5 years ago • 6 comments

Hi, is it possible to create a local matrix per MPI process, similar to using MPI_COMM_SELF for file creation? Each MPI process would then access only its own data.

Thanks, Bin

Goon83 avatar Feb 03 '20 16:02 Goon83

@Goon83 I have not tried this but maybe something along these lines works:

auto team = dash::Team::All().split(dash::util::Locality::Scope::Unit, dash::size());

This should give you teams containing a single unit each. You can pass such a team to the c'tor of the matrix, and that should create an independent matrix in each unit.

devreal avatar Feb 03 '20 17:02 devreal

Hi @devreal, I tried the following code but it reports the error below. Any hint on how to write it the right way? I followed the doc and provided a single argument to split, but I am not sure whether that is right.

Code: auto team_local = dash::Team::All().split(dash::Team::All().size());

Error:

 error: calling a protected constructor of class 'dash::Team'
  auto team_local = dash::Team::All().split(dash::Team::All().size());

Goon83 avatar Feb 18 '20 19:02 Goon83

Just realized that split returns a Team&. Can you try auto& instead?

auto& team_local = dash::Team::All().split(dash::Team::All().size());

devreal avatar Feb 18 '20 23:02 devreal

Hi @devreal, thanks for the information. I came up with a simple test program, but it still fails with an error. See below for details. Did I do something wrong?

Bests, Bin

Sample code:

int main(int argc, char *argv[])
{
    dash::init(&argc, &argv);

    dash::TeamSpec<2> teamspec;
    teamspec.balance_extents();

    dash::global_unit_t myid = dash::myid();

    size_t rows = 8;
    size_t cols = 8;

    auto &team_local = dash::Team::All().split(dash::Team::All().size());
    dash::Matrix<int, 2> matrix_local(dash::SizeSpec<2>(rows, cols), dash::DistributionSpec<2>(), team_local, teamspec);

    if (0 == myid)
    {
        cout << "matrix_local size: " << matrix_local.extent(0)
             << " x " << matrix_local.extent(1)
             << " == " << matrix_local.size()
             << endl;
    }

    dash::Team::All().barrier();

    for (size_t i = 0; i < rows; i++)
    {
        for (size_t k = 0; k < cols; k++)
        {
            matrix_local[i][k] = myid;
        }
    }

    for (size_t i = 0; i < rows; i++)
    {
        for (size_t k = 0; k < cols; k++)
        {
            int value = matrix_local[i][k];
            int expected = myid;
            DASH_ASSERT(expected == value);
        }
    }

    int value = matrix_local[5][5];
    cout << value << " at rank " << myid << "\n";

    dash::Team::All().barrier();

    dash::finalize();

    return 0;
}


Error Info:

mpirun -n 2 ./test-dash-local

matrix_local size: 8 x 8 == 64
Fatal error in MPI_Put: Invalid rank, error stack:
MPI_Put(161): MPI_Put(origin_addr=0x7ffeec1e525c, origin_count=1, MPI_INT, target_rank=1, target_disp=0, target_count=1, MPI_INT, win=0xa0000003) failed
MPI_Put(136): Invalid rank has value 1 but must be nonnegative and less than 1
Fatal error in MPI_Put: Invalid rank, error stack:
MPI_Put(161): MPI_Put(origin_addr=0x7ffee3dec25c, origin_count=1, MPI_INT, target_rank=1, target_disp=0, target_count=1, MPI_INT, win=0xa0000003) failed
MPI_Put(136): Invalid rank has value 1 but must be nonnegative and less than 1

Goon83 avatar Feb 19 '20 19:02 Goon83

Sorry, I didn't see your reply until just now. The problem is that your teamspec uses a different team (dash::Team::All() is the default if you don't pass any team).

This should fix it:

int main(int argc, char *argv[])
{
    dash::init(&argc, &argv);

    dash::global_unit_t myid = dash::myid();

    size_t rows = 8;
    size_t cols = 8;

    auto &team_local = dash::Team::All().split(dash::Team::All().size());

    dash::TeamSpec<2> teamspec{team_local};
    teamspec.balance_extents();

    dash::Matrix<int, 2> matrix_local(dash::SizeSpec<2>(rows, cols), dash::DistributionSpec<2>(), team_local, teamspec);

    if (0 == myid)
    {
        cout << "matrix_local size: " << matrix_local.extent(0)
             << " x " << matrix_local.extent(1)
             << " == " << matrix_local.size()
             << endl;
    }

    dash::Team::All().barrier();

    for (size_t i = 0; i < rows; i++)
    {
        for (size_t k = 0; k < cols; k++)
        {
            matrix_local[i][k] = myid;
        }
    }

    for (size_t i = 0; i < rows; i++)
    {
        for (size_t k = 0; k < cols; k++)
        {
            int value = matrix_local[i][k];
            int expected = myid;
            DASH_ASSERT(expected == value);
        }
    }

    int value = matrix_local[5][5];
    cout << value << " at rank " << myid << "\n";

    dash::Team::All().barrier();

    dash::finalize();

    return 0;
}

We should think about changing the pattern and matrix interface to disallow passing a team and a teamspec to the c'tor to prevent this kind of ambiguity.

devreal avatar Feb 22 '20 09:02 devreal

Hi @devreal, I copied and pasted the new code and it still reports the error below. Do you have any idea why this error happens? Thanks.

Bests, Bin

[    0 ERROR ] [ 92091521.343 ] dart_globmem.c           :432  !!! DART: dart_team_memalloc_aligned_full ! Unknown team -1
[    0 ERROR ] [ 11536 ] AllocationPolicy.h       :176  | GlobalAllocationPolicy.do_global_allocate(nlocal)| cannot allocate global memory segment 256
libc++abi.dylib: terminating with uncaught exception of type std::bad_alloc: std::bad_alloc
Abort trap: 6

Goon83 avatar Feb 24 '20 18:02 Goon83