
output of reduce_scatter is incorrect

Open: wdfnst opened this issue 3 years ago • 1 comment

#include <iostream>
#include <memory>
#include <vector>

#include "gloo/allreduce_ring.h"
#include "gloo/reduce_scatter.h"
#include "gloo/rendezvous/context.h"
#include "gloo/rendezvous/file_store.h"
#include "gloo/rendezvous/prefix_store.h"
#include "gloo/transport/tcp/device.h"
int main(){
  // NOTE: the creation of `context` (rendezvous via PREFIX/SIZE/RANK) is not shown in this report.
  int num_elements = 12;
  int buffer_data[] = {1, 2, 3, 4, 5, 6, 11, 12, 13, 14, 15, 16};
  std::vector<int*> sendbuf;
  std::cout << context->rank << "before-send:";
  for (int i = 0; i < num_elements; i++) {
      sendbuf.push_back( &((int*)buffer_data)[i]);
//       std::cout << ((int*)buffer_data)[i] << " ";
      std::cout << *(sendbuf[i]) << " ";
  }
  std::cout << std::endl;
  std::vector<int> recvcountsbuff({6, 6});
  gloo::ReduceScatterHalvingDoubling<int> rs_hd(
          context,
          sendbuf,
          num_elements,
          recvcountsbuff
          );
  rs_hd.run();
  std::cout << context->rank << "after-send:";
  for (int i = 0; i < num_elements / 2; i++) {
      std::cout << ((int*)buffer_data)[i] << " ";
//       std::cout << *(sendbuf[i]) << " ";
  }
  std::cout << std::endl;
  return 0;
}

compile:

g++ -lstdc++ --std=c++11 example2.cc libgloo.a -ldl -pthread -o example2

run on node-1:

env PREFIX="aaa" SIZE=2 RANK=0 ./example2

run on node-2:

env PREFIX="aaa" SIZE=2 RANK=1 ./example2

expected output:

rank-0: [2, 4, 6, 8, 10, 12]
rank-1: [22, 24, 26, 28, 30, 32]

actual output:

rank-0: [862031072 862031072 862031072 862031072 862031072 862031072]
rank-1: [197264104 197264104 197264104 197264104 197264104 197264104]

wdfnst, Mar 23 '21 07:03

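Your arguments to the constructor of ReduceScatterHalvingDoubling don't make sense as written: sendbuf holds one pointer per element, but the constructor expects one pointer per buffer, with each buffer holding num_elements values.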

Something like the following should work:

  gloo::ReduceScatterHalvingDoubling<int> rs_hd(
          context,
          std::vector<int*>{buffer_data},  // <- vector holding just one pointer
          num_elements,
          recvcountsbuff
          );
  rs_hd.run();

Afterwards buffer_data will hold the scattered reduced data.

maxhgerlach, Dec 02 '21 00:12