Reduction abstraction needs to be applied to choice of MPI reducer in `TunableReduction`
The reduction abstraction is presently broken for non-summation reductions. While the abstracted launch can be passed different reducers for the kernel, the subsequent MPI reduction always assumes that a summation is being performed. This will break, for example, force monitoring, computing the maximum element for half-precision multigrid, etc.
```cpp
if (!commAsyncReduction()) {
  arg.complete(result, stream);
  if (!activeTuning() && commGlobalReduction()) {
    // FIXME - this will break when we have non-summation reductions
    // (MG fixed point will break and so will force monitoring)
    comm_allreduce_array((double *)result.data(), result.size() * sizeof(T) / sizeof(double));
  }
}
```