Reduction abstraction needs to be applied to choice of MPI reducer in `TunableReduction`

Open maddyscientist opened this issue 5 years ago • 0 comments

The reduction abstraction is presently broken for non-summation reductions. While the abstracted launch can be passed different reducers for the kernel, the MPI reduction still assumes that a summation is being performed. This will break, for example, force monitoring and the maximum-element computation for half-precision multigrid, among other things.

      if (!commAsyncReduction()) {
        arg.complete(result, stream);
        if (!activeTuning() && commGlobalReduction()) {
          // FIXME - this will break when we have non-summation reductions (MG fixed point will break and so will force monitoring)
          comm_allreduce_array((double*)result.data(), result.size() * sizeof(T) / sizeof(double));
        }
      }
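
One possible direction, sketched here without MPI so it can be compiled standalone: let the reducer type that is already passed to the kernel launch also select the cross-rank combine operation. Everything in this sketch is hypothetical and not QUDA's actual API: `plus_reducer`, `max_reducer`, `min_reducer`, and `allreduce_sketch` are illustrative names, and the "ranks" are simply a vector of per-rank result arrays. In a real implementation the same dispatch would presumably choose between `MPI_SUM`, `MPI_MAX`, and `MPI_MIN` in the underlying `MPI_Allreduce` call.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reducer tags, standing in for the reducers given to the kernel launch.
template <typename T> struct plus_reducer {
  static T apply(T a, T b) { return a + b; }
};
template <typename T> struct max_reducer {
  static T apply(T a, T b) { return std::max(a, b); }
};
template <typename T> struct min_reducer {
  static T apply(T a, T b) { return std::min(a, b); }
};

// Stand-in for an all-reduce: combines the per-rank result arrays element-wise
// using the SAME reducer as the kernel, instead of always summing.
// (Assumption: a real version would map Reducer to the matching MPI_Op.)
template <typename Reducer, typename T>
std::vector<T> allreduce_sketch(const std::vector<std::vector<T>> &per_rank)
{
  assert(!per_rank.empty());
  std::vector<T> out = per_rank.front();
  for (std::size_t r = 1; r < per_rank.size(); r++) {
    assert(per_rank[r].size() == out.size());
    for (std::size_t i = 0; i < out.size(); i++) out[i] = Reducer::apply(out[i], per_rank[r][i]);
  }
  return out;
}
```

With this shape, a call such as `allreduce_sketch<max_reducer<double>>(ranks)` returns the element-wise maximum across ranks, whereas the current code path would effectively always behave like `plus_reducer`.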

maddyscientist avatar Nov 01 '20 20:11 maddyscientist