ArborX icon indicating copy to clipboard operation
ArborX copied to clipboard

Add a faster implementation of the union-find for Serial

Open aprokop opened this issue 2 years ago • 1 comments

The patch tries to solve the problem of slow Serial atomics when Kokkos compiled with other backends. This patch is not strictly necessary right now, as one could compile Kokkos with just the Serial backend. However, it becomes necessary once I introduce a dendrogram construction algorithm using union-find, which will run MST construction on the device, but construct union-find on CPU using a single thread. In that context, it's not possible to fix it by just compiling Kokkos.

Drive-by change: labels_ -> _labels

aprokop avatar Oct 14 '22 19:10 aprokop

The following code for 20,000,000 edges went from 0.46 to 0.23 when using Serial backend but compiled with others.

    Kokkos::parallel_for(
        "Bechmark::union-find",
        Kokkos::RangePolicy<ExecutionSpace>(ExecutionSpace{}, 0, num_edges),
        KOKKOS_LAMBDA(int e) {
          int i = edges(e).source;
          int j = edges(e).target;

          for (int k : {i, j})
          {
            auto child = labels(union_find.representative(k));
            if (child != UNDEFINED)
              parents(child) = e;
          }

          union_find.merge(i, j);
          labels(union_find.representative(i)) = e;
        });

aprokop avatar Oct 14 '22 23:10 aprokop