Fused DWF + NVSHMEM

Open • hummingtree opened this issue 2 years ago • 1 comment

~~This PR makes the DWF fused kernels run with NVSHMEM. Running on one node on Selene with 1x2x2x2 and Ls = 12, getting (performance numbers are in GFLOPS)~~ No, there is no speedup.

hummingtree, Aug 04 '22 19:08

Thanks @maddyscientist. I have updated the doxygen to cover active, and also applied the changes to constantInv in https://github.com/lattice/quda/pull/1310/commits/4040ff814f9a4e8aa4b3088313e8cea103557aef. I have also tested the constantInv code path.

hummingtree, Aug 08 '22 15:08