DLA-Future
Permute Distributed: Avoid packing/unpacking for local permutations
PR #835 split the handling of permutations that are available locally from those that require communication.
Currently, local permutations skip the communication step, but they are still packed into and unpacked from the communication buffers. The implementation could be improved further by not packing/unpacking local permutations (simply skipping those copies) and applying them directly from input to output.
The benefit might be small, since the change only removes the extra copies made for local permutations.
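To make the proposal concrete, here is a minimal sketch of the idea on a flat array. It is not DLA-Future code: `PermEntry`, `packRemoteApplyLocal`, and the `is_local` flag are hypothetical names introduced only for illustration, and the real implementation works on tiles of a distributed matrix rather than on single elements. Local entries are copied straight from input to output, while only the entries that need communication are packed into the send buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of one permutation entry (not a DLA-Future type):
// element `src` of the input must end up in element `dst` of the output.
// `is_local` is true when source and destination live on the same rank.
struct PermEntry {
  std::size_t src;
  std::size_t dst;
  bool is_local;
};

// Hypothetical helper: applies local entries directly (input -> output) and
// packs only the remote entries into a send buffer, which is returned.
// Sending the buffer and unpacking on the receiving side are not shown.
std::vector<double> packRemoteApplyLocal(const std::vector<double>& in,
                                         std::vector<double>& out,
                                         const std::vector<PermEntry>& perm) {
  std::vector<double> send_buffer;
  for (const PermEntry& p : perm) {
    assert(p.src < in.size());
    if (p.is_local) {
      // Local permutation: no intermediate buffer, a single copy.
      assert(p.dst < out.size());
      out[p.dst] = in[p.src];
    }
    else {
      // Remote permutation: still has to go through the communication buffer.
      send_buffer.push_back(in[p.src]);
    }
  }
  return send_buffer;
}
```

Compared with packing everything, each local entry costs one copy instead of two (pack plus unpack), which is the saving discussed above.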