cutlass
Avoid LDGSTS routing by changing the default copy to UniversalCopy
https://github.com/NVIDIA/cutlass/issues/1672
This PR changes the default copy to UniversalCopy so that the LDGSTS instruction is avoided by default. Downstream users who want it will need to specify the copy type explicitly, which is more intuitive.
Note: I imagine all of the tests and examples that are currently configured to use DefaultCopy will now need to transition to AutoVectorizingCopyWithAssumedAlignment<128>, as that was the previous copy type. This way we can preserve behavior across benchmarks and tests. Before I make a bunch of changes across files, though, I'd like to get feedback now in case I'm missing anything.
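
To make the intended opt-in pattern concrete, here is a minimal standalone sketch. It does not include any CUTLASS headers: the tag types only mirror the names used in this PR, and `copy_path` is a hypothetical helper invented for illustration. It does not model LDGSTS itself. It only shows that code relying on the default would get the plain copy, while tests and examples that want the previous behavior would name the vectorizing policy explicitly.

```cpp
#include <string>
#include <type_traits>

// Stand-in tag types. They mirror the names in this PR but are NOT the
// real CuTe definitions (assumption made for illustration only).
struct UniversalCopy {};

template <int MaxVecBits>
struct AutoVectorizingCopyWithAssumedAlignment {};

// Proposed default: a plain copy with no alignment assumption.
using DefaultCopy = UniversalCopy;

// Hypothetical helper: reports which path a given copy policy selects.
inline std::string copy_path(UniversalCopy) {
  return "plain copy";
}

template <int MaxVecBits>
std::string copy_path(AutoVectorizingCopyWithAssumedAlignment<MaxVecBits>) {
  return "vectorized copy, up to " + std::to_string(MaxVecBits) + " bits";
}

// With this change, asking for the default no longer implies vectorization.
static_assert(std::is_same_v<DefaultCopy, UniversalCopy>,
              "DefaultCopy should alias UniversalCopy in this sketch");
```

In this sketch, a call site that previously relied on the default would switch from `copy_path(DefaultCopy{})` to `copy_path(AutoVectorizingCopyWithAssumedAlignment<128>{})` to keep its earlier behavior.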