cutlass
Fix register index bug in mma.sync.aligned.m16n8k16
In include/cutlass/arch/mma_sm90.h, the inline PTX for mma.sync.aligned.m16n8k16 binds the wrong variable to operand %5: it should be A[1] but is currently A[2]. As a result, A[2] is passed twice and A[1] is not used at all.
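To illustrate the kind of mistake described above, here is a minimal host-only sketch, not the actual CUTLASS code. It models the asm input list as an array of A indices and checks that each element is bound exactly once. The names `Binding`, `kBuggy`, `kFixed`, and `each_element_used_once` are hypothetical. The count of eight A operands and the mapping of the first entry to %4 are assumptions made for illustration.

```cpp
#include <array>
#include <cstddef>

// Hypothetical model: entry i is the index of the A fragment element bound to
// asm placeholder %(4 + i). Eight A operands is an assumption, not taken from
// the header.
constexpr std::size_t kAOperands = 8;
using Binding = std::array<int, kAOperands>;

// Returns true when every A element 0..kAOperands-1 appears exactly once.
constexpr bool each_element_used_once(const Binding& binding) {
  for (std::size_t want = 0; want < kAOperands; ++want) {
    int hits = 0;
    for (std::size_t i = 0; i < kAOperands; ++i) {
      if (binding[i] == static_cast<int>(want)) ++hits;
    }
    if (hits != 1) return false;
  }
  return true;
}

// As described in this PR: %5 is bound to A[2], so A[2] appears twice.
constexpr Binding kBuggy = {0, 2, 2, 3, 4, 5, 6, 7};
// After the fix: %5 is bound to A[1].
constexpr Binding kFixed = {0, 1, 2, 3, 4, 5, 6, 7};

static_assert(!each_element_used_once(kBuggy), "A[2] duplicated, A[1] unused");
static_assert(each_element_used_once(kFixed), "each A element bound once");
```

A compile-time check along these lines could catch a duplicated operand without needing a GPU, though it only validates the index list and not the numerical result of the instruction.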
This PR has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this PR if it is no longer required. Otherwise, please respond with a comment indicating any updates. This PR will be labeled inactive-90d if there is no activity in the next 60 days.
@hwu36 this should be a very quick review (single typo).