cutlass
[FEA] Add Prefetching Hints Support for Global Memory Loading
Is your feature request related to a problem? Please describe.
CUTLASS already uses Level 2 (L2) prefetch hints for global-memory-to-register loads (LDG); see https://github.com/NVIDIA/cutlass/blob/main/include/cutlass/arch/memory.h#L161. CuTe, however, does not expose this feature. I suspect that L2 prefetch hints could help on older architectures and in some specific scenarios.
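To make the request concrete, here is a minimal sketch of a 128-bit global load carrying an L2 prefetch hint, written without either library. The names `SKETCH_HOST_DEVICE`, `Vec4`, and `load_128b_l2_hint` are hypothetical and exist only for this illustration; they are not CUTLASS or CuTe APIs. The sketch assumes the `.L2::128B` qualifier is only usable when compiling device code for sm_75 or newer with a sufficiently recent CUDA toolkit, so every other case (including plain host compilation) falls back to an ordinary load.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical macro for this sketch; CuTe uses CUTE_HOST_DEVICE for the same purpose.
#if defined(__CUDACC__)
#  define SKETCH_HOST_DEVICE __host__ __device__ inline
#else
#  define SKETCH_HOST_DEVICE inline
#endif

// Hypothetical 16-byte-aligned vector of four 32-bit words.
struct alignas(16) Vec4 {
  std::uint32_t x, y, z, w;
};

// Loads 128 bits from `src`. In device code for sm_75+ (assumption: the
// toolkit's PTX version accepts the qualifier) the load asks the L2 cache
// to prefetch a 128-byte line. Elsewhere it is a plain copy.
SKETCH_HOST_DEVICE Vec4 load_128b_l2_hint(Vec4 const* src) {
  // 128-bit vector loads require a 16-byte-aligned address.
  assert(reinterpret_cast<std::uintptr_t>(src) % 16 == 0);
  Vec4 dst;
#if defined(__CUDA_ARCH__) && (__CUDA_ARCH__ >= 750)
  asm volatile("ld.global.L2::128B.v4.b32 {%0, %1, %2, %3}, [%4];\n"
               : "=r"(dst.x), "=r"(dst.y), "=r"(dst.z), "=r"(dst.w)
               : "l"(src));
#else
  dst = *src;  // fallback: no prefetch hint available
#endif
  return dst;
}
```

In a kernel, each thread would call `load_128b_l2_hint` on its own 16-byte-aligned element of a global buffer. The hint only affects caching behavior, so the returned value should be the same on both paths; whether it improves performance would have to be measured per architecture.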
Describe the solution you'd like
We could implement SM75_xxx_LDG copy operations and their corresponding traits to address this:
#include <cute/arch/copy.hpp>
#include <cute/atom/copy_traits.hpp>

namespace cute {

// Proposed copy operation: one 128-bit global load with an L2 prefetch hint.
struct SM75_U32x4_LDG {
  using SRegisters = uint128_t[1];
  using DRegisters = uint32_t[4];

  CUTE_HOST_DEVICE static void
  copy(uint128_t const& gmem_src,
       uint32_t& dst0, uint32_t& dst1, uint32_t& dst2, uint32_t& dst3) {
    uint128_t const* gmem_ptr = &gmem_src;
    // Vectorized load of four 32-bit words; .L2::128B is the prefetch-size hint.
    asm volatile("ld.global.L2::128B.v4.b32 {%0, %1, %2, %3}, [%4];\n"
                 : "=r"(dst0), "=r"(dst1), "=r"(dst2), "=r"(dst3)
                 : "l"(gmem_ptr));
  }
};

// Proposed traits: a single thread issues the copy.
template <>
struct Copy_Traits<SM75_U32x4_LDG> {
  using ThrID = Layout<_1>;
  using SrcLayout = Layout<Shape<_1, Int<sizeof_bits<uint128_t>::value>>>;
  using DstLayout = Layout<Shape<_4, Int<sizeof_bits<uint32_t>::value>>>;
  using RefLayout = SrcLayout;
};

} // namespace cute
Do you consider it necessary to incorporate this feature, and what are your thoughts on the design?
@thakkarV
We have an L2 prefetch API coming soon. It should land in 3.4; let's revisit this after it ships :)
You mean 3.5? (3.4 has already been released.)
Whoops. 3.5. That's right :p
This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.
This issue has been labeled inactive-90d due to no recent activity in the past 90 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed.