[FEA] Specify L2 cache eviction in TMA copy
Which component requires the feature?
CuTe DSL
Feature Request
I'd love to be able to control the L2 cache eviction policy (e.g. evict_first, evict_last) for TMA loads and TMA stores.
Additional context
This is important for some attention kernels, as we used it in FA3, e.g. here: https://github.com/Dao-AILab/flash-attention/blob/413d07e9deef1e3c793c7de59d7146b43ae4d558/hopper/mainloop_fwd_sm90_tma_gmma_ws.hpp#L753
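To make the request concrete, here is a rough pure-Python sketch of the kind of knob I have in mind. Everything in it is hypothetical: `CacheHint` and `pick_cache_hint` are illustrative names, not existing CuTe DSL APIs, and the constants are my recollection of the values in the CUTLASS C++ enum `cute::TMA::CacheHintSm90`, so please check them against the headers before relying on them.

```python
from enum import Enum


class CacheHint(Enum):
    """Hypothetical Python mirror of the C++ enum cute::TMA::CacheHintSm90.

    The 64-bit values are assumed to match the CUTLASS headers; verify them.
    """
    EVICT_NORMAL = 0x1000000000000000
    EVICT_FIRST = 0x12F0000000000000
    EVICT_LAST = 0x14F0000000000000


def pick_cache_hint(reused: bool) -> CacheHint:
    """Hypothetical helper: choose a hint based on expected reuse of a tile.

    Tiles that will be reread (e.g. by other CTAs) should stay in L2 longer,
    while tiles that are streamed once can be evicted first.
    """
    return CacheHint.EVICT_LAST if reused else CacheHint.EVICT_FIRST


# Example: one tile that is streamed once, one that is expected to be reread.
streamed = pick_cache_hint(reused=False)
shared = pick_cache_hint(reused=True)
print(streamed.name, hex(streamed.value))
print(shared.name, hex(shared.value))
```

Ideally the TMA load and store atoms would accept such a hint as an optional argument and default to normal eviction, so existing kernels keep their current behavior.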
Thanks for reporting this issue. We will add this feature ASAP.