warp
[DOCS] Document tile element writing synchronization overhead caveats
Category
A_kk_tile[row, col] = value
Writing a tile element this way involves thread synchronization, which can cause kernels to take longer than expected if not all threads in the block participate in the operation. The synchronization is necessary, however, in case downstream work needs to access these values.