questions about slice_col_par

Open Lenan22 opened this issue 10 months ago • 2 comments

```cpp
int slice_col_par = (iters * blockIdx.x) / k_tiles;
int slice_col = slice_col_par;
int slice_iters;      // number of threadblock tiles in the current slice
int slice_count = 0;  // total number of active threadblocks in the current slice
int slice_idx;        // index of threadblock in current slice; numbered bottom to top

if (slice_col_par >= n_tiles) {
```

I have some questions about the code above. Suppose the GPU has 108 SMs, the computed `iters` is 19, and `blockIdx.x` ranges from 0 to 127. Is `slice_col_par` computed directly from `iters = 19` for every thread block? For some blocks, for example `blockIdx.x = 5`, the thread block might not actually run 19 iterations.
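
To make the arithmetic in question concrete, here is a minimal host-side sketch of the index computation only. It is not the kernel's code. It assumes, without confirmation from the excerpt, that `iters` is the ceiling of the total tile count divided by the number of thread blocks, and that each block owns a contiguous run of up to `iters` tiles that walks down the k-tiles of a column before moving to the next column. `Stripe`, `stripe_for_block`, `ceildiv`, and the tile counts in the note below are hypothetical names and values chosen for illustration.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical record describing where a block's run of tiles starts
// and how many tiles it covers.
struct Stripe {
    int slice_row;      // first k-tile inside the starting column
    int slice_col_par;  // starting column, same formula as in the excerpt
    int tiles;          // tiles this block covers (may be fewer than iters)
};

// Integer ceiling division for positive operands.
int ceildiv(int a, int b) { return (a + b - 1) / b; }

// Assumption: block `block` owns the linear tile range
// [iters * block, min(iters * (block + 1), k_tiles * n_tiles)).
Stripe stripe_for_block(int block, int iters, int k_tiles, int n_tiles) {
    assert(block >= 0 && iters > 0 && k_tiles > 0 && n_tiles > 0);
    const int total = k_tiles * n_tiles;
    const int start = iters * block;                 // linear index of first tile
    const int end = std::min(start + iters, total);  // one past the last tile

    Stripe s;
    s.slice_row = start % k_tiles;
    s.slice_col_par = start / k_tiles;
    s.tiles = std::max(end - start, 0);  // 0 if the block starts past the end
    return s;
}
```

Under these assumptions, with 108 blocks, `k_tiles = 32`, and `n_tiles = 64`, `ceildiv(32 * 64, 108)` gives `iters = 19`. Block 5 starts at `slice_row = 31`, `slice_col_par = 2` and covers 19 tiles, while the last block (107) starts at `slice_row = 17`, `slice_col_par = 63` and covers only 15. In this model `iters` fixes every block's starting position, but it is only an upper bound on how many tiles a block processes.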

Lenan22, Apr 07 '24 13:04