Lenan22
Results: 3 issues of Lenan22
Print the information for each optim_pass; this seems like it would make debugging easier
constexpr int a_sh_rd_delta_o = 2 * ((threads / 32) / (thread_n_blocks / 4)); 1. Does the 32 here refer to the warp size? 2. What does the 4 here mean? 3. What...
` int slice_col_par = (iters * blockIdx.x) / k_tiles;
int slice_col = slice_col_par;
// int slice_iters; // number of threadblock tiles in the current slice
int slice_count = 0; //...
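For the `a_sh_rd_delta_o` expression quoted in the second issue, a small worked evaluation may help when reasoning about the questions. This is only a sketch. It assumes that 32 is the NVIDIA warp size, so `threads / 32` counts the warps in a thread block. The helper function and the sample values (256 threads, 16 or 8 n-blocks) are hypothetical and chosen for illustration. The meaning of the 4 is left open, as in the question.

```cpp
// Hypothetical helper that mirrors the quoted expression; it is not taken
// from the kernel source and exists only to make the arithmetic explicit.
constexpr int a_sh_rd_delta_o(int threads, int thread_n_blocks) {
    // threads / 32        : warps per thread block (warp size is 32 on NVIDIA GPUs).
    // thread_n_blocks / 4 : integer division; it must be nonzero, so this
    //                       sketch requires thread_n_blocks >= 4.
    return 2 * ((threads / 32) / (thread_n_blocks / 4));
}

// Sample values are assumptions for illustration only.
// 256 threads -> 8 warps; 16 / 4 = 4; 8 / 4 = 2; 2 * 2 = 4.
static_assert(a_sh_rd_delta_o(256, 16) == 4, "worked example 1");
// 256 threads -> 8 warps; 8 / 4 = 2; 8 / 2 = 4; 2 * 4 = 8.
static_assert(a_sh_rd_delta_o(256, 8) == 8, "worked example 2");
```

Both divisions are integer divisions, so the result only changes when `threads` crosses a multiple of 32 or `thread_n_blocks` crosses a multiple of 4.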