cydoroga

2 issues by cydoroga

Hi! I have a batched matrix multiply problem with no fixed stride between batches. A minimal example is the following (all the matrices are RowMajor): I want to calculate $O...

question

Hi! I'm pretty new to CUTLASS (and CUDA, to be honest). I have a two-fold question: 1) I'm trying to apply dynamic parallelism by launching cutlass::gemm::device::Gemm under the hood....

question