cydoroga
Results: 2 issues of cydoroga
Hi! I have a batched matrix multiply problem with no fixed stride between batches. A minimal example is the following (all the matrices are RowMajor): I want to calculate $O...
question
Hi! I'm pretty new to CUTLASS (and CUDA, to be honest). I have a two-fold question: 1) I'm trying to apply dynamic parallelism with the launch of cutlass::gemm::device::Gemm under the hood....
question