cutlass

CUDA Templates for Linear Algebra Subroutines

Results: 608 cutlass issues

As group convolution is an important operator in CV models (e.g., in ResNeXt: https://arxiv.org/pdf/1611.05431.pdf), is there any plan to support it in a future release? Thanks a lot!

feature request

Hi, is there any support for DP Tensor Cores in cutlass, either available or foreseen? Thanks in advance

question
inactive-30d

I'm trying to convert the data type of a tensor on the GPU; I think this should be faster than on the CPU. Also, I'll need to do it several times between...

question
inactive-30d

Hi! I am learning 'tall' matmul and find it **hard to find the code** describing how slice-K reduces the values.... I think each warp will calculate 32*64 values (each...

question

**Describe the bug** After using `add_subdirectory` on CUTLASS in a CMake project, I'd expect CUTLASS's transitive target(s) to be exported for use in other projects. I have tried the following...

bug

I modified the epilogue function in Example 17 from LinearCombinationRelu to LinearCombinationSilu, like this: using EpilogueOp = cutlass::epilogue::thread::LinearCombinationSilu< ElementOutput, // Data type of output matrix. 128 / cutlass::sizeof_bits<ElementOutput>::value, // The...

feature request
inactive-30d

Hi, NVIDIA team! Thank you for your awesome work! I am a newbie to this area and I have been trying to learn this framework for months and still made...

documentation
inactive-30d

Hi! I am learning cutlass, and I see something like this (from the official post): ```C++ /// CUTLASS SGEMM example __global__ void gemm_kernel( float *C, float const *A, float...

question
inactive-30d

Hi! I'm pretty new to CUTLASS (and CUDA, to be honest). I have a two-fold question: 1) I'm trying to apply dynamic parallelism with the launch of cutlass::gemm::device::Gemm under the hood....

question

Hi! I am learning cutlass, and I read this post: [CUTLASS: Fast Linear Algebra in CUDA C++ | NVIDIA Technical Blog](https://developer.nvidia.com/blog/cutlass-linear-algebra-cuda/) But I cannot find the official “dispatch_policies.h”; I only find...

question
inactive-30d