CUDA_gemm

A simple, high-performance CUDA implementation of GEMM, block-sparse GEMM, and non-uniform quantized GEMM.

Results: 3 CUDA_gemm issues

I believe the bounds checks for matrices A and B are using the boundary of C: https://github.com/Cjkkkk/CUDA_gemm/blob/14b517370609d322647c55fe9136b6d81c2ba9a7/src/cuda/dense.cu#L107 and https://github.com/Cjkkkk/CUDA_gemm/blob/14b517370609d322647c55fe9136b6d81c2ba9a7/src/cuda/dense.cu#L125
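To illustrate the concern, here is a minimal sketch of a tiled GEMM in which each load is guarded by the bounds of the matrix being read. It is not the repository's kernel: the names `gemm_guarded`, `run_gemm`, and `TILE`, the row-major layout, and the shapes (A is M x K, B is K x N, C is M x N) are all assumptions made for the example.

```cuda
#include <cuda_runtime.h>
#include <vector>

constexpr int TILE = 16;  // hypothetical tile size, not taken from the repository

// Computes C (M x N) = A (M x K) * B (K x N); all matrices are row-major.
__global__ void gemm_guarded(const float* A, const float* B, float* C,
                             int M, int N, int K) {
    __shared__ float As[TILE][TILE];
    __shared__ float Bs[TILE][TILE];

    int row = blockIdx.y * TILE + threadIdx.y;  // row of C, and of A
    int col = blockIdx.x * TILE + threadIdx.x;  // column of C, and of B
    float acc = 0.0f;

    for (int k0 = 0; k0 < K; k0 += TILE) {
        int a_col = k0 + threadIdx.x;  // column of A: bounded by K, not by N
        int b_row = k0 + threadIdx.y;  // row of B: bounded by K, not by M

        // Guard each load with the bounds of the matrix being read,
        // and pad out-of-range elements with zeros.
        As[threadIdx.y][threadIdx.x] =
            (row < M && a_col < K) ? A[row * K + a_col] : 0.0f;
        Bs[threadIdx.y][threadIdx.x] =
            (b_row < K && col < N) ? B[b_row * N + col] : 0.0f;
        __syncthreads();

        for (int k = 0; k < TILE; ++k) {
            acc += As[threadIdx.y][k] * Bs[k][threadIdx.x];
        }
        __syncthreads();
    }

    // Only the store is guarded by the bounds of C.
    if (row < M && col < N) {
        C[row * N + col] = acc;
    }
}

// Host wrapper: copies inputs to the device, launches the kernel, and copies
// the result back. CUDA error checking is omitted to keep the sketch short.
std::vector<float> run_gemm(const std::vector<float>& A,
                            const std::vector<float>& B,
                            int M, int N, int K) {
    float *dA = nullptr, *dB = nullptr, *dC = nullptr;
    cudaMalloc(&dA, sizeof(float) * M * K);
    cudaMalloc(&dB, sizeof(float) * K * N);
    cudaMalloc(&dC, sizeof(float) * M * N);
    cudaMemcpy(dA, A.data(), sizeof(float) * M * K, cudaMemcpyHostToDevice);
    cudaMemcpy(dB, B.data(), sizeof(float) * K * N, cudaMemcpyHostToDevice);

    dim3 block(TILE, TILE);
    dim3 grid((N + TILE - 1) / TILE, (M + TILE - 1) / TILE);  // ceil-divide
    gemm_guarded<<<grid, block>>>(dA, dB, dC, M, N, K);

    std::vector<float> C(static_cast<size_t>(M) * N);
    cudaMemcpy(C.data(), dC, sizeof(float) * M * N, cudaMemcpyDeviceToHost);
    cudaFree(dA);
    cudaFree(dB);
    cudaFree(dC);
    return C;
}
```

If the loads from A and B were instead guarded with `row < M && col < N`, the kernel could read out of range (or skip valid elements) whenever K differs from M or N, which is the kind of problem that non-square or non-tile-aligned shapes tend to expose.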

Added CMake compilation options to the project.

See https://github.com/Cjkkkk/CUDA_gemm/issues/6.