CUDA_gemm

Published 20 hours ago • Cjkkkk

Metadata

A simple high performance CUDA GEMM, Block Sparse GEMM and Non-uniform Quantized GEMM implementation.

Readme
Issues

Results: 3 CUDA_gemm issues

Wrong boundary conditions

I believe you are using the boundary of C for matrices A and B:
https://github.com/Cjkkkk/CUDA_gemm/blob/14b517370609d322647c55fe9136b6d81c2ba9a7/src/cuda/dense.cu#L107
https://github.com/Cjkkkk/CUDA_gemm/blob/14b517370609d322647c55fe9136b6d81c2ba9a7/src/cuda/dense.cu#L125

gillbam

Added CMakeLists.txt compilation options

Added CMake compilation options to the project.

archwine

fixed the boundary condition

See https://github.com/Cjkkkk/CUDA_gemm/issues/6.

ArrogantGao

About

Stars: 63
Forks: 11
Owner: Cjkkkk
