
cublasLt SYRK example

Open capybara-club opened this issue 2 years ago • 0 comments

A SYRK example for cublasLt would be really useful, i.e. computing matmul(A, A transpose).

One of the cublasLtMatmulAlgoCapAttributes_t values describes uplo support and mentions SYRK. However, I don't know how to guarantee that an algo takes advantage of A and B referring to the same memory.

Do we pass NULL for B? Do we pass A for B and let the algo recognize that the pointers are the same? Maybe the gain from sharing one buffer is small, and the algo is faster mainly because only the upper or lower triangle is filled?

The cublasSsyrk call makes this clear, since it takes only A and a fill mode. I would like to know what the equivalent is for cublasLt.
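
To make the question concrete, here is a minimal CPU sketch in plain C++ of the two behaviors being compared. It makes no cuBLAS or cublasLt calls, and the function names `syrk_lower_ref` and `gemm_aat_ref` are hypothetical helpers, not library APIs. It assumes column-major storage and the non-transposed case, C = alpha * A * A^T + beta * C.

```cpp
#include <vector>

// Hypothetical reference for SYRK-style semantics (not a cuBLAS API).
// A is n x k, column-major, leading dimension lda.
// Only the lower triangle of the n x n matrix C is read and written;
// the strictly upper triangle is left untouched.
void syrk_lower_ref(int n, int k, float alpha,
                    const std::vector<float>& A, int lda,
                    float beta, std::vector<float>& C, int ldc) {
    for (int j = 0; j < n; ++j) {
        for (int i = j; i < n; ++i) {          // rows at or below the diagonal
            float acc = 0.0f;
            for (int p = 0; p < k; ++p) {
                acc += A[i + p * lda] * A[j + p * lda];
            }
            C[i + j * ldc] = alpha * acc + beta * C[i + j * ldc];
        }
    }
}

// Hypothetical reference for a general matmul with B = A (not a cuBLAS API).
// Every element of C is computed, so roughly twice the multiply-adds.
void gemm_aat_ref(int n, int k, float alpha,
                  const std::vector<float>& A, int lda,
                  float beta, std::vector<float>& C, int ldc) {
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < n; ++i) {          // all rows
            float acc = 0.0f;
            for (int p = 0; p < k; ++p) {
                acc += A[i + p * lda] * A[j + p * lda];
            }
            C[i + j * ldc] = alpha * acc + beta * C[i + j * ldc];
        }
    }
}
```

The first function is what the uplo variant would save: about half of the output elements are skipped. Whether a cublasLt algo can also exploit A and B being the same buffer is the part I can't work out.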

Thanks!

capybara-club, Dec 17 '23 17:12