NVIDIA_SGEMM_PRACTICE
Results differ from cublas
Dear @wangzyon, when I set the matrix size to 3 or 9, the matrix multiplication results differ significantly from cuBLAS. How can we solve this?
The results of kernels 1-5 differ from cuBLAS; I know the reason for that.
The results of kernels 6-7, however, test as identical to cuBLAS but differ from NumPy (in Python). Why?
I generated two 16x16 matrices, A and B, and computed their product with kernel 7 and with NumPy.
data:image/s3,"s3://crabby-images/9d61d/9d61dd26afefb2637ecb195f1f89f77326c9de2f" alt="image"
data:image/s3,"s3://crabby-images/57a7b/57a7ba64f35ba70b9fdb54f530c49586e32e2d75" alt="image"
The following figure shows the result calculated by kernel 7.
data:image/s3,"s3://crabby-images/14a0e/14a0eeffdc73fa4ed5c5ee9b25aeb4ec0476d313" alt="image"
It is significantly different from the NumPy result:
data:image/s3,"s3://crabby-images/d5d67/d5d67fafc227b1718277ee86635ccedffd0550a4" alt="image"
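One way to narrow down a mismatch like this is to test whether the kernel output equals a transposed or operand-swapped product. That would point to a row-major versus column-major layout difference (cuBLAS assumes column-major storage, NumPy defaults to row-major) rather than a numerical error. The sketch below is only a diagnostic under assumptions: `diagnose` is a hypothetical helper, the matrices are assumed square, and `C_kernel` is a stand-in computed in NumPy, not real output from kernel 7.

```python
import numpy as np

def diagnose(a, b, c, rtol=1e-4, atol=1e-5):
    """Return the names of the candidate products that match c.

    a, b, c are 2-D arrays as the Python side sees them (row-major).
    Hypothetical helper for square matrices; not part of the repository.
    """
    a = np.asarray(a, dtype=np.float32)
    b = np.asarray(b, dtype=np.float32)
    c = np.asarray(c, dtype=np.float32)
    candidates = {
        # Every buffer is interpreted as row-major, as NumPy does by default.
        "A @ B": a @ b,
        # Only the output buffer is interpreted in the other layout.
        "(A @ B).T": (a @ b).T,
        # Inputs and output are all interpreted as column-major.
        "B @ A": b @ a,
        # Only the input buffers are interpreted as column-major.
        "(B @ A).T": (b @ a).T,
    }
    return [
        name
        for name, ref in candidates.items()
        if ref.shape == c.shape and np.allclose(c, ref, rtol=rtol, atol=atol)
    ]

rng = np.random.default_rng(0)
A = rng.standard_normal((16, 16)).astype(np.float32)
B = rng.standard_normal((16, 16)).astype(np.float32)

# Stand-in for the kernel result (assumption): pretend the GPU side read
# and wrote every buffer as column-major.
C_kernel = B @ A

print(diagnose(A, B, C_kernel))
```

To use it on the real data, load the A, B, and C values dumped from the CUDA program in place of the stand-in. If none of the candidates match, the difference is probably not a simple layout issue.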