
gemm_tr, gemm_ad do not use matrixmultiply

Open Andlon opened this issue 5 years ago • 3 comments

It seems that gemm_tr and gemm_ad currently do not leverage matrixmultiply for larger matrices. I noticed this while profiling after making changes to some of my performance-sensitive code. In fact, pre-computing let a_t = a.transpose() and then calling gemm(1.0, &a_t, &b, 1.0) was significantly faster than calling gemm_tr directly.

Andlon avatar Apr 13 '20 15:04 Andlon

Hi!

That's right, they don't use matrixmultiply currently. I suppose we could make gemm_tr work with matrixmultiply by adjusting the row and col strides accordingly. The gemm_ad method on the other hand can't use matrixmultiply (except for f32 and f64 matrices in which case this is equivalent to gemm_tr) because it does not support complex numbers.
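To illustrate the stride idea mentioned above, here is a minimal, std-only sketch. It is not nalgebra's or matrixmultiply's actual code: gemm_strided is a naive stand-in loosely modeled on a stride-based GEMM interface, and tr_mul_col_major is a hypothetical helper name. It assumes column-major storage and shows that A^T * B can be computed by swapping the row and column strides of A, without copying A.

```rust
/// Naive strided GEMM: C = alpha * A * B + beta * C.
/// A is m x k, B is k x n, C is m x n. Element (i, j) of each matrix is
/// read at index `i * row_stride + j * col_stride` in its buffer.
/// (Illustrative stand-in, not an optimized kernel.)
fn gemm_strided(
    m: usize, k: usize, n: usize,
    alpha: f64,
    a: &[f64], rsa: usize, csa: usize,
    b: &[f64], rsb: usize, csb: usize,
    beta: f64,
    c: &mut [f64], rsc: usize, csc: usize,
) {
    for i in 0..m {
        for j in 0..n {
            let mut acc = 0.0;
            for p in 0..k {
                acc += a[i * rsa + p * csa] * b[p * rsb + j * csb];
            }
            let idx = i * rsc + j * csc;
            c[idx] = alpha * acc + beta * c[idx];
        }
    }
}

/// Hypothetical helper: computes A^T * B for column-major inputs, where
/// A is `a_rows x a_cols` and B is `a_rows x b_cols`. Returns the
/// `a_cols x b_cols` result in column-major order.
fn tr_mul_col_major(
    a: &[f64], a_rows: usize, a_cols: usize,
    b: &[f64], b_cols: usize,
) -> Vec<f64> {
    let (m, k, n) = (a_cols, a_rows, b_cols);
    let mut c = vec![0.0; m * n];
    // Column-major A has (row stride, col stride) = (1, a_rows).
    // Reading the same buffer as A^T just swaps them: (a_rows, 1).
    gemm_strided(
        m, k, n,
        1.0,
        a, a_rows, 1, // A^T via swapped strides, no copy
        b, 1, k,      // B, plain column-major
        0.0,
        &mut c, 1, m, // C, plain column-major
    );
    c
}
```

For example, with the 3 x 2 matrix A stored as [1, 2, 3, 4, 5, 6] and the 3 x 2 matrix B stored as [1, 0, 1, 0, 1, 1], tr_mul_col_major(&a, 3, 2, &b, 2) yields [4, 10, 5, 11], the column-major form of the 2 x 2 product A^T * B. An optimized kernel that accepts arbitrary strides could be handed the same swapped strides in place of the naive loop.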

sebcrozet avatar Apr 14 '20 20:04 sebcrozet

Ah, I see. I had totally forgotten that matrixmultiply does not support complex numbers, and moreover I did not know that it doesn't natively support transposition. Thanks for explaining!

Andlon avatar Apr 16 '20 08:04 Andlon

I assume nothing has happened on this issue? I'd be willing to try modifying gemm_tr to use matrixmultiply if nobody else wants to work on it. The performance difference is substantial, and quite surprising to anyone who isn't aware of it. Perhaps a warning in the documentation for the tr_mul and gemm_tr methods would be appropriate until this is fixed?

jbncode avatar Jul 10 '22 19:07 jbncode