tblis
tblis copied to clipboard
Complex Tensor Contractions are unusually slow
Hi Devin, Thanks for this nice library!
I'm currently testing the library, and I found the following: For real tensors, tblis is substantially faster than a contraction using dgemm for the cases I tested. For double matrices, dgemm and tblis are about equal. For a test case of 1000 matrix multiplications of two 200 x 200 matrices, a timing gives (using OMP_NUM_THREADS=4 on a 4 core machine)
TBLIS: fastest of 1000 contractions: 281 microseconds TBLIS: average time per contraction: 390.587 microseconds DGEMM: fastest of 1000 contractions: 311 microseconds DGEMM: average time per contraction: 332.135 microseconds
For complex
TBLIS: fastest of 1000 contractions: 14917 microseconds TBLIS: average time per contraction: 15622.9 microseconds ZGEMM: fastest of 1000 contractions: 932 microseconds ZGEMM: average time per contraction: 1053.49 microseconds
Is this to be expected, or am I doing something wrong? I was building with g++ 5 o and comparing to zgemm implementation from openblas.
thanks!
Below, mat_mat_prod is a wrapper around zgemm;
#include
using std::cout; using std::endl; using std::vector;
int main(int argc, char** argv){
int d1=200,d2=200,d3=200;
vector < tblis::label_type > la={1,2};
vector < tblis::label_type > lb={2,3};
vector < tblis::label_type > lc={1,3};
tblis::tensor < Complex> A({d1,d2});
tblis::tensor < Complex> B({d2,d3});
tblis::tensor < Complex > C({d1,d3});
Complex beta=0.0;
Complex alpha=1.0;
auto mA=new Complex [d1d2];
auto mB=new Complex [d2d3];
auto mC=new Complex [d1*d3];
int Nmax=1000;
vector
delete[] mA; delete[] mB; delete[] mC; return 0; }
The problem is that TBLIS doesn't currently include optimized complex microkernels for most architectures. Basically you are getting a slightly fancy triple loop for these cases.
The approach that I had planned to take is the "1m" method, where the real microkernels are reused by cleverly packing the real and imaginary parts of the complex operands. This hasn't been implemented yet, but perhaps soon.