marian-dev
Consider oneDNN instead of MKL for SGEMM
https://github.com/oneapi-src/oneDNN/ aka MKLDNN aka DNNL now has better performance for MT-size matrices: https://github.com/apache/incubator-mxnet/issues/17980 . And it's open source. The same teams write the GEMM for MKL and oneDNN.
Would be worth benchmarking.
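A minimal sketch of what such a benchmark harness could look like, assuming a row-major SGEMM with a simplified signature. The names `SgemmFn`, `naive_sgemm`, `time_sgemm` and `gflops` are made up for illustration and are not part of Marian, MKL or oneDNN. The naive kernel only stands in for the library call: for a real comparison, a thin wrapper around `cblas_sgemm` (MKL) or `dnnl_sgemm` (oneDNN) would be passed in instead (wrappers not shown here).

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <vector>

// Simplified row-major SGEMM signature assumed by this sketch:
// C (MxN) = A (MxK) * B (KxN), no transposes, alpha = 1, beta = 0.
using SgemmFn = std::function<void(int M, int N, int K,
                                   const float* A, const float* B, float* C)>;

// Naive reference kernel; a stand-in for the library routine under test.
void naive_sgemm(int M, int N, int K,
                 const float* A, const float* B, float* C) {
  for (int i = 0; i < M; ++i) {
    for (int j = 0; j < N; ++j) {
      float acc = 0.0f;
      for (int k = 0; k < K; ++k)
        acc += A[i * K + k] * B[k * N + j];
      C[i * N + j] = acc;
    }
  }
}

// Average wall-clock seconds per call over `iters` runs, after one warm-up call.
double time_sgemm(const SgemmFn& fn, int M, int N, int K, int iters) {
  assert(iters > 0);
  std::vector<float> A(static_cast<std::size_t>(M) * K, 1.0f);
  std::vector<float> B(static_cast<std::size_t>(K) * N, 0.5f);
  std::vector<float> C(static_cast<std::size_t>(M) * N, 0.0f);
  fn(M, N, K, A.data(), B.data(), C.data());  // warm-up, not timed
  auto start = std::chrono::steady_clock::now();
  for (int i = 0; i < iters; ++i)
    fn(M, N, K, A.data(), B.data(), C.data());
  std::chrono::duration<double> elapsed =
      std::chrono::steady_clock::now() - start;
  return elapsed.count() / iters;
}

// Converts a per-call time into GFLOP/s (2*M*N*K floating-point operations per GEMM).
double gflops(int M, int N, int K, double seconds) {
  return 2.0 * M * N * K / seconds / 1e9;
}
```

Calling `time_sgemm` once per backend on the same list of shapes and comparing the `gflops` figures would give a first answer. The shapes should come from an actual Marian inference run rather than being guessed, and the thread count should be pinned identically for both libraries.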
Cool, should be easy to check?
Is the expectation of better performance for any arch or AVX512 specific?
I've only bothered to measure AVX512 but we should check. Paging @sidkashyap.
https://github.com/XapaJIaMnu/marian-dev/tree/oneDNN
oneDNN v1.7 improves performance for older architectures too, including SSE4.1 for int8: https://github.com/oneapi-src/oneDNN/releases/tag/v1.7
We provided the matrix multiplication ranks from Marian inference to the oneDNN team so they could be optimized; the latest version includes those optimizations.
I have a branch with oneDNN. (You also need to disable cblas_sgemm_batched, which I forgot to do.) https://github.com/XapaJIaMnu/marian-dev/tree/oneDNN
We need benchmarks to show that it's not slow. Unfortunately, there isn't much incentive to switch to oneDNN completely, as we still need MKL (or some sort of BLAS): FAISS requires it, and without it linking fails with errors like "undefined reference to `sorgqr_'". @sidkashyap-at-Intel can we get a word to the Intel people to include some of those basic BLAS routines inside oneDNN?
Hey @XapaJIaMnu, do we have a priority list of MKL functions that need to be in oneDNN? I will work with the oneDNN team to sort that out if possible.
@sidkashyap-at-Intel
../libmarian.a(VectorTransform.cpp.o): In function `(anonymous namespace)::eig(unsigned long, double*, double*, int)':
/home/nbogoych/marian-dev-tst/src/3rd_party/faiss/VectorTransform.cpp:428: undefined reference to `dsyev_'
/home/nbogoych/marian-dev-tst/src/3rd_party/faiss/VectorTransform.cpp:433: undefined reference to `dsyev_'
../libmarian.a(VectorTransform.cpp.o): In function `faiss::LinearTransform::transform_transpose(long, float const*, float*) const':
/home/nbogoych/marian-dev-tst/src/3rd_party/faiss/VectorTransform.cpp:291: undefined reference to `sgemm_'
../libmarian.a(VectorTransform.cpp.o): In function `matrix_qr(int, int, float*)':
/home/nbogoych/marian-dev-tst/src/3rd_party/faiss/VectorTransform.cpp:98: undefined reference to `sgeqrf_'
/home/nbogoych/marian-dev-tst/src/3rd_party/faiss/VectorTransform.cpp:103: undefined reference to `sgeqrf_'
/home/nbogoych/marian-dev-tst/src/3rd_party/faiss/VectorTransform.cpp:106: undefined reference to `sorgqr_'
../libmarian.a(VectorTransform.cpp.o): In function `faiss::LinearTransform::apply_noalloc(long, float const*, float*) const':
/home/nbogoych/marian-dev-tst/src/3rd_party/faiss/VectorTransform.cpp:266: undefined reference to `sgemm_'
../libmarian.a(VectorTransform.cpp.o): In function `faiss::LinearTransform::set_is_orthonormal()':
/home/nbogoych/marian-dev-tst/src/3rd_party/faiss/VectorTransform.cpp:317: undefined reference to `sgemm_'
../libmarian.a(VectorTransform.cpp.o): In function `faiss::PCAMatrix::prepare_Ab()':
/home/nbogoych/marian-dev-tst/src/3rd_party/faiss/VectorTransform.cpp:710: undefined reference to `sgemm_'
../libmarian.a(VectorTransform.cpp.o): In function `faiss::PCAMatrix::train(long, float const*)':
/home/nbogoych/marian-dev-tst/src/3rd_party/faiss/VectorTransform.cpp:559: undefined reference to `ssyrk_'
/home/nbogoych/marian-dev-tst/src/3rd_party/faiss/VectorTransform.cpp:597: undefined reference to `sgemm_'
/home/nbogoych/marian-dev-tst/src/3rd_party/faiss/VectorTransform.cpp:518: undefined reference to `ssyrk_'
collect2: error: ld returned 1 exit status
Basically, FAISS dependencies.
Cheers,
Nick
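The distinct missing symbols in the log above are `dsyev_`, `sgemm_`, `sgeqrf_`, `sorgqr_` and `ssyrk_`. For longer logs, a small helper along these lines could produce that list with counts. This is only a sketch: `count_undefined` is a hypothetical name, and it assumes the GNU ld message format shown above, where the symbol sits between a backtick and an apostrophe.

```cpp
#include <map>
#include <regex>
#include <string>

// Counts how often each symbol appears in GNU ld messages of the form
//   undefined reference to `symbol'
// Lines such as "In function `...'" do not match and are ignored.
std::map<std::string, int> count_undefined(const std::string& log) {
  static const std::regex pattern("undefined reference to `([^']+)'");
  std::map<std::string, int> counts;
  for (std::sregex_iterator it(log.begin(), log.end(), pattern), end;
       it != end; ++it) {
    ++counts[(*it)[1].str()];  // capture group 1 is the symbol name
  }
  return counts;
}
```

Feeding the captured linker output to `count_undefined` and printing the map gives a per-symbol tally that can be handed over as the requested function list.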
Thank you, this will help in getting the request quantified. Will update on the progress soon.
I had an internal discussion with @vpirogov from the oneDNN team. Unfortunately, support for the FAISS MKL dependencies cannot be addressed in oneDNN, as it is outside the deep learning remit that the library focuses on.
Hi, we need the FAISS support internally, but could we make it depend on MKL being found?
It's used in k-NN MT (https://arxiv.org/pdf/2010.00710.pdf) as well. I see several of those hash- and search-based methods in DL these days.
@ykim362 very good point! All DNNs with retrieval methods would rely on it.
@ykim362, @emjotde,
oneDNN is focused on deep learning algorithms. oneAPI has a specialized data analytics library, oneDAL, that supports kNN and other machine learning algorithms.
MKL is blocking Wikipedia from deploying Marian because it is closed source.