libtorch C++ windows: The specified module could not be found. mkl_vml_def.1.dll

Open bsekura opened this issue 1 year ago • 4 comments

🐛 Describe the bug

I downloaded the libtorch C++ Windows CPU build:

https://download.pytorch.org/libtorch/cpu/libtorch-win-shared-with-deps-2.3.0%2Bcpu.zip

While attempting to run a program built against libtorch I get:

INTEL MKL ERROR: The specified module could not be found. mkl_vml_def.1.dll.
Intel MKL FATAL ERROR: cannot load mkl_vml_def.1.dll.

It seems the distribution is missing necessary dependencies.

Versions

libtorch version 2.3.0

downloaded via https://download.pytorch.org/libtorch/cpu/libtorch-win-shared-with-deps-2.3.0%2Bcpu.zip

cc @peterjc123 @mszhanyi @skyline75489 @nbcsm @vladimir-aubrecht @iremyux @Blackhex @cristianPanaite @jbschlosser

bsekura avatar Apr 26 '24 10:04 bsekura

I was able to solve it myself, so I will provide more information which may help you deal with the issue.

The distribution is missing MKL runtime DLLs. It contains mkl_core.1.dll and mkl_intel_thread.1.dll, but I couldn't find any information on the website or GitHub about which version of MKL it was built with. Judging by the metadata of those two DLLs, the file version is 2021.4.1.0. The easiest fix for me was to grab the corresponding NuGet package of the MKL runtime here: https://www.nuget.org/packages/intelmkl.redist.win-x64/2021.4.0.640. I downloaded the file, unzipped it, and put the runtime DLLs from runtimes/win-x64/native alongside the rest of the DLLs. It's not exactly the same version, but it works fine now and with good performance (though I noticed the avx512 DLLs were not used).

bsekura avatar Apr 27 '24 00:04 bsekura

This issue is also present in the CUDA 12.1 Windows distribution of 2.3.0. @bsekura's suggestion solved the issue for CUDA 12.1 as well.

Downloaded via https://download.pytorch.org/libtorch/cu121/libtorch-win-shared-with-deps-2.3.0%2Bcu121.zip

P1ayer-1 avatar May 02 '24 19:05 P1ayer-1

Thank you so much @bsekura, I was several hours into debugging. This is also an issue in the CUDA 11.8 distribution of 2.3.0 on Windows, in both the release and the debug versions. I am noticing that my program runs quite a bit slower than it did on previous versions, although I don't really know if this is the cause. I will probably try downgrading to a different version of LibTorch for now.

AidanShipperley avatar May 03 '24 23:05 AidanShipperley

Using an older version worked for me. I'm on CPU, but the corresponding CUDA version might work too:

https://download.pytorch.org/libtorch/cpu/libtorch-win-shared-with-deps-2.2.2%2Bcpu.zip

https://download.pytorch.org/libtorch/cu121/libtorch-win-shared-with-deps-2.3.0%2Bcu121.zip

Just replace the version number with 2.2.2 or older. Now I can matmul two tensors, yay! haha
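The version swap can be written as a small string builder. This is only a sketch: `libtorch_win_url` is a hypothetical helper that follows the pattern of the links posted in this thread, and it does not check that a given version/variant combination is actually published.

```cpp
#include <string>

// Hypothetical helper: builds a download URL that follows the pattern of the
// links in this thread. `variant` is the build flavor seen in those links,
// e.g. "cpu" or "cu121". "%2B" is the URL-encoded '+' in "<version>+<variant>".
std::string libtorch_win_url(const std::string& version,
                             const std::string& variant) {
    return "https://download.pytorch.org/libtorch/" + variant +
           "/libtorch-win-shared-with-deps-" + version + "%2B" + variant + ".zip";
}

// Example call:
//   libtorch_win_url("2.2.2", "cpu");
```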

wbf22 avatar May 04 '24 17:05 wbf22

@bsekura this issue should be fixed in 2.4.0. Can you check with that version and let me know if you have issues?

mantaionut avatar Aug 13 '24 11:08 mantaionut

I am closing this issue due to a lack of activity and because this should be fixed as of 2.4.0. Please open a new issue if this problem persists.

mantaionut avatar Sep 19 '24 05:09 mantaionut