
initial import time of `torch` & `intel_extension_for_pytorch` is long on Windows Core Ultra

Open Oscilloscope98 opened this issue 4 months ago • 3 comments

Describe the issue

Platform: Intel Core Ultra 7 155H
OS: Windows 11
intel-extension-for-pytorch==2.1.10+xpu
torch==2.1.0a0+cxx11.abi

On a new MTL (Meteor Lake) machine, the first `import torch` and `import intel_extension_for_pytorch` took a long time:

# first time on this machine, first time in this conda env
torch import: 21.3665828704834 s
ipex import: 30.279845714569092 s

From the second import onward, the import time in the same conda env is significantly lower:

# 2nd+ time on this machine, 2nd+ time in this conda env
torch import: 1.3328015804290771 s
ipex import: 1.3857955932617188 s

When I created another conda env on the same machine with exactly the same Python, torch, and ipex versions, the first import in that env was still slow (though not as slow as the very first import on this machine):

# 2nd+ time on this machine, first time in this conda env
torch import: 14.61411190032959 s
ipex import: 21.118491411209106 s
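
The script that produced these timings is not shown in the issue. A minimal sketch of one way to take comparable measurements follows. The helper name `time_import` is hypothetical, and it times each import in a fresh interpreter so that Python's in-process module cache (`sys.modules`) does not hide the cost. Because each module is timed in its own process, the figure for `intel_extension_for_pytorch` would include whatever it imports itself, so the numbers may not line up exactly with the ones above.

```python
import subprocess
import sys

# Code executed in the child interpreter: time a single import statement.
_SNIPPET = (
    "import time\n"
    "start = time.perf_counter()\n"
    "import {name}\n"
    "print(time.perf_counter() - start)\n"
)


def time_import(name):
    """Return the seconds needed to import `name` in a fresh Python process.

    Returns None if the import fails (for example, the package is missing).
    """
    proc = subprocess.run(
        [sys.executable, "-c", _SNIPPET.format(name=name)],
        capture_output=True,
        text=True,
    )
    if proc.returncode != 0:
        return None
    # Some packages print during import; the timing is the last line.
    return float(proc.stdout.strip().splitlines()[-1])
```

Calling `time_import("torch")` and `time_import("intel_extension_for_pytorch")` once right after creating the env, and again afterwards, should reproduce the first-time versus 2nd+ comparison.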

My question

Why does importing torch & ipex take so long the first time? Is there any compilation process involved? How can we reduce the initial import time? An answer would help us in a real deployment environment. Thank you!

Appendix

Tested conda env

conda create -n test-import python=3.10 libuv
conda activate test-import

pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
pip install torch==2.1.0a0 torchvision==0.16.0a0 torchaudio==2.1.0a0 intel-extension-for-pytorch==2.1.10 --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/cn/

Detailed Env

Collecting environment information...
PyTorch version: 2.1.0a0+cxx11.abi
PyTorch CXX11 ABI: No
IPEX version: 2.1.10+xpu
IPEX commit: a12f9f650
Build type: Release

OS: Microsoft Windows 11 Home Chinese Edition
GCC version: N/A
Clang version: N/A
IGC version: N/A
CMake version: N/A
Libc version: N/A

Python version: 3.10.14 | packaged by Anaconda, Inc. | (main, Mar 21 2024, 16:20:14) [MSC v.1916 64 bit (AMD64)] (64-bit runtime)
Python platform: Windows-10-10.0.22621-SP0
Is XPU available: True
DPCPP runtime version: N/A
MKL version: N/A
GPU models and configuration:
[0] _DeviceProperties(name='Intel(R) Arc(TM) Graphics', platform_name='Intel(R) Level-Zero', dev_type='gpu, support_fp64=1, total_memory=12949MB, max_compute_units=128, gpu_eu_count=128)
Intel OpenCL ICD version: N/A
Level Zero version: N/A

CPU:
Architecture=9
CurrentClockSpeed=1400
DeviceID=CPU0
Family=1
L2CacheSize=18432
L2CacheSpeed=
Manufacturer=GenuineIntel
MaxClockSpeed=1400
Name=Intel(R) Core(TM) Ultra 7 155H
ProcessorType=3
Revision=

Versions of relevant libraries:
[pip3] intel-extension-for-pytorch==2.1.10+xpu
[pip3] numpy==1.26.4
[pip3] torch==2.1.0a0+cxx11.abi
[pip3] torchaudio==2.1.0a0+cxx11.abi
[pip3] torchvision==0.16.0a0+cxx11.abi
[conda] intel-extension-for-pytorch 2.1.10+xpu               pypi_0    pypi
[conda] mkl                       2024.0.0                 pypi_0    pypi
[conda] mkl-dpcpp                 2024.0.0                 pypi_0    pypi
[conda] numpy                     1.26.4                   pypi_0    pypi
[conda] onemkl-sycl-blas          2024.0.0                 pypi_0    pypi
[conda] onemkl-sycl-datafitting   2024.0.0                 pypi_0    pypi
[conda] onemkl-sycl-dft           2024.0.0                 pypi_0    pypi
[conda] onemkl-sycl-lapack        2024.0.0                 pypi_0    pypi
[conda] onemkl-sycl-rng           2024.0.0                 pypi_0    pypi
[conda] onemkl-sycl-sparse        2024.0.0                 pypi_0    pypi
[conda] onemkl-sycl-stats         2024.0.0                 pypi_0    pypi
[conda] onemkl-sycl-vm            2024.0.0                 pypi_0    pypi
[conda] torch                     2.1.0a0+cxx11.abi          pypi_0    pypi
[conda] torchaudio                2.1.0a0+cxx11.abi          pypi_0    pypi
[conda] torchvision               0.16.0a0+cxx11.abi          pypi_0    pypi

Oscilloscope98 avatar Apr 17 '24 10:04 Oscilloscope98

Hello, thanks for reporting this issue. I will try to reproduce the issue and get back to you.

YuningQiu avatar Apr 17 '24 19:04 YuningQiu

Hi @YuningQiu,

We could provide machine access for reproducing this issue :)

Oscilloscope98 avatar Apr 18 '24 01:04 Oscilloscope98

The IPEX libraries are large, about 2 GB in total, so Windows needs more time to load the library files. How long it takes depends on Windows and on the size of the libraries.

Once Windows has loaded the libraries into the system buffer, running the binary again does not spend that time loading the library files.
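
If the explanation above holds, two things can be checked from Python: how large the binary files in an installed package really are, and whether reading them once ahead of time shortens the next import. The sketch below is only an illustration under that assumption. The helper names (`package_dir`, `footprint`, `warm_file_cache`) and the suffix list are hypothetical, and it has not been verified that pre-reading the files helps on the reporter's machine.

```python
import importlib.util
import os

# Assumed set of binary extensions worth counting; adjust as needed.
BINARY_SUFFIXES = (".dll", ".pyd", ".so")


def package_dir(name):
    """Locate a top-level package's directory without importing it."""
    spec = importlib.util.find_spec(name)
    if spec is None or not spec.submodule_search_locations:
        return None
    return list(spec.submodule_search_locations)[0]


def binary_files(root, suffixes=BINARY_SUFFIXES):
    """Yield paths of files under `root` whose names end with `suffixes`."""
    for dirpath, _dirs, files in os.walk(root):
        for fname in files:
            if fname.lower().endswith(suffixes):
                yield os.path.join(dirpath, fname)


def footprint(root, suffixes=BINARY_SUFFIXES):
    """Return (file_count, total_bytes) of the binary files under `root`."""
    paths = list(binary_files(root, suffixes))
    return len(paths), sum(os.path.getsize(p) for p in paths)


def warm_file_cache(root, suffixes=BINARY_SUFFIXES, chunk=1 << 20):
    """Read every binary file once so the OS can keep it in its file cache.

    Returns the number of bytes read.
    """
    total = 0
    for path in binary_files(root, suffixes):
        with open(path, "rb") as f:
            while True:
                data = f.read(chunk)
                if not data:
                    break
                total += len(data)
    return total
```

For example, `footprint(package_dir("intel_extension_for_pytorch"))` would report the on-disk size to compare against the roughly 2 GB figure, and calling `warm_file_cache` on the same directory before the first import would show whether disk reads account for the delay.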

NeoZhangJianyu avatar Apr 18 '24 08:04 NeoZhangJianyu