intel-extension-for-pytorch
Difficult CMake build system producing unexpected results with MKL >= 2021.3.0
I am part of an internal team at Intel, and we are using the C++ version of PyTorch for some of our work. We are interested in the CPU version of IPEX to ensure our inference runs as fast as possible on modern server CPUs.
We need to build IPEX 2.2.0+cpu from source, and as I began working with the build system, a few observations came to mind.
1. It's not possible to call out to CMake directly to just build and install `libintel-ext-pt-cpu.so`. `setup.py` creates a `version.h`, which the code in the project needs to compile; ideally this would be done in CMake.
2. Ideally we should let CMake fully detect Python and set all resulting variables on its own rather than passing `PYTHON_EXECUTABLE` and `PYTHON_INCLUDE_DIR` as configuration options. We already do `find_package(Python COMPONENTS Interpreter Development)` in `intel_extension_for_pytorch/csrc/CMakeLists.txt`, which creates all of the variables needed. As long as the virtual environment is added to `CMAKE_PREFIX_PATH`, this should work OK?
3. Doing a `pip install` for MKL packages in CMake as part of `FindoneMKL.cmake` is unexpected. This would be better put into `setup.py` as a requirement so that it is handled automatically by pip. That should be fine because MKL is required to build the project. When not in a virtual environment, which typically wouldn't be needed for building C++ code, this can end up installing the MKL packages into unexpected locations.
4. Versions of MKL >= 2021.3.0 introduced first-party CMake support. When building with these versions, from what I can tell, MKL is pulled in twice: once by `FindoneMKL.cmake` (statically, with 32-bit integers and GNU threading), and once by `FindOMP.cmake` (dynamically, with 64-bit integers and Intel threading). It took a great deal of time to figure out what was going on when I found two MKL-related `find_package` calls: `find_package(oneMKL)` in `csrc/cpu/CMakeLists.txt` and `tests/cpu/cpp/CMakeLists.txt`, and `find_package(MKL)` in `FindOMP.cmake`. MKL should ideally be pulled in once and in one way only. My builds were failing with the dynamic MKL import even though the build logs reported that the static MKL import was successful.
5. Please consider unifying the CMake projects. When building the C++ side of things with CMake, I would expect a single project to also be capable of building the tests, without needing a separate CMake invocation (which `setup.py` performs) to configure everything again. The root `CMakeLists.txt` can only build `csrc/cpu/CMakeLists.txt`; it can't be used to build `tests/cpu/cpp/CMakeLists.txt`, which was unexpected.
6. Please consider renaming the custom find module for OpenMP (`FindOMP.cmake`) to `FindOpenMP.cmake`, and prepending its directory to `CMAKE_MODULE_PATH` to ensure it is chosen first. This allows IPEX to write standard CMake.
7. Please consider using something like CTest to run testing on the C++ side; it has excellent integration with Google Test. Going along with suggestion 5 for project unification, tests can be conditionally enabled just by doing `include(CTest)`, and then the user can set `-DBUILD_TESTING=ON/OFF` when configuring the project with CMake. Alternatively, this can be an `option` for which the project defines a default value. Defining relative paths back to the root CMake to configure testing properly is not easy to read or understand (or, I imagine, maintain).
8. In `cmake/IPEXConfig.cmake.in`, please consider doing something like `target_link_libraries(intel-ext-pt-cpu INTERFACE "-Wl,--push-state,--no-as-needed" "${IPEX_CPU_CORE_LIBRARY}" "-Wl,--pop-state")`. This follows what MKL does. It's easier to read `target_include_directories` and `target_link_libraries` than one massive `set_target_properties`. Additionally, it's more mindful about preserving the state of the linker, making it very clear that `--no-as-needed` applies only to the IPEX library. It also gives the `intel-ext-pt-cpu` target that consuming CMake projects use a way to depend on the `torch` target that comes from `TorchConfig.cmake`: it can be added with `target_link_libraries`, like TorchVision does with `target_link_libraries(${PN}::${PN} INTERFACE torch)`.
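Suggestion 7 is small in practice. A sketch of what the root `CMakeLists.txt` could do, assuming the existing directory layout (`csrc/cpu`, `tests/cpu/cpp`); this is illustrative, not the project's current code:

```cmake
# Sketch: wire the C++ tests into the root project via CTest.
include(CTest)  # defines the BUILD_TESTING option (default ON) and calls enable_testing()

add_subdirectory(csrc/cpu)

if(BUILD_TESTING)
  add_subdirectory(tests/cpu/cpp)
endif()
```

A consumer could then configure once with `-DBUILD_TESTING=OFF` to skip tests, or build normally and run `ctest --test-dir build`.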
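Suggestion 8 could look roughly like the following inside `IPEXConfig.cmake.in`. This is only a sketch: `IPEX_INCLUDE_DIRS` and `IPEX_CPU_CORE_LIBRARY` are assumed to be filled in by the existing configure logic, and the target name follows the current one.

```cmake
# Sketch: expose an imported interface target instead of one large
# set_target_properties() call. Variable names are illustrative.
if(NOT TARGET intel-ext-pt-cpu)
  add_library(intel-ext-pt-cpu INTERFACE IMPORTED)

  target_include_directories(intel-ext-pt-cpu
    INTERFACE "${IPEX_INCLUDE_DIRS}")

  # Keep the linker-state change scoped to the IPEX library only,
  # mirroring what oneMKL's CMake config does.
  target_link_libraries(intel-ext-pt-cpu INTERFACE
    "-Wl,--push-state,--no-as-needed"
    "${IPEX_CPU_CORE_LIBRARY}"
    "-Wl,--pop-state")

  # Let consumers pick up torch transitively, as TorchVision does.
  target_link_libraries(intel-ext-pt-cpu INTERFACE torch)
endif()
```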
I'm guessing the complexity here arose from needing CMake to build the C++ project while also needing to build a wheel in Python. I've found that building wheels is best done with Python's own infrastructure; bridging that gap with CMake is often difficult since there is no standard process for it, so I can understand how things are the way they are. At least for the C++ side of things, the changes listed above would make building this project MUCH easier for us if we could call out to CMake directly. I also ran into some AVX512 FP16 build issues due to invalid conversions from `c10::Half` to `_Float16` (we are building with GCC 13.2 on modern enough Xeons), but I'll file a separate issue for that along with the patch I have for the fixes.
Have you tried the IPEX CPPSDK? https://intel.github.io/intel-extension-for-pytorch/#installation?platform=cpu&version=v2.2.0%2Bcpu&os=linux%2Fwsl2&package=cppsdk
If you want to compile from source, you can use the command below:
```shell
LIBTORCH_PATH=<libtorch_path> python setup.py bdist_cppsdk
```
To better illustrate the difficulty I had here, this is a snippet of the recipe I'm using to successfully build IPEX for Intel CPUs:
```shell
# Source MKL environment variables, which is normal usage for MKL
. $MKL_INSTALL_DIR/mkl/latest/env/vars.sh
export LDFLAGS="$LDFLAGS -Wl,-rpath,${LIBRARY_PATH%:}"

cd $IPEX_SRC_DIR

# Create missing sources first
IPEX_GIT_SHA="$(git rev-parse --short HEAD)"
IPEX_VERSION="$(awk '{print $2}' version.txt | tr '\n' '.' | sed 's/\.$//')"
TORCH_GIT_SHA="$(cat $PYTORCH_INSTALL_DIR/build-hash)"
BUILD_TYPE=Release

# Obtained from the create_version_files function in setup.py
# The C++ sources won't build without this
cat > csrc/utils/version.h <<EOF
#pragma once
#include <string>
namespace torch_ipex {
const std::string __version__()
{ return "${IPEX_VERSION}"; }
const std::string __gitrev__()
{ return "${IPEX_GIT_SHA}"; }
const std::string __torch_gitrev__()
{ return "${TORCH_GIT_SHA}"; }
const std::string __build_type__()
{ return "${BUILD_TYPE}"; }
} // namespace torch_ipex
EOF

# Then apply patches
patch -p1 < $SRC_DIR/0001-Fix-AVX512-FP16-build-issues.patch
patch -p1 < $SRC_DIR/0002-Fix-IPEX-CMake-configuration.patch
patch -p1 < $SRC_DIR/0003-Remove-custom-find-module-for-OpenMP.patch
patch -p1 < $SRC_DIR/0004-Rely-on-MKLROOT-to-find-oneMKL.patch
patch -p1 < $SRC_DIR/0005-Make-oneMKL-link-type-configurable.patch
patch -p1 < $SRC_DIR/0006-Fix-oneMKL-dynamic-library-link-line.patch

mkdir build && cd $_
cmake .. \
    -DCMAKE_BUILD_TYPE="${BUILD_TYPE}" \
    -DCMAKE_INSTALL_LIBDIR=lib \
    -DCMAKE_INSTALL_PREFIX="$INSTALL_DIR" \
    -DCMAKE_VERBOSE_MAKEFILE=ON \
    -DBUILD_MODULE_TYPE=CPU \
    -DIPEX_PROJ_NAME=intel_extension_for_pytorch \
    -DCMAKE_PROJECT_VERSION="${IPEX_VERSION}" \
    -DPYTHON_INCLUDE_DIR="$(python -c 'import sysconfig; print(sysconfig.get_paths()["include"])')" \
    -DBUILD_STATIC_ONEMKL=OFF
cmake --build . --parallel $(nproc)
cmake --install .
```
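As a point of comparison for suggestion 1, the `version.h` generation that I do by hand above could live in CMake itself via `configure_file`. This is a hypothetical sketch: it assumes a `csrc/utils/version.h.in` template (which does not exist today) with `@IPEX_VERSION@`-style placeholders, and the variable names are illustrative.

```cmake
# Hypothetical sketch: generate version.h at configure time instead of
# from setup.py. Assumes a version.h.in template using @IPEX_VERSION@,
# @IPEX_GIT_SHA@, @TORCH_GIT_SHA@, and @CMAKE_BUILD_TYPE@ placeholders.
find_package(Git QUIET)
if(Git_FOUND)
  execute_process(
    COMMAND "${GIT_EXECUTABLE}" rev-parse --short HEAD
    WORKING_DIRECTORY "${CMAKE_CURRENT_SOURCE_DIR}"
    OUTPUT_VARIABLE IPEX_GIT_SHA
    OUTPUT_STRIP_TRAILING_WHITESPACE)
endif()

configure_file(
  "${CMAKE_CURRENT_SOURCE_DIR}/csrc/utils/version.h.in"
  "${CMAKE_CURRENT_BINARY_DIR}/csrc/utils/version.h"
  @ONLY)
```

With something like this in place, a plain CMake build would no longer depend on `setup.py` having run first.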
This is a lot. Ideally, I wouldn't need to modify the sources for IPEX just to build the C++ project, or to do that build work through Python. And, ideally, I wouldn't need to define obscure CMake configuration variables like `IPEX_PROJ_NAME`, `CMAKE_PROJECT_VERSION`, and `PYTHON_INCLUDE_DIR` to get this to work.
Not building the CPPSDK reduces the build time here, since we don't need to compress the end result into a package and then re-extract it just to get it installed to the correct location. I am very glad there was a way to bypass this and reduce the overall build time.
@xuhancn
> I also ran into some AVX512 FP16 build issues due to invalid conversions from `c10::Half` to `_Float16` (we are building with GCC 13.2 on modern enough Xeons), but I'll file a separate issue for that along with the patch I have for the fixes.
@gtkramer, not sure what happened with your pull request, so I created a new one: https://github.com/intel/intel-extension-for-pytorch/pull/587
@xuhancn, I noticed you previously tried to fix the build with Clang; could you check my pull request? Thanks!