
[Bug] CMake Error at 3rdparty/tokenizers-cpp/msgpack/CMakeLists.txt during CMake iOS

Open · KingSlayer06 opened this issue 6 months ago • 2 comments

πŸ› Bug

I followed the guidelines at https://llm.mlc.ai/docs/deploy/ios.html and get an error when running:

cd mlc_llm/ios/MLCChat
mlc_llm package

I verified that all prerequisites are correctly installed:

  1. Installed CMake (cmake 4.0.1)
  2. Installed Git & Git-LFS
  3. Installed Rust & Cargo (rustc 1.86.0)
  4. Installed MLC LLM Python Package
  5. Installed TVM Unity Compiler
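One of these prerequisites turns out to be relevant: CMake 4.x (including the 4.0.1 above) removed compatibility with projects that declare a `cmake_minimum_required` below 3.5, which is what triggers the configure error later in this report. A small sketch of a version check; `parse_major` is a hypothetical helper, demonstrated here on a sample version string rather than a live `cmake` binary:

```shell
# Extract the major version from the first line of `cmake --version` output.
parse_major() { sed -n '1s/cmake version \([0-9]*\).*/\1/p'; }

# Demonstrated on the version string from this report:
echo 'cmake version 4.0.1' | parse_major   # prints: 4

# In a real shell one would run:
#   major=$(cmake --version | parse_major)
#   [ "$major" -ge 4 ] && echo "legacy subprojects may need -DCMAKE_POLICY_VERSION_MINIMUM=3.5"
```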

To Reproduce

Steps to reproduce the behavior: follow the guidelines at https://llm.mlc.ai/docs/deploy/ios.html#id4 up to the following command:

cd mlc_llm/ios/MLCChat
mlc_llm package

OUTPUT:

[2025-04-15 10:32:14] INFO package.py:327: MLC LLM HOME: "/Users/himanshu/Developer/mlc-llm"
[2025-04-15 10:32:14] INFO package.py:28: Clean up all directories under "dist/bundle"
[2025-04-15 10:32:14] INFO jit.py:43: MLC_JIT_POLICY = ON. Can be one of: ON, OFF, REDO, READONLY
[2025-04-15 10:32:14] INFO download_cache.py:227: Downloading model from HuggingFace: HF://mlc-ai/Llama-3.2-3B-Instruct-q4f16_1-MLC
[2025-04-15 10:32:14] INFO download_cache.py:29: MLC_DOWNLOAD_CACHE_POLICY = ON. Can be one of: ON, OFF, REDO, READONLY
[2025-04-15 10:32:14] INFO download_cache.py:166: Weights already downloaded: /Users/himanshu/.cache/mlc_llm/model_weights/hf/mlc-ai/Llama-3.2-3B-Instruct-q4f16_1-MLC
[2025-04-15 10:32:14] INFO package.py:81: Model lib is not specified for model "Llama-3.2-3B-Instruct-q4f16_1-MLC". Now jit compile the model library.
[2025-04-15 10:32:14] INFO jit.py:158: Using cached model lib: /Users/himanshu/.cache/mlc_llm/model_lib/40b0ed3db35b57bdf9e3e17e508b46cb.tar
[2025-04-15 10:32:14] INFO package.py:129: Bundle weight for Llama-3.2-3B-Instruct-q4f16_1-MLC, copy into dist/bundle/Llama-3.2-3B-Instruct-q4f16_1-MLC
[2025-04-15 10:32:17] INFO download_cache.py:227: Downloading model from HuggingFace: HF://mlc-ai/gemma-2-2b-it-q4f16_1-MLC
[2025-04-15 10:32:17] INFO download_cache.py:29: MLC_DOWNLOAD_CACHE_POLICY = ON. Can be one of: ON, OFF, REDO, READONLY
[2025-04-15 10:32:17] INFO download_cache.py:166: Weights already downloaded: /Users/himanshu/.cache/mlc_llm/model_weights/hf/mlc-ai/gemma-2-2b-it-q4f16_1-MLC
[2025-04-15 10:32:17] INFO package.py:81: Model lib is not specified for model "gemma-2-2b-q4f16_1-MLC". Now jit compile the model library.
[2025-04-15 10:32:17] INFO jit.py:158: Using cached model lib: /Users/himanshu/.cache/mlc_llm/model_lib/c5dec24cc6c962f73d3aacfb7748dd24.tar
[2025-04-15 10:32:17] INFO download_cache.py:227: Downloading model from HuggingFace: HF://mlc-ai/Phi-3.5-mini-instruct-q4f16_1-MLC
[2025-04-15 10:32:17] INFO download_cache.py:29: MLC_DOWNLOAD_CACHE_POLICY = ON. Can be one of: ON, OFF, REDO, READONLY
[2025-04-15 10:32:17] INFO download_cache.py:166: Weights already downloaded: /Users/himanshu/.cache/mlc_llm/model_weights/hf/mlc-ai/Phi-3.5-mini-instruct-q4f16_1-MLC
[2025-04-15 10:32:17] INFO package.py:81: Model lib is not specified for model "Phi-3.5-mini-instruct-q4f16_1-MLC". Now jit compile the model library.
[2025-04-15 10:32:17] INFO jit.py:158: Using cached model lib: /Users/himanshu/.cache/mlc_llm/model_lib/3ebbcfc948f423c6c3d9962a421c2dec.tar
[2025-04-15 10:32:17] INFO download_cache.py:227: Downloading model from HuggingFace: HF://mlc-ai/Qwen2.5-1.5B-Instruct-q4f16_1-MLC
[2025-04-15 10:32:17] INFO download_cache.py:29: MLC_DOWNLOAD_CACHE_POLICY = ON. Can be one of: ON, OFF, REDO, READONLY
[2025-04-15 10:32:17] INFO download_cache.py:166: Weights already downloaded: /Users/himanshu/.cache/mlc_llm/model_weights/hf/mlc-ai/Qwen2.5-1.5B-Instruct-q4f16_1-MLC
[2025-04-15 10:32:17] INFO package.py:81: Model lib is not specified for model "Qwen2.5-1.5B-Instruct-q4f16_1-MLC". Now jit compile the model library.
[2025-04-15 10:32:17] INFO jit.py:158: Using cached model lib: /Users/himanshu/.cache/mlc_llm/model_lib/a40d8ff83b78f0fc2c25728659b4f24f.tar
[2025-04-15 10:32:17] INFO download_cache.py:227: Downloading model from HuggingFace: HF://mlc-ai/Mistral-7B-Instruct-v0.3-q3f16_1-MLC
[2025-04-15 10:32:17] INFO download_cache.py:29: MLC_DOWNLOAD_CACHE_POLICY = ON. Can be one of: ON, OFF, REDO, READONLY
[2025-04-15 10:32:17] INFO download_cache.py:166: Weights already downloaded: /Users/himanshu/.cache/mlc_llm/model_weights/hf/mlc-ai/Mistral-7B-Instruct-v0.3-q3f16_1-MLC
[2025-04-15 10:32:17] INFO package.py:81: Model lib is not specified for model "Mistral-7B-Instruct-v0.3-q3f16_1-MLC". Now jit compile the model library.
[2025-04-15 10:32:17] INFO jit.py:158: Using cached model lib: /Users/himanshu/.cache/mlc_llm/model_lib/ce027d7924d2f97d424da76c6c4f190f.tar
[2025-04-15 10:32:17] INFO package.py:154: Dump the app config below to "dist/bundle/mlc-app-config.json":
{
  "model_list": [
    {
      "model_id": "Llama-3.2-3B-Instruct-q4f16_1-MLC",
      "model_lib": "llama_q4f16_1_d44304359a2802d16aa168086928bcad",
      "model_path": "Llama-3.2-3B-Instruct-q4f16_1-MLC",
      "estimated_vram_bytes": 3000000000
    },
    {
      "model_id": "gemma-2-2b-q4f16_1-MLC",
      "model_lib": "gemma2_q4f16_1_779a95d4ef785ea159992d38fac2317f",
      "model_url": "https://huggingface.co/mlc-ai/gemma-2-2b-it-q4f16_1-MLC",
      "estimated_vram_bytes": 3000000000
    },
    {
      "model_id": "Phi-3.5-mini-instruct-q4f16_1-MLC",
      "model_lib": "phi3_q4f16_1_eba3d93dab5930b68f7296c1fd0d29ec",
      "model_url": "https://huggingface.co/mlc-ai/Phi-3.5-mini-instruct-q4f16_1-MLC",
      "estimated_vram_bytes": 3043000000
    },
    {
      "model_id": "Qwen2.5-1.5B-Instruct-q4f16_1-MLC",
      "model_lib": "qwen2_q4f16_1_11da1300cac0945ff40dfee7b8c81b68",
      "model_url": "https://huggingface.co/mlc-ai/Qwen2.5-1.5B-Instruct-q4f16_1-MLC",
      "estimated_vram_bytes": 2960000000
    },
    {
      "model_id": "Mistral-7B-Instruct-v0.3-q3f16_1-MLC",
      "model_lib": "mistral_q3f16_1_d3cffffd3bd4f2a3b690974f8fd8a5a3",
      "model_url": "https://huggingface.co/mlc-ai/Mistral-7B-Instruct-v0.3-q3f16_1-MLC",
      "estimated_vram_bytes": 3316000000
    }
  ]
}
[2025-04-15 10:32:17] INFO package.py:211: Creating lib from ['/Users/himanshu/.cache/mlc_llm/model_lib/40b0ed3db35b57bdf9e3e17e508b46cb.tar', '/Users/himanshu/.cache/mlc_llm/model_lib/c5dec24cc6c962f73d3aacfb7748dd24.tar', '/Users/himanshu/.cache/mlc_llm/model_lib/3ebbcfc948f423c6c3d9962a421c2dec.tar', '/Users/himanshu/.cache/mlc_llm/model_lib/a40d8ff83b78f0fc2c25728659b4f24f.tar', '/Users/himanshu/.cache/mlc_llm/model_lib/ce027d7924d2f97d424da76c6c4f190f.tar']
[2025-04-15 10:32:17] INFO package.py:212: Validating the library dist/lib/libmodel_iphone.a
[2025-04-15 10:32:17] INFO package.py:213: List of available model libs packaged: ['llama_q4f16_1_d44304359a2802d16aa168086928bcad', 'gemma2_q4f16_1_779a95d4ef785ea159992d38fac2317f', 'phi3_q4f16_1_eba3d93dab5930b68f7296c1fd0d29ec', 'qwen2_q4f16_1_11da1300cac0945ff40dfee7b8c81b68', 'mistral_q3f16_1_d3cffffd3bd4f2a3b690974f8fd8a5a3'], if we have '-' in the model_lib string, it will be turned into '_'
[2025-04-15 10:32:17] INFO package.py:256: Validation pass
[2025-04-15 10:32:17] INFO package.py:309: Build iphone binding
+ sysroot=iphoneos
+ type=Release
+ '[' false = true ']'
+ rustup target add aarch64-apple-ios
info: component 'rust-std' for target 'aarch64-apple-ios' is up to date
+ mkdir -p build/
+ cd build/
+ cmake /Users/himanshu/Developer/mlc-llm -DCMAKE_BUILD_TYPE=Release -DCMAKE_SYSTEM_NAME=iOS -DCMAKE_SYSTEM_VERSION=14.0 -DCMAKE_OSX_SYSROOT=iphoneos -DCMAKE_OSX_ARCHITECTURES=arm64 -DCMAKE_OSX_DEPLOYMENT_TARGET=14.0 -DCMAKE_BUILD_WITH_INSTALL_NAME_DIR=ON -DCMAKE_SKIP_INSTALL_ALL_DEPENDENCY=ON -DCMAKE_INSTALL_PREFIX=. -DCMAKE_CXX_FLAGS=-O3 -DMLC_LLM_INSTALL_STATIC_LIB=ON -DUSE_METAL=ON
-- The C compiler identification is AppleClang 17.0.0.17000013
-- The CXX compiler identification is AppleClang 17.0.0.17000013
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Hide private symbols
-- TVM_SOURCE_DIR: /Users/himanshu/Developer/mlc-llm/3rdparty/tvm
-- Hide private symbols...
-- Forbidding undefined symbols in shared library, using -Wl,-undefined,error on platform iOS
-- Didn't find the path to CCACHE, disabling ccache
-- Performing Test SUPPORT_CXX17
-- Performing Test SUPPORT_CXX17 - Success
-- Build with Metal support
-- Build with contrib.random
-- Build with contrib.sort
-- Git found: /opt/homebrew/bin/git
-- Found TVM_GIT_COMMIT_HASH=9c894f78fdef156263ced19eed67e79203ca4a11
-- Found TVM_GIT_COMMIT_TIME=2025-03-18 15:11:04 -0400
-- Building with TVM Map...
-- Build with thread support...
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
-- Found Python: /opt/anaconda3/envs/ml_env/bin/python3.11 (found version "3.11.11") found components: Interpreter
-- /Users/himanshu/Developer/mlc-llm/ios/MLCChat/build/tvm
Add Cython build into the default build step
-- Build without FlashInfer
-- system-nameiOS
CMake Error at 3rdparty/tokenizers-cpp/msgpack/CMakeLists.txt:1 (CMAKE_MINIMUM_REQUIRED):
  Compatibility with CMake < 3.5 has been removed from CMake.

  Update the VERSION argument <min> value.  Or, use the <min>...<max> syntax
  to tell CMake that the project requires at least <min> but has been updated
  to work with policies introduced by <max> or earlier.

  Or, add -DCMAKE_POLICY_VERSION_MINIMUM=3.5 to try configuring anyway.


-- Configuring incomplete, errors occurred!
Traceback (most recent call last):
  File "/opt/anaconda3/envs/ml_env/bin/mlc_llm", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/opt/anaconda3/envs/ml_env/lib/python3.11/site-packages/mlc_llm/__main__.py", line 54, in main
    cli.main(sys.argv[2:])
  File "/opt/anaconda3/envs/ml_env/lib/python3.11/site-packages/mlc_llm/cli/package.py", line 64, in main
    package(
  File "/opt/anaconda3/envs/ml_env/lib/python3.11/site-packages/mlc_llm/interface/package.py", line 363, in package
    build_iphone_binding(mlc_llm_source_dir, output)
  File "/opt/anaconda3/envs/ml_env/lib/python3.11/site-packages/mlc_llm/interface/package.py", line 310, in build_iphone_binding
    subprocess.run(
  File "/opt/anaconda3/envs/ml_env/lib/python3.11/subprocess.py", line 571, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['bash', PosixPath('/Users/himanshu/Developer/mlc-llm/ios/prepare_libs.sh')]' returned non-zero exit status 1.
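For what it's worth, the error text itself suggests two escape hatches. A hedged sketch of both; paths follow the logs above, and the substitution is demonstrated on a sample line rather than the real file:

```shell
# Option A (the flag the error itself suggests): append the policy floor to
# the cmake invocation inside ios/prepare_libs.sh:
#   cmake "$SOURCE_DIR" ... -DCMAKE_POLICY_VERSION_MINIMUM=3.5
# ("$SOURCE_DIR" stands in for whatever variable the script actually uses.)

# Option B: raise the declared minimum inside the vendored msgpack submodule
# (3rdparty/tokenizers-cpp/msgpack/CMakeLists.txt, line 1 per the error).
bump_min() { sed 's/MINIMUM_REQUIRED *( *VERSION [0-9][0-9.]*/MINIMUM_REQUIRED(VERSION 3.5/'; }

# Demonstrated on a sample line:
echo 'CMAKE_MINIMUM_REQUIRED (VERSION 3.1.3)' | bump_min
# prints: CMAKE_MINIMUM_REQUIRED(VERSION 3.5)

# A real run would be: bump_min < 3rdparty/tokenizers-cpp/msgpack/CMakeLists.txt
```

Pinning CMake below 4 (e.g. 3.29) would presumably also avoid the error without touching the submodule.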

I also ran the following command directly to confirm whether prepare_libs.sh works on its own:

bash prepare_libs.sh

OUTPUT:

+ sysroot=iphoneos
+ type=Release
+ '[' false = true ']'
+ rustup target add aarch64-apple-ios
info: component 'rust-std' for target 'aarch64-apple-ios' is up to date
+ mkdir -p build/
+ cd build/
+ cmake /Users/himanshu/Developer/mlc-llm -DCMAKE_BUILD_TYPE=Release -DCMAKE_SYSTEM_NAME=iOS -DCMAKE_SYSTEM_VERSION=14.0 -DCMAKE_OSX_SYSROOT=iphoneos -DCMAKE_OSX_ARCHITECTURES=arm64 -DCMAKE_OSX_DEPLOYMENT_TARGET=14.0 -DCMAKE_BUILD_WITH_INSTALL_NAME_DIR=ON -DCMAKE_SKIP_INSTALL_ALL_DEPENDENCY=ON -DCMAKE_INSTALL_PREFIX=. -DCMAKE_CXX_FLAGS=-O3 -DMLC_LLM_INSTALL_STATIC_LIB=ON -DUSE_METAL=ON
-- The C compiler identification is AppleClang 17.0.0.17000013
-- The CXX compiler identification is AppleClang 17.0.0.17000013
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Hide private symbols
-- TVM_SOURCE_DIR: /Users/himanshu/Developer/mlc-llm/3rdparty/tvm
-- Hide private symbols...
-- Forbidding undefined symbols in shared library, using -Wl,-undefined,error on platform iOS
-- Didn't find the path to CCACHE, disabling ccache
-- Performing Test SUPPORT_CXX17
-- Performing Test SUPPORT_CXX17 - Success
-- Build with Metal support
-- Build with contrib.random
-- Build with contrib.sort
-- Git found: /opt/homebrew/bin/git
-- Found TVM_GIT_COMMIT_HASH=9c894f78fdef156263ced19eed67e79203ca4a11
-- Found TVM_GIT_COMMIT_TIME=2025-03-18 15:11:04 -0400
-- Building with TVM Map...
-- Build with thread support...
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
-- Found Python: /opt/anaconda3/envs/ml_env/bin/python3.11 (found version "3.11.11") found components: Interpreter
-- /Users/himanshu/Developer/mlc-llm/ios/build/tvm
Add Cython build into the default build step
-- Build without FlashInfer
-- system-nameiOS
CMake Error at 3rdparty/tokenizers-cpp/msgpack/CMakeLists.txt:1 (CMAKE_MINIMUM_REQUIRED):
  Compatibility with CMake < 3.5 has been removed from CMake.

  Update the VERSION argument <min> value.  Or, use the <min>...<max> syntax
  to tell CMake that the project requires at least <min> but has been updated
  to work with policies introduced by <max> or earlier.

  Or, add -DCMAKE_POLICY_VERSION_MINIMUM=3.5 to try configuring anyway.


-- Configuring incomplete, errors occurred!

FILES GENERATED in build:

mlc-llm\ios\MLCChat\build
│   CMakeCache.txt
│   TVMBuildOptions.txt
│
├───CMakeFiles
│   │   cmake.check_cache
│   │   cmake.verify_globs
│   │   CMakeConfigureLog.yaml
│   │   VerifyGlobs.cmake
│   │
│   ├───4.0.1
│   │   │   CMakeCCompiler.cmake
│   │   │   CMakeCXXCompiler.cmake
│   │   │   CMakeDetermineCompilerABI_C.bin
│   │   │   CMakeDetermineCompilerABI_CXX.bin
│   │   │   CMakeSystem.cmake
│   │   │
│   │   ├───CompilerIdC
│   │   │   │   CMakeCCompilerId.c
│   │   │   │   a.out
│   │   │   │
│   │   │   └───tmp
│   │   └───CompilerIdCXX
│   │       │   CMakeCXXCompilerId.cpp
│   │       │   a.out
│   │       │
│   │       └───tmp
│   ├───CMakeFiles
│   │   └───CMakeTmp
│   ├───CMakeScratch
│   ├───CMakeTmp
│   └───pkgRedirects
├───tokenizers
│   ├───CMakeFiles
│   └───msgpack
│       └───CMakeFiles
└───tvm
    │   temp_config_file.cmake
    │   tvmConfig.cmake
    └───CMakeFiles

Expected behavior

The following command should run successfully:

mlc_llm package

Environment

  • Platform: iOS
  • Operating system: macOS Sequoia 15.3.2
  • Device: Apple MacBook Pro (M2 Pro)
  • How you installed MLC-LLM (conda, source): git clone https://github.com/mlc-ai/mlc-llm.git
  • How you installed TVM-Unity (pip, source): prebuilt Python package, python -m pip install --pre -U -f https://mlc.ai/wheels mlc-ai-nightly-cpu
  • Python version (e.g. 3.10): 3.11.11
  • GPU driver version (if applicable):
  • CUDA/cuDNN version (if applicable):
  • TVM Unity Hash Tag (python -c "import tvm; print('\n'.join(f'{k}: {v}' for k, v in tvm.support.libinfo().items()))", applicable if you compile models):
  • Any other relevant information:

Additional context

My 3rdparty/tokenizers-cpp/CMakeLists.txt:

cmake_minimum_required(VERSION 3.18)
project(tokenizers_cpp C CXX)

set(CMAKE_CXX_STANDARD 17)
set(CMAKE_CXX_STANDARD_REQUIRED ON)
set(CMAKE_CXX_EXTENSIONS OFF)

include(FetchContent)

# update to contain more rust flags
set(TOKENIZERS_CPP_RUST_FLAGS "")
set(TOKENIZERS_CPP_CARGO_TARGET "")

# extra link libraries
set(TOKENIZERS_CPP_LINK_LIBS "")
set(TOKENIZERS_C_LINK_LIBS "")
set(CARGO_EXTRA_ENVS "")
message(STATUS "system-name" ${CMAKE_SYSTEM_NAME})

if (CMAKE_SYSTEM_NAME STREQUAL "Linux")
  list(APPEND TOKENIZERS_C_LINK_LIBS ${CMAKE_DL_LIBS})
elseif (CMAKE_SYSTEM_NAME STREQUAL "Emscripten")
  set(TOKENIZERS_CPP_CARGO_TARGET wasm32-unknown-emscripten)
elseif (CMAKE_SYSTEM_NAME STREQUAL "iOS")
  if (CMAKE_OSX_SYSROOT MATCHES ".*iPhoneSimulator\\.platform.*")
    if(CMAKE_OSX_ARCHITECTURES MATCHES "x86_64")
      set(TOKENIZERS_CPP_CARGO_TARGET x86_64-apple-ios)
    else ()
      set(TOKENIZERS_CPP_CARGO_TARGET aarch64-apple-ios-sim)
    endif ()
  else ()
    set(TOKENIZERS_CPP_CARGO_TARGET aarch64-apple-ios)
  endif ()
  # add extra dependency needed for rust tokenizer in iOS
  find_library(FOUNDATION_LIB Foundation)
  find_library(SECURITY_LIB Security)
  list(APPEND TOKENIZERS_C_LINK_LIBS ${FOUNDATION_LIB} ${SECURITY_LIB})
elseif (CMAKE_SYSTEM_NAME STREQUAL "Darwin")
  if (CMAKE_SYSTEM_PROCESSOR STREQUAL "arm64")
    set(TOKENIZERS_CPP_CARGO_TARGET aarch64-apple-darwin)
  endif()
elseif (CMAKE_SYSTEM_NAME STREQUAL "Android")
  if (ANDROID_ABI STREQUAL "arm64-v8a")
    set(TOKENIZERS_CPP_CARGO_TARGET aarch64-linux-android)
  elseif (ANDROID_ABI STREQUAL "armeabi-v7a")
    set(TOKENIZERS_CPP_CARGO_TARGET armv7-linux-androideabi)
  elseif (ANDROID_ABI STREQUAL "x86_64")
    set(TOKENIZERS_CPP_CARGO_TARGET x86_64-linux-android)
  elseif (ANDROID_ABI STREQUAL "x86")
    set(TOKENIZERS_CPP_CARGO_TARGET i686-linux-android)
  endif()
  set(CARGO_EXTRA_ENVS
    AR_${TOKENIZERS_CPP_CARGO_TARGET}=${ANDROID_TOOLCHAIN_ROOT}/bin/llvm-ar
    CC_${TOKENIZERS_CPP_CARGO_TARGET}=${ANDROID_TOOLCHAIN_ROOT}/bin/${TOKENIZERS_CPP_CARGO_TARGET}${ANDROID_NATIVE_API_LEVEL}-clang
    CXX_${TOKENIZERS_CPP_CARGO_TARGET}=${ANDROID_TOOLCHAIN_ROOT}/bin/${TOKENIZERS_CPP_CARGO_TARGET}${ANDROID_NATIVE_API_LEVEL}-clang++
  )
elseif (CMAKE_SYSTEM_NAME STREQUAL "Windows")
  set(TOKENIZERS_CPP_CARGO_TARGET x86_64-pc-windows-msvc)
endif()

if(WIN32)
  list(APPEND TOKENIZERS_C_LINK_LIBS
    ntdll wsock32 ws2_32 Bcrypt
    iphlpapi userenv psapi
  )
endif()

set(TOKENIZERS_CPP_CARGO_FLAGS "")
set(TOKENIZERS_CPP_CARGO_TARGET_DIR ${CMAKE_CURRENT_BINARY_DIR})
set(TOKENIZERS_CPP_CARGO_BINARY_DIR ${CMAKE_CURRENT_BINARY_DIR})

if (NOT TOKENIZERS_CPP_CARGO_TARGET STREQUAL "")
    list(APPEND TOKENIZERS_CPP_CARGO_FLAGS --target ${TOKENIZERS_CPP_CARGO_TARGET})
    set(TOKENIZERS_CPP_CARGO_BINARY_DIR
        "${TOKENIZERS_CPP_CARGO_BINARY_DIR}/${TOKENIZERS_CPP_CARGO_TARGET}")
endif()

if (CMAKE_BUILD_TYPE STREQUAL "Debug")
    set(TOKENIZERS_CPP_CARGO_BINARY_DIR "${TOKENIZERS_CPP_CARGO_BINARY_DIR}/debug")
else ()
    list(APPEND TOKENIZERS_CPP_CARGO_FLAGS --release)
    set(TOKENIZERS_CPP_CARGO_BINARY_DIR "${TOKENIZERS_CPP_CARGO_BINARY_DIR}/release")
endif ()

get_filename_component(TOKENIZERS_CPP_ROOT ${CMAKE_CURRENT_LIST_FILE} DIRECTORY)
set(TOKENIZERS_CPP_CARGO_SOURCE_PATH ${TOKENIZERS_CPP_ROOT}/rust)

option(MSGPACK_USE_BOOST "Use Boost libraried" OFF)
add_subdirectory(msgpack)

option(MLC_ENABLE_SENTENCEPIECE_TOKENIZER "Enable SentencePiece tokenizer" ON)

if(MSVC)
  set(TOKENIZERS_RUST_LIB "${TOKENIZERS_CPP_CARGO_BINARY_DIR}/tokenizers_c.lib")
else()
  set(TOKENIZERS_RUST_LIB "${TOKENIZERS_CPP_CARGO_BINARY_DIR}/libtokenizers_c.a")
endif()
set(TOKENIZERS_CPP_INCLUDE ${TOKENIZERS_CPP_ROOT}/include)

# NOTE: need to use cmake -E env to be portable in win
add_custom_command(
  OUTPUT ${TOKENIZERS_RUST_LIB}
  COMMAND
  ${CMAKE_COMMAND} -E env
  CARGO_TARGET_DIR=${TOKENIZERS_CPP_CARGO_TARGET_DIR}
  ${CARGO_EXTRA_ENVS}
  RUSTFLAGS="${TOKENIZERS_CPP_RUST_FLAGS}"
  cargo build ${TOKENIZERS_CPP_CARGO_FLAGS}
  WORKING_DIRECTORY ${TOKENIZERS_CPP_CARGO_SOURCE_PATH}
  POST_BUILD COMMAND
  ${CMAKE_COMMAND} -E copy
  ${TOKENIZERS_RUST_LIB} "${CMAKE_CURRENT_BINARY_DIR}"
)

set(
  TOKENIZER_CPP_SRCS
  src/sentencepiece_tokenizer.cc
  src/huggingface_tokenizer.cc
  src/rwkv_world_tokenizer.cc
)
add_library(tokenizer_cpp_objs OBJECT ${TOKENIZER_CPP_SRCS})
target_include_directories(tokenizer_cpp_objs PRIVATE sentencepiece/src)
target_include_directories(tokenizer_cpp_objs PRIVATE msgpack/include)
target_include_directories(tokenizer_cpp_objs PUBLIC ${TOKENIZERS_CPP_INCLUDE})
if (MLC_ENABLE_SENTENCEPIECE_TOKENIZER STREQUAL "ON")
  target_compile_definitions(tokenizer_cpp_objs PUBLIC MLC_ENABLE_SENTENCEPIECE_TOKENIZER)
endif ()
target_link_libraries(tokenizer_cpp_objs PRIVATE msgpack-cxx)

# sentencepiece config
option(SPM_ENABLE_SHARED "override sentence piece config" OFF)
option(SPM_ENABLE_TCMALLOC "" OFF)
# provide macro if it does not exist in cmake system
# it is OK to skip those since we do not provide these apps in the ios
# instead just link to the sentencepiece directly
if (CMAKE_SYSTEM_NAME STREQUAL "iOS")
  macro (set_xcode_property TARGET XCODE_PROPERTY XCODE_VALUE)
      set_property (TARGET ${TARGET} PROPERTY
          XCODE_ATTRIBUTE_${XCODE_PROPERTY} ${XCODE_VALUE})
  endmacro (set_xcode_property)
endif()
add_subdirectory(sentencepiece sentencepiece EXCLUDE_FROM_ALL)

add_library(tokenizers_c INTERFACE ${TOKENIZERS_RUST_LIB})
target_link_libraries(tokenizers_c INTERFACE ${TOKENIZERS_RUST_LIB} ${TOKENIZERS_C_LINK_LIBS})

add_library(tokenizers_cpp STATIC $<TARGET_OBJECTS:tokenizer_cpp_objs>)
target_link_libraries(tokenizers_cpp PRIVATE tokenizers_c sentencepiece-static ${TOKENIZERS_CPP_LINK_LIBS})
target_include_directories(tokenizers_cpp PUBLIC ${TOKENIZERS_CPP_INCLUDE})

KingSlayer06 · Apr 15 '25 06:04