[Bug] OSError: libmlc_llm_module.so: undefined symbol: _ZN3tvm7runtime9BacktraceB5cxx11Ev
🐛 Bug
To Reproduce
Steps to reproduce the behavior:
python3 -m mlc_chat.rest
Traceback (most recent call last):
File "/usr/lib/python3.10/runpy.py", line 187, in _run_module_as_main
mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
File "/usr/lib/python3.10/runpy.py", line 110, in _get_module_details
__import__(pkg_name)
File "/mnt/f/mlc-llm/python/mlc_chat/__init__.py", line 6, in <module>
from .chat_module import ChatModule
File "/mnt/f/mlc-llm/python/mlc_chat/chat_module.py", line 26, in <module>
_LIB, _LIB_PATH = _load_mlc_llm_lib()
File "/mnt/f/mlc-llm/python/mlc_chat/chat_module.py", line 23, in _load_mlc_llm_lib
return ctypes.CDLL(lib_path[0]), lib_path[0]
File "/usr/lib/python3.10/ctypes/__init__.py", line 374, in __init__
self._handle = _dlopen(self._name, mode)
OSError: /usr/local/lib/libmlc_llm_module.so: undefined symbol: _ZN3tvm7runtime9BacktraceB5cxx11Ev
file /usr/local/lib/libmlc*so
/usr/local/lib/libmlc_llm.so: ELF 64-bit LSB shared object, x86-64, version 1 (GNU/Linux), dynamically linked, BuildID[sha1]=30608b4b91245e4fbbdac0d122c751e29f19a5ca, with debug_info, not stripped
/usr/local/lib/libmlc_llm_module.so: ELF 64-bit LSB shared object, x86-64, version 1 (GNU/Linux), dynamically linked, BuildID[sha1]=dc2b2f3176b5790399ade5abae866fe45f73522a, with debug_info, not stripped
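For diagnosing the missing symbol, the mangled name can be demangled and searched for directly; a sketch, assuming the library paths shown above:
# Demangles to tvm::runtime::Backtrace[abi:cxx11]()
c++filt _ZN3tvm7runtime9BacktraceB5cxx11Ev
# 'T' means the library defines the symbol; 'U' means it only references it
nm -D /usr/local/lib/libtvm_runtime.so | grep Backtrace
nm -D /usr/local/lib/libmlc_llm_module.so | grep Backtrace
# List the shared libraries the module resolves at load time
ldd /usr/local/lib/libmlc_llm_module.so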
Expected behavior
Environment
- Platform (e.g. WebGPU/Vulkan/IOS/Android/CUDA): CUDA
- Operating system (e.g. Ubuntu/Windows/MacOS/...): Windows/WSL
- Device (e.g. iPhone 12 Pro, PC+RTX 3090, ...): 3070ti
- How you installed MLC-LLM (conda, source): source
- How you installed TVM-Unity (pip, source): source
- Python version (e.g. 3.10):
- GPU driver version (if applicable): 11.8
- CUDA/cuDNN version (if applicable):
- TVM Unity Hash Tag (python -c "import tvm; print('\n'.join(f'{k}: {v}' for k, v in tvm.support.libinfo().items()))", applicable if you compile models):
USE_GTEST: AUTO
SUMMARIZE: OFF
USE_IOS_RPC: OFF
CUDA_VERSION: 11.8
USE_LIBBACKTRACE: AUTO
DLPACK_PATH: 3rdparty/dlpack/include
USE_TENSORRT_CODEGEN: OFF
USE_THRUST: OFF
USE_TARGET_ONNX: OFF
USE_AOT_EXECUTOR: ON
BUILD_DUMMY_LIBTVM: OFF
USE_CUDNN: ON
USE_TENSORRT_RUNTIME: OFF
USE_ARM_COMPUTE_LIB_GRAPH_EXECUTOR: OFF
USE_CCACHE: AUTO
USE_ARM_COMPUTE_LIB: /opt/arm/acl
USE_CPP_RTVM:
USE_OPENCL_GTEST: /path/to/opencl/gtest
USE_MKL: OFF
USE_PT_TVMDSOOP: OFF
USE_CLML: OFF
USE_STACKVM_RUNTIME: OFF
USE_GRAPH_EXECUTOR_CUDA_GRAPH: OFF
ROCM_PATH: /opt/rocm
USE_DNNL: OFF
USE_VITIS_AI: OFF
USE_LLVM: llvm-config --ignore-libllvm --link-static
USE_VERILATOR: OFF
USE_TF_TVMDSOOP: OFF
USE_THREADS: ON
USE_MSVC_MT: OFF
BACKTRACE_ON_SEGFAULT: OFF
USE_GRAPH_EXECUTOR: ON
USE_ROCBLAS: OFF
GIT_COMMIT_HASH: 6fd55bcfecc7abcc707339d7a8ba493f0048b613
USE_VULKAN: OFF
USE_RUST_EXT: OFF
USE_CUTLASS: OFF
USE_CPP_RPC: OFF
USE_HEXAGON: OFF
USE_CUSTOM_LOGGING: OFF
USE_UMA: OFF
USE_FALLBACK_STL_MAP: OFF
USE_SORT: ON
USE_RTTI: ON
GIT_COMMIT_TIME: 2023-06-05 12:18:09 -0700
USE_HEXAGON_SDK: /path/to/sdk
USE_BLAS: none
USE_ETHOSN: /opt/arm/ethosn-driver
USE_LIBTORCH: OFF
USE_RANDOM: ON
USE_CUDA: ON
USE_COREML: OFF
USE_AMX: OFF
BUILD_STATIC_RUNTIME: OFF
USE_CMSISNN: ON
USE_KHRONOS_SPIRV: OFF
USE_CLML_GRAPH_EXECUTOR: OFF
USE_TFLITE: OFF
USE_HEXAGON_GTEST: /path/to/hexagon/gtest
PICOJSON_PATH: 3rdparty/picojson
USE_OPENCL_ENABLE_HOST_PTR: OFF
INSTALL_DEV: OFF
USE_PROFILER: ON
USE_NNPACK: OFF
LLVM_VERSION: 10.0.1
USE_OPENCL: OFF
COMPILER_RT_PATH: 3rdparty/compiler-rt
RANG_PATH: 3rdparty/rang/include
USE_SPIRV_KHR_INTEGER_DOT_PRODUCT: OFF
USE_OPENMP: OFF
USE_BNNS: OFF
USE_CUBLAS: ON
USE_METAL: OFF
USE_MICRO_STANDALONE_RUNTIME: ON
USE_HEXAGON_EXTERNAL_LIBS: OFF
USE_ALTERNATIVE_LINKER: AUTO
USE_BYODT_POSIT: OFF
USE_HEXAGON_RPC: OFF
USE_MICRO: ON
DMLC_PATH: 3rdparty/dmlc-core/include
INDEX_DEFAULT_I64: ON
USE_RELAY_DEBUG: OFF
USE_RPC: ON
USE_TENSORFLOW_PATH: none
TVM_CLML_VERSION:
USE_MIOPEN: OFF
USE_ROCM: OFF
USE_PAPI: OFF
USE_CURAND: OFF
TVM_CXX_COMPILER_PATH: /opt/rh/devtoolset-9/root/usr/bin/c++
HIDE_PRIVATE_SYMBOLS: O
- Any other relevant information:
Additional context
(The same traceback as shown under "To Reproduce" above.)
Interesting. Missing symbols might mean there is some misconfiguration in the compilation process. Would you mind sharing how you compiled the shared library?
Sure, like this:
git clone git@github.com:mlc-ai/mlc-llm.git --recursive
pip install torch transformers ninja --upgrade
pip install -I mlc_ai_nightly -f https://mlc.ai/wheels --upgrade
pip install --pre mlc-ai-nightly-cu118 -f https://mlc.ai/wheels
python -c "import tvm; tvm.support.describe()"
git lfs install
mkdir -p build
apt install cargo
bash scripts/prep_deps.sh
python3 cmake/gen_cmake_config.py
cp config.cmake build/
cd build/
export TVM_HOME=/mnt/f/mlc-llm/3rdparty/tvm
cmake ..
cd ..
make -j$(nproc)
make install
ldconfig
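Note that make is invoked from the repository root here (after cd ..), while cmake generated its Makefile inside build/. A conventional out-of-tree flow, sketched below under the same config.cmake, keeps the build inside build/:
cd build
cmake ..
# Build and install from the directory where cmake generated the Makefile
make -j$(nproc)
make install
ldconfig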
mlc_chat_cli --model dolly-v2-3b
Use MLC config: "/mnt/f/mlc-llm/dist/dolly-v2-3b-q3f16_0/params/mlc-chat-config.json"
Use model weights: "/mnt/f/mlc-llm/dist/dolly-v2-3b-q3f16_0/params/ndarray-cache.json"
Use model library: "/mnt/f/mlc-llm/dist/dolly-v2-3b-q3f16_0/dolly-v2-3b-q3f16_0-cuda.so"
You can use the following special commands:
/help print the special commands
/exit quit the cli
/stats print out the latest stats (token/sec)
/reset restart a fresh chat
/reload [local_id] reload model `local_id` from disk, or reload the current model if `local_id` is not specified
Loading model...
[11:48:32] /mnt/f/mlc-llm/cpp/llm_chat.cc:885:
---------------------------------------------------------------
An error occurred during the execution of TVM.
For more information, please see: https://tvm.apache.org/docs/errors.html
---------------------------------------------------------------
Check failed: (fclear_memory_manager) is false: Cannot find env function vm.builtin.memory_manager.clear
Stack trace:
[bt] (0) /usr/local/lib/libtvm_runtime.so(tvm::runtime::Backtrace[abi:cxx11]()+0x2c) [0x7f69fc1bf46c]
[bt] (1) mlc_chat_cli(tvm::runtime::detail::LogFatal::Entry::Finalize()+0x3b) [0x56099d777cbb]
[bt] (2) /usr/local/lib/libmlc_llm.so(+0x1a4fa6) [0x7f69fc4b9fa6]
[bt] (3) /usr/local/lib/libmlc_llm.so(+0x1c0c34) [0x7f69fc4d5c34]
[bt] (4) mlc_chat_cli(+0x15079) [0x56099d77b079]
[bt] (5) mlc_chat_cli(+0xec76) [0x56099d774c76]
[bt] (6) mlc_chat_cli(+0x9a20) [0x56099d76fa20]
[bt] (7) /usr/lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x7f69fbcc3d90]
[bt] (8) /usr/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80) [0x7f69fbcc3e40]
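Whether the runtime that gets loaded actually registers this builtin can be checked from Python with tvm.get_global_func; a sketch (allow_missing=True returns None instead of raising when the function is absent):
python3 -c "import tvm; print(tvm.get_global_func('vm.builtin.memory_manager.clear', allow_missing=True))"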
Hi @Ox0400, would you try running the following commands and paste your output?
cd build/
ldd libmlc_llm_module.so
nm tvm/libtvm.so | grep _ZN3tvm7runtime9BacktraceB5cxx11Ev
python -c "import tvm; print(tvm.__file__)"
python -c "import tvm; print(tvm._ffi.base._LIB)"
python -c "import tvm; print('\n'.join(f'{k}: {v}' for k, v in tvm.support.libinfo().items()))"
By the way, it seems you have installed both mlc_ai_nightly and mlc-ai-nightly-cu118. You actually only need one of them: please uninstall mlc_ai_nightly (that one is built for CPU-only use cases) if you want to use TVM-Unity with CUDA support. Installing both may cause confusion.
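Concretely, that cleanup might look like this (a sketch; the wheel index URL is the one used above):
pip uninstall mlc-ai-nightly
# Reinstall the CUDA build cleanly so its libtvm is the only one present
pip install --pre --force-reinstall mlc-ai-nightly-cu118 -f https://mlc.ai/wheels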
@yzh119 I fixed this error in my local environment. I needed to copy libtvm.so to /usr/local/lib/libtvm.so. I uninstalled both mlc_ai_nightly and mlc-ai-nightly-cu118, reinstalled mlc-ai-nightly-cu118, then rebuilt, and it worked.
But then, after uninstalling both packages and reinstalling mlc-ai-nightly-cu118 once more, a new error was raised, like this:
python3 -m mlc_chat.rest
Traceback (most recent call last):
File "/usr/lib/python3.10/runpy.py", line 187, in _run_module_as_main
mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
File "/usr/lib/python3.10/runpy.py", line 110, in _get_module_details
__import__(pkg_name)
File "/home/zhipeng/.local/lib/python3.10/site-packages/mlc_chat/__init__.py", line 6, in <module>
from .chat_module import ChatModule
File "/home/zhipeng/.local/lib/python3.10/site-packages/mlc_chat/chat_module.py", line 7, in <module>
import tvm
File "/home/zhipeng/.local/lib/python3.10/site-packages/tvm/__init__.py", line 33, in <module>
from .runtime.object import Object
File "/home/zhipeng/.local/lib/python3.10/site-packages/tvm/runtime/__init__.py", line 22, in <module>
from .object_path import ObjectPath, ObjectPathPair
File "/home/zhipeng/.local/lib/python3.10/site-packages/tvm/runtime/object_path.py", line 44, in <module>
class ObjectPath(Object):
File "/home/zhipeng/.local/lib/python3.10/site-packages/tvm/_ffi/registry.py", line 69, in register
check_call(_LIB.TVMObjectTypeKey2Index(c_str(object_name), ctypes.byref(tidx)))
File "/home/zhipeng/.local/lib/python3.10/site-packages/tvm/_ffi/base.py", line 348, in check_call
raise get_last_ffi_error()
tvm._ffi.base.TVMError: Traceback (most recent call last):
[bt] (4) /usr/local/lib/libtvm.so(TVMObjectTypeKey2Index+0x73) [0x7f4202298863]
[bt] (3) /usr/local/lib/libtvm.so(tvm::runtime::Object::TypeKey2Index(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0xdb) [0x7f42022987db]
[bt] (2) /usr/local/lib/libtvm.so(+0xcc336) [0x7f4202298336]
[bt] (1) /usr/local/lib/libtvm.so(tvm::runtime::detail::LogFatal::Entry::Finalize()+0x3b) [0x7f420224fc5b]
[bt] (0) /usr/local/lib/libtvm.so(tvm::runtime::Backtrace[abi:cxx11]()+0x2c) [0x7f420227f46c]
File "/home/zhipeng/mlc-llm/3rdparty/tvm/src/runtime/object.cc", line 165
TVMError:
---------------------------------------------------------------
An error occurred during the execution of TVM.
For more information, please see: https://tvm.apache.org/docs/errors.html
---------------------------------------------------------------
Check failed: (it != type_key2index_.end()) is false: Cannot find type ObjectPath. Did you forget to register the node by TVM_REGISTER_NODE_TYPE ?
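The frames above resolve to /usr/local/lib/libtvm.so rather than the copy shipped inside the wheel; which file Python actually loaded can be confirmed with the command suggested earlier:
python3 -c "import tvm; print(tvm._ffi.base._LIB)"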
It is interesting that this happens in WSL. @Ox0400, you likely don't want to install libtvm.so into your system, since the library is already packaged along with the Python nightly package; try removing that and try again.
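A minimal sketch of that cleanup, assuming the stray copy lives in /usr/local/lib as the traces suggest:
sudo rm /usr/local/lib/libtvm.so
sudo ldconfig
# tvm should now load from site-packages, not /usr/local/lib
python3 -c "import tvm; print(tvm.__file__)"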
@tqchen Thank you very much.