mlc-llm
[Bug] Run RedPajama model failed on MacOS
🐛 Bug
When I followed the instructions to run the CLI app on my MacBook, the RedPajama model loaded successfully, but an error occurred at the stage of running the system prompts.
To Reproduce
Steps to reproduce the behavior:
- conda install -c mlc-ai -c conda-forge mlc-chat-nightly --force-reinstall
- mkdir -p dist/prebuilt
- git clone https://github.com/mlc-ai/binary-mlc-llm-libs.git dist/prebuilt/lib
- cd dist/prebuilt
- git clone https://huggingface.co/mlc-ai/mlc-chat-RedPajama-INCITE-Chat-3B-v1-q4f16_0
- cd ../..
- mlc_chat_cli --local-id RedPajama-INCITE-Chat-3B-v1-q4f16_0
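As a side note, a quick sanity check (hypothetical commands, assuming the directory layout produced by the steps above) can confirm that both the prebuilt Metal library and the quantized weights are in place before launching the CLI:
# prebuilt Metal library cloned from binary-mlc-llm-libs
ls dist/prebuilt/lib/RedPajama-INCITE-Chat-3B-v1-q4f16_0-metal.so
# quantized weights and chat config cloned from Hugging Face
ls dist/prebuilt/mlc-chat-RedPajama-INCITE-Chat-3B-v1-q4f16_0/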
Running the mlc_chat_cli command above then produces the following error:
You can use the following special commands:
/help print the special commands
/exit quit the cli
/stats print out the latest stats (token/sec)
/reset restart a fresh chat
/reload [local_id] reload model `local_id` from disk, or reload the current model if `local_id` is not specified
Loading model...
[20:13:36] /Users/catalyst/Workspace/mlc-chat-conda-build/tvm/src/runtime/metal/metal_device_api.mm:165: Intializing Metal device 0, name=Apple M1 Pro
Loading finished
Running system prompts...
[20:13:38] /Users/catalyst/Workspace/mlc-chat-conda-build/tvm/include/tvm/runtime/container/array.h:414:
---------------------------------------------------------------
An error occurred during the execution of TVM.
For more information, please see: https://tvm.apache.org/docs/errors.html
---------------------------------------------------------------
Check failed: (0 <= i && i < p->size_) is false: IndexError: indexing 0 on an array of size 0
Stack trace:
[bt] (0) 1 libmlc_llm.dylib 0x000000010516d510 tvm::runtime::detail::LogFatal::Entry::Finalize() + 68
[bt] (1) 2 libmlc_llm.dylib 0x000000010516d4cc tvm::runtime::detail::LogFatal::Entry::Finalize() + 0
[bt] (2) 3 libmlc_llm.dylib 0x000000010516c6e8 __clang_call_terminate + 0
[bt] (3) 4 libmlc_llm.dylib 0x00000001051874fc tvm::runtime::Array<tvm::runtime::ObjectRef, void>::operator[](long long) const + 280
[bt] (4) 5 libmlc_llm.dylib 0x00000001051866c4 mlc::llm::LLMChat::Forward(std::__1::vector<int, std::__1::allocator<int>>, long long) + 1168
[bt] (5) 6 libmlc_llm.dylib 0x0000000105188a4c mlc::llm::LLMChat::PrefillStep(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>) + 520
[bt] (6) 7 libmlc_llm.dylib 0x00000001051917a4 tvm::runtime::PackedFuncObj::Extractor<tvm::runtime::PackedFuncSubObj<mlc::llm::LLMChatModule::GetFunction(tvm::runtime::String const&, tvm::runtime::ObjectPtr<tvm::runtime::Object> const&)::'lambda13'(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)>>::Call(tvm::runtime::PackedFuncObj const*, tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) + 40
[bt] (7) 8 mlc_chat_cli 0x00000001047d2a10 ChatModule::ProcessSystemPrompts() + 204
[bt] (8) 9 mlc_chat_cli 0x00000001047d1944 Chat(ChatModule*, std::__1::__fs::filesystem::path const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, int) + 108
The mlc-chat-vicuna-v1-7b-q3f16_0 and mlc-chat-rwkv-raven-1b5-q8f16_0 models run correctly in the same environment.
Environment
- Platform (e.g. WebGPU/Vulkan/IOS/Android/CUDA): NA
- Operating system (e.g. Ubuntu/Windows/MacOS/...): MacOS 13.4
- Device (e.g. iPhone 12 Pro, PC+RTX 3090, ...): NA
- How you installed MLC-LLM (conda, source): conda
- How you installed TVM-Unity (pip, source): pip
- Python version (e.g. 3.10): 3.11
- GPU driver version (if applicable): NA
- CUDA/cuDNN version (if applicable): NA
Additional context
ls ./dist/prebuilt/lib
README.md rwkv-raven-3b-q8f16_0-vulkan.dll
RedPajama-INCITE-Chat-3B-v1-q4f16_0-metal.so rwkv-raven-3b-q8f16_0-vulkan.so
RedPajama-INCITE-Chat-3B-v1-q4f16_0-vulkan.dll rwkv-raven-7b-q8f16_0-metal.so
RedPajama-INCITE-Chat-3B-v1-q4f16_0-vulkan.so rwkv-raven-7b-q8f16_0-metal_x86_64.dylib
RedPajama-INCITE-Chat-3B-v1-q4f16_0-webgpu.wasm rwkv-raven-7b-q8f16_0-vulkan.dll
RedPajama-INCITE-Chat-3B-v1-q4f32_0-webgpu.wasm rwkv-raven-7b-q8f16_0-vulkan.so
mlc-chat.apk tvmjs_runtime_wasi.js
rwkv-raven-1b5-q8f16_0-metal.so vicuna-v1-7b-q3f16_0-metal.so
rwkv-raven-1b5-q8f16_0-metal_x86_64.dylib vicuna-v1-7b-q3f16_0-metal_x86_64.dylib
rwkv-raven-1b5-q8f16_0-vulkan.dll vicuna-v1-7b-q3f16_0-vulkan.dll
rwkv-raven-1b5-q8f16_0-vulkan.so vicuna-v1-7b-q3f16_0-vulkan.so
rwkv-raven-3b-q8f16_0-metal.so vicuna-v1-7b-q4f32_0-webgpu.wasm
rwkv-raven-3b-q8f16_0-metal_x86_64.dylib
Hi, have you solved this?
Could you try to uninstall and reinstall mlc-chat-nightly?
> Hi, have you solved this?
@sleepwalker2017 I haven't solved it.
> Could you try to uninstall and reinstall mlc-chat-nightly?
Hi @Hzfengsy, I tried reinstalling mlc-chat-nightly, but it still produces the same error.
My reinstall command is conda install -c mlc-ai -c conda-forge mlc-chat-nightly --force-reinstall,
and the installed mlc-chat-nightly package info is as follows:
# packages in environment at /Users/xm/miniconda3/envs/mlc-chat:
#
# Name Version Build Channel
libcxx 16.0.5 h4653b0c_0 conda-forge
mlc-chat-nightly 0.1.dev139 139_g0933e95_h1234567_0 mlc-ai
Thanks for the bug report. It has just been fixed in #316. You can give it another try after tomorrow's nightly build.
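A hedged sketch of picking up the fix once that nightly is published (reusing the commands already shown in this thread, assuming the same conda environment):
# reinstall to pull the newest nightly build containing the fix
conda install -c mlc-ai -c conda-forge mlc-chat-nightly --force-reinstall
# re-run the CLI against the same prebuilt model
mlc_chat_cli --local-id RedPajama-INCITE-Chat-3B-v1-q4f16_0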
@ghostsun89 does it work with the latest update?
> @ghostsun89 does it work with the latest update?
It works now. :)