mlc-llm
[Bug] Run RedPajama model failed on MacOS
🐛 Bug
When I followed the instructions to run the CLI app on my MacBook, the RedPajama model loaded successfully, but an error occurred at the stage of running the system prompts.
To Reproduce
Steps to reproduce the behavior:
- conda install -c mlc-ai -c conda-forge mlc-chat-nightly --force-reinstall
- mkdir -p dist/prebuilt
- git clone https://github.com/mlc-ai/binary-mlc-llm-libs.git dist/prebuilt/lib
- cd dist/prebuilt
- git clone https://huggingface.co/mlc-ai/mlc-chat-RedPajama-INCITE-Chat-3B-v1-q4f16_0
- cd ../..
- mlc_chat_cli --local-id RedPajama-INCITE-Chat-3B-v1-q4f16_0
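As a side note, a quick sanity check (hypothetical commands, assuming the directory layout produced by the steps above) can confirm that both the prebuilt Metal library and the quantized weights are in place before launching the CLI:
# prebuilt Metal library cloned from binary-mlc-llm-libs
ls dist/prebuilt/lib/RedPajama-INCITE-Chat-3B-v1-q4f16_0-metal.so
# quantized weights and chat config cloned from Hugging Face
ls dist/prebuilt/mlc-chat-RedPajama-INCITE-Chat-3B-v1-q4f16_0/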
Running the mlc_chat_cli command above then produces the following error:
You can use the following special commands:
/help print the special commands
/exit quit the cli
/stats print out the latest stats (token/sec)
/reset restart a fresh chat
/reload [local_id] reload model `local_id` from disk, or reload the current model if `local_id` is not specified
Loading model...
[20:13:36] /Users/catalyst/Workspace/mlc-chat-conda-build/tvm/src/runtime/metal/metal_device_api.mm:165: Intializing Metal device 0, name=Apple M1 Pro
Loading finished
Running system prompts...
[20:13:38] /Users/catalyst/Workspace/mlc-chat-conda-build/tvm/include/tvm/runtime/container/array.h:414:
---------------------------------------------------------------
An error occurred during the execution of TVM.
For more information, please see: https://tvm.apache.org/docs/errors.html
---------------------------------------------------------------
Check failed: (0 <= i && i < p->size_) is false: IndexError: indexing 0 on an array of size 0
Stack trace:
[bt] (0) 1 libmlc_llm.dylib 0x000000010516d510 tvm::runtime::detail::LogFatal::Entry::Finalize() + 68
[bt] (1) 2 libmlc_llm.dylib 0x000000010516d4cc tvm::runtime::detail::LogFatal::Entry::Finalize() + 0
[bt] (2) 3 libmlc_llm.dylib 0x000000010516c6e8 __clang_call_terminate + 0
[bt] (3) 4 libmlc_llm.dylib 0x00000001051874fc tvm::runtime::Array<tvm::runtime::ObjectRef, void>::operator[](long long) const + 280
[bt] (4) 5 libmlc_llm.dylib 0x00000001051866c4 mlc::llm::LLMChat::Forward(std::__1::vector<int, std::__1::allocator<int>>, long long) + 1168
[bt] (5) 6 libmlc_llm.dylib 0x0000000105188a4c mlc::llm::LLMChat::PrefillStep(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>) + 520
[bt] (6) 7 libmlc_llm.dylib 0x00000001051917a4 tvm::runtime::PackedFuncObj::Extractor<tvm::runtime::PackedFuncSubObj<mlc::llm::LLMChatModule::GetFunction(tvm::runtime::String const&, tvm::runtime::ObjectPtr<tvm::runtime::Object> const&)::'lambda13'(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)>>::Call(tvm::runtime::PackedFuncObj const*, tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) + 40
[bt] (7) 8 mlc_chat_cli 0x00000001047d2a10 ChatModule::ProcessSystemPrompts() + 204
[bt] (8) 9 mlc_chat_cli 0x00000001047d1944 Chat(ChatModule*, std::__1::__fs::filesystem::path const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, int) + 108
The mlc-chat-vicuna-v1-7b-q3f16_0 and mlc-chat-rwkv-raven-1b5-q8f16_0 models run correctly in the same environment.
Environment
- Platform (e.g. WebGPU/Vulkan/IOS/Android/CUDA): NA
- Operating system (e.g. Ubuntu/Windows/MacOS/...): MacOS 13.4
- Device (e.g. iPhone 12 Pro, PC+RTX 3090, ...): NA
- How you installed MLC-LLM (conda, source): conda
- How you installed TVM-Unity (pip, source): pip
- Python version (e.g. 3.10): 3.11
- GPU driver version (if applicable): NA
- CUDA/cuDNN version (if applicable): NA
Additional context
ls ./dist/prebuilt/lib
README.md rwkv-raven-3b-q8f16_0-vulkan.dll
RedPajama-INCITE-Chat-3B-v1-q4f16_0-metal.so rwkv-raven-3b-q8f16_0-vulkan.so
RedPajama-INCITE-Chat-3B-v1-q4f16_0-vulkan.dll rwkv-raven-7b-q8f16_0-metal.so
RedPajama-INCITE-Chat-3B-v1-q4f16_0-vulkan.so rwkv-raven-7b-q8f16_0-metal_x86_64.dylib
RedPajama-INCITE-Chat-3B-v1-q4f16_0-webgpu.wasm rwkv-raven-7b-q8f16_0-vulkan.dll
RedPajama-INCITE-Chat-3B-v1-q4f32_0-webgpu.wasm rwkv-raven-7b-q8f16_0-vulkan.so
mlc-chat.apk tvmjs_runtime_wasi.js
rwkv-raven-1b5-q8f16_0-metal.so vicuna-v1-7b-q3f16_0-metal.so
rwkv-raven-1b5-q8f16_0-metal_x86_64.dylib vicuna-v1-7b-q3f16_0-metal_x86_64.dylib
rwkv-raven-1b5-q8f16_0-vulkan.dll vicuna-v1-7b-q3f16_0-vulkan.dll
rwkv-raven-1b5-q8f16_0-vulkan.so vicuna-v1-7b-q3f16_0-vulkan.so
rwkv-raven-3b-q8f16_0-metal.so vicuna-v1-7b-q4f32_0-webgpu.wasm
rwkv-raven-3b-q8f16_0-metal_x86_64.dylib
Hi, have you solved this?
Could you try to uninstall and reinstall mlc-chat-nightly?
> Hi, have you solved this?
@sleepwalker2017 I haven't solved it.
> Could you try to uninstall and reinstall mlc-chat-nightly?
Hi @Hzfengsy, I tried reinstalling mlc-chat-nightly, but it still produces the same error.
My reinstall command is conda install -c mlc-ai -c conda-forge mlc-chat-nightly --force-reinstall,
and the installed mlc-chat-nightly package info is as follows:
# packages in environment at /Users/xm/miniconda3/envs/mlc-chat:
#
# Name Version Build Channel
libcxx 16.0.5 h4653b0c_0 conda-forge
mlc-chat-nightly 0.1.dev139 139_g0933e95_h1234567_0 mlc-ai
Thanks for the bug report. It has just been fixed in #316. You can give it another try after tomorrow's nightly build.
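A hedged sketch of picking up the fix once that nightly is published (reusing the commands already shown in this thread, assuming the same conda environment):
# reinstall to pull the newest nightly build containing the fix
conda install -c mlc-ai -c conda-forge mlc-chat-nightly --force-reinstall
# re-run the CLI against the same prebuilt model
mlc_chat_cli --local-id RedPajama-INCITE-Chat-3B-v1-q4f16_0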
@ghostsun89 does it work with the latest update?
> @ghostsun89 does it work with the latest update?
It works now. :)