
[Bug] mlc-llm/cpp/conv_templates.cc:156: Unknown conversation template: dolly

Open felixslu opened this issue 1 year ago • 5 comments

🐛 Bug

To Reproduce

1. Compile the model (this step succeeds):

   python3 build.py --hf-path databricks/dolly-v2-3b --quantization q3f16_0

2. Compile mlc_chat_cli (this step succeeds):

   cd build
   cmake ..
   make

3. Run mlc_chat_cli (the error occurs here):

   ./build/mlc_chat_cli --device-name cpu --local-id dolly-v2-3b-q3f16_0

Expected behavior

(mlc-llm) root@Precision-3660:~/mlc-llm# ./build/mlc_chat_cli --device-name cpu --local-id dolly-v2-3b-q3f16_0
Use MLC config: "~/mlc-llm/dist/dolly-v2-3b-q3f16_0/params/mlc-chat-config.json"
Use model weights: "~/mlc-llm/dist/dolly-v2-3b-q3f16_0/params/ndarray-cache.json"
Use model library: "~/mlc-llm/dist/dolly-v2-3b-q3f16_0/dolly-v2-3b-q3f16_0-cpu.so"
You can use the following special commands:
  /help               print the special commands
  /exit               quit the cli
  /stats              print out the latest stats (token/sec)
  /reset              restart a fresh chat
  /reload [local_id]  reload model local_id from disk, or reload the current model if local_id is not specified

Loading model...
[20:31:27] ~/mlc-llm/cpp/conv_templates.cc:156: Unknown conversation template: dolly
Stack trace:
  [bt] (0) ~/mlc-llm/build/tvm/libtvm_runtime.so(tvm::runtime::Backtrace[abi:cxx11]()+0x2c) [0x7f1448666bdc]
  [bt] (1) ./build/mlc_chat_cli(tvm::runtime::detail::LogFatal::Entry::Finalize()+0x3d) [0x55f18bf3c48d]
  [bt] (2) ~/mlc-llm/build/libmlc_llm.so(mlc::llm::Conversation::FromTemplate(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x107) [0x7f144895c367]
  [bt] (3) ~/mlc-llm/build/libmlc_llm.so(mlc::llm::LLMChat::LoadJSONOverride(picojson::value const&, bool)+0x477) [0x7f144897d567]
  [bt] (4) ~/mlc-llm/build/libmlc_llm.so(mlc::llm::LLMChat::LoadJSONOverride(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool)+0x17c) [0x7f144898688c]
  [bt] (5) ~/mlc-llm/build/libmlc_llm.so(mlc::llm::LLMChat::Reload(tvm::runtime::Module, tvm::runtime::String, tvm::runtime::String)+0xc8e) [0x7f144898764e]
  [bt] (6) ~/mlc-llm/build/libmlc_llm.so(mlc::llm::LLMChatModule::GetFunction(tvm::runtime::String const&, tvm::runtime::ObjectPtr<tvm::runtime::Object> const&)::{lambda(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)#1}::operator()(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const+0xa32) [0x7f14489895e2]
  [bt] (7) ./build/mlc_chat_cli(+0x19e44) [0x55f18bf40e44]
  [bt] (8) ./build/mlc_chat_cli(+0xf1af) [0x55f18bf361af]

Environment

  • Platform: Intel
  • Operating system: Ubuntu
  • Device (PC+RTX 3090, ...):
  • How you installed MLC-LLM: source
  • How you installed TVM-Unity: pip
  • Python version (e.g. 3.11):
  • GPU driver version (if applicable):
  • CUDA/cuDNN version (if applicable):
  • TVM Unity Hash Tag (python -c "import tvm; print('\n'.join(f'{k}: {v}' for k, v in tvm.support.libinfo().items()))", applicable if you compile models):
  • Any other relevant information:

Additional context

[screenshot of the error output attached]

felixslu (Jun 07 '23 12:06)

I have resolved it by following the advice in the issue linked below. Thanks!

1. cat dist/dolly-v2-3b-q3f16_0/params/mlc-chat-config.json
2. Change the conversation template from "dolly" to "vicuna_v1.1" (see the sketch below).
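
A rough sketch of that edit (the exact JSON field name is assumed to be conv_template here; check your own mlc-chat-config.json):

    # inspect the generated chat config
    cat dist/dolly-v2-3b-q3f16_0/params/mlc-chat-config.json
    # then edit the conversation-template entry, roughly:
    #   "conv_template": "dolly"   ->   "conv_template": "vicuna_v1.1"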

https://github.com/mlc-ai/mlc-llm/issues/257

felixslu (Jun 07 '23 13:06)

Hi @felixslu, this is not the correct way to fix it, because dolly uses a different conversation template from vicuna. We merged pull request #341 yesterday, which you can pull if you build mlc-llm from source; otherwise you can install the latest mlc-chat-nightly from conda, which also incorporates the change.
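
A rough sketch of both options (the conda channel and package names below follow the comment above and may have changed since; check the install docs for the current ones):

    # Option 1: update and rebuild mlc-llm from source to pick up the fix from #341
    cd mlc-llm
    git pull
    cd build && cmake .. && make

    # Option 2: install the latest nightly CLI from conda
    conda install -c mlc-ai -c conda-forge mlc-chat-nightly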

yzh119 (Jun 07 '23 15:06)

How is the dolly-v2-3b-q3f16_0-cpu.so file generated, please?

yuwenjun1988 (Jun 08 '23 12:06)

You can follow this tutorial: https://mlc.ai/mlc-llm/docs/tutorials/compile-models.html
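
For reference, a sketch based on step 1 of the reproduction above; after build.py finishes, the compiled library should appear under dist/ (for example dist/dolly-v2-3b-q3f16_0/dolly-v2-3b-q3f16_0-cpu.so, as in the log above). The --target value for a CPU build is an assumption here; check python3 build.py --help for the supported targets.

    # compile the model; the library (*.so) is written next to the quantized weights in dist/
    python3 build.py --hf-path databricks/dolly-v2-3b --quantization q3f16_0 --target llvm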

yzh119 (Jun 08 '23 20:06)

I followed this tutorial: https://mlc.ai/mlc-llm/docs/tutorials/compile-models.html, but it did not generate the .so file.

yuwenjun1988 (Jun 09 '23 02:06)