[Bug] mlc-llm/cpp/conv_templates.cc:156: Unknown conversation template: dolly
🐛 Bug
To Reproduce
1. Compile the model (works): python3 build.py --hf-path databricks/dolly-v2-3b --quantization q3f16_0
2. Build mlc_chat_cli (works): cd build && cmake .. && make
3. Run mlc_chat_cli (fails here): ./build/mlc_chat_cli --device-name cpu --local-id dolly-v2-3b-q3f16_0
Expected behavior
(mlc-llm) root@Precision-3660:~/mlc-llm# ./build/mlc_chat_cli --device-name cpu --local-id dolly-v2-3b-q3f16_0
Use MLC config: "~/mlc-llm/dist/dolly-v2-3b-q3f16_0/params/mlc-chat-config.json"
Use model weights: "~/mlc-llm/dist/dolly-v2-3b-q3f16_0/params/ndarray-cache.json"
Use model library: "~/mlc-llm/dist/dolly-v2-3b-q3f16_0/dolly-v2-3b-q3f16_0-cpu.so"
You can use the following special commands:
/help print the special commands
/exit quit the cli
/stats print out the latest stats (token/sec)
/reset restart a fresh chat
/reload [local_id] reload model local_id
from disk, or reload the current model if local_id
is not specified
Loading model...
[20:31:27] ~/mlc-llm/cpp/conv_templates.cc:156: Unknown conversation template: dolly
Stack trace:
[bt] (0) ~/mlc-llm/build/tvm/libtvm_runtime.so(tvm::runtime::Backtraceabi:cxx11+0x2c) [0x7f1448666bdc]
[bt] (1) ./build/mlc_chat_cli(tvm::runtime::detail::LogFatal::Entry::Finalize()+0x3d) [0x55f18bf3c48d]
[bt] (2) ~/mlc-llm/build/libmlc_llm.so(mlc::llm::Conversation::FromTemplate(std::__cxx11::basic_string<char, std::char_traits
Environment
- Platform: Intel
- Operating system: Ubuntu
- Device: PC + RTX 3090
- How you installed MLC-LLM: source
- How you installed TVM-Unity: pip
- Python version (e.g. 3.11):
- GPU driver version (if applicable):
- CUDA/cuDNN version (if applicable):
- TVM Unity Hash Tag (python -c "import tvm; print('\n'.join(f'{k}: {v}' for k, v in tvm.support.libinfo().items()))", applicable if you compile models):
- Any other relevant information:
Additional context
I have resolved it by following the advice in the issue below, thanks!
1. cat dist/dolly-v2-3b-q3f16_0/params/mlc-chat-config.json
2. Change "dolly" to "vicuna_v1.1" (see the sketch after the link below)
https://github.com/mlc-ai/mlc-llm/issues/257
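For context, the edit looks roughly like this. This is only a sketch: the exact field name and layout of mlc-chat-config.json may differ in your version, and as the reply below explains, swapping the template name is not the proper fix.

cat dist/dolly-v2-3b-q3f16_0/params/mlc-chat-config.json
# ... "conv_template": "dolly" ...   (field name assumed; check your file)
# Changing "dolly" to "vicuna_v1.1" makes the CLI start, but it applies
# vicuna's prompt format to a dolly model; see the reply below for the proper fix.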
Hi @felixslu, this is not the correct way to fix it, because dolly uses a different conversation template from vicuna... We merged pull request #341 yesterday, which you can pull in if you build mlc-llm from source; otherwise you can install the latest mlc-chat-nightly from conda, which also incorporates this change.
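Concretely, the two options look roughly like this (a sketch; the conda channel and package names follow the install docs from that time and may have changed since):

# Option 1: rebuild from source with the dolly template fix from #341
cd ~/mlc-llm && git pull
cd build && cmake .. && make

# Option 2: install the latest prebuilt CLI from conda
conda install -c mlc-ai -c conda-forge mlc-chat-nightly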
How is the dolly-v2-3b-q3f16_0-cpu.so file generated, please?
You can follow this tutorial: https://mlc.ai/mlc-llm/docs/tutorials/compile-models.html
I followed this tutorial (https://mlc.ai/mlc-llm/docs/tutorials/compile-models.html), but it did not generate the .so file.
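For reference, the log at the top of this issue shows where the model library is expected to land after the build step; a quick check (paths taken from that log, adjust the local-id and quantization to your build):

ls dist/dolly-v2-3b-q3f16_0/
# expected, among other files:
#   params/mlc-chat-config.json
#   params/ndarray-cache.json
#   dolly-v2-3b-q3f16_0-cpu.so   (the model library mlc_chat_cli loads)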