[Bug] mlc-llm/cpp/conv_templates.cc:156: Unknown conversation template: dolly
🐛 Bug
To Reproduce
1. Compile the model (works): python3 build.py --hf-path databricks/dolly-v2-3b --quantization q3f16_0
2. Build mlc_chat_cli (works): cd build && cmake .. && make
3. Run mlc_chat_cli (fails here): ./build/mlc_chat_cli --device-name cpu --local-id dolly-v2-3b-q3f16_0
Expected behavior
(mlc-llm) root@Precision-3660:~/mlc-llm# ./build/mlc_chat_cli --device-name cpu --local-id dolly-v2-3b-q3f16_0
Use MLC config: "~/mlc-llm/dist/dolly-v2-3b-q3f16_0/params/mlc-chat-config.json"
Use model weights: "~/mlc-llm/dist/dolly-v2-3b-q3f16_0/params/ndarray-cache.json"
Use model library: "~/mlc-llm/dist/dolly-v2-3b-q3f16_0/dolly-v2-3b-q3f16_0-cpu.so"
You can use the following special commands:
/help print the special commands
/exit quit the cli
/stats print out the latest stats (token/sec)
/reset restart a fresh chat
/reload [local_id] reload model local_id
from disk, or reload the current model if local_id
is not specified
Loading model...
[20:31:27] ~/mlc-llm/cpp/conv_templates.cc:156: Unknown conversation template: dolly
Stack trace:
[bt] (0) ~/mlc-llm/build/tvm/libtvm_runtime.so(tvm::runtime::Backtraceabi:cxx11+0x2c) [0x7f1448666bdc]
[bt] (1) ./build/mlc_chat_cli(tvm::runtime::detail::LogFatal::Entry::Finalize()+0x3d) [0x55f18bf3c48d]
[bt] (2) ~/mlc-llm/build/libmlc_llm.so(mlc::llm::Conversation::FromTemplate(std::__cxx11::basic_string<char, std::char_traits
Environment
- Platform: Intel
- Operating system: Ubuntu
- Device: PC + RTX 3090
- How you installed MLC-LLM: source
- How you installed TVM-Unity: pip
- Python version (e.g. 3.11):
- GPU driver version (if applicable):
- CUDA/cuDNN version (if applicable):
- TVM Unity Hash Tag (python -c "import tvm; print('\n'.join(f'{k}: {v}' for k, v in tvm.support.libinfo().items()))", applicable if you compile models):
- Any other relevant information:
Additional context
I have resolved it by following the advice in the issue below, thanks!
1. cat dist/dolly-v2-3b-q3f16_0/params/mlc-chat-config.json
2. Change "dolly" to "vicuna_v1.1" (see the sketch after the link below)
https://github.com/mlc-ai/mlc-llm/issues/257
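For context, the edit looks roughly like this. This is only a sketch: the exact field name and layout of mlc-chat-config.json may differ in your version, and as the reply below explains, swapping the template name is not the proper fix.

cat dist/dolly-v2-3b-q3f16_0/params/mlc-chat-config.json
# ... "conv_template": "dolly" ...   (field name assumed; check your file)
# Changing "dolly" to "vicuna_v1.1" makes the CLI start, but it applies
# vicuna's prompt format to a dolly model; see the reply below for the proper fix.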
Hi @felixslu, this is not the correct way to fix it, because dolly uses a different conversation template from vicuna... We merged pull request #341 yesterday, which you can pull in if you build mlc-llm from source; otherwise you can install the latest mlc-chat-nightly from conda, which also incorporates this change.
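Concretely, the two options look roughly like this (a sketch; the conda channel and package names follow the install docs from that time and may have changed since):

# Option 1: rebuild from source with the dolly template fix from #341
cd ~/mlc-llm && git pull
cd build && cmake .. && make

# Option 2: install the latest prebuilt CLI from conda
conda install -c mlc-ai -c conda-forge mlc-chat-nightly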
How is the dolly-v2-3b-q3f16_0-cpu.so file generated, please?
You can follow this tutorial: https://mlc.ai/mlc-llm/docs/tutorials/compile-models.html
I followed this tutorial (https://mlc.ai/mlc-llm/docs/tutorials/compile-models.html), but it did not generate the .so file.
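For reference, the log at the top of this issue shows where the model library is expected to land after the build step; a quick check (paths taken from that log, adjust the local-id and quantization to your build):

ls dist/dolly-v2-3b-q3f16_0/
# expected, among other files:
#   params/mlc-chat-config.json
#   params/ndarray-cache.json
#   dolly-v2-3b-q3f16_0-cpu.so   (the model library mlc_chat_cli loads)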