
[Bug]: AttributeError: 'LlamaTokenizerWrapper' object has no attribute 'backend_tokenizer'

Open ricar0 opened this issue 1 year ago • 0 comments

Is there an existing issue?

  • [X] I have searched, and there is no existing issue.

Describe the bug

When I run

mlc_chat gen_config --model-type ${MODEL_TYPE} ./dist/models/${MODEL_NAME}-hf/ --quantization $QUANTIZATION --conv-template LM --sliding-window-size 768 -o dist/${MODEL_NAME}/

the following error appears:

[2024-02-29 13:43:47] ERROR gen_config.py:153: Failed with the exception below. Skipping
Traceback (most recent call last):
  File "/data1/wangmy/mlc-MiniCPM/python/mlc_chat/interface/gen_config.py", line 149, in gen_config
    fast_tokenizer.backend_tokenizer.save(str(tokenizer_json_save_dest))
AttributeError: 'LlamaTokenizerWrapper' object has no attribute 'backend_tokenizer'

I have tried many versions of transformers, but none of them resolved the issue.
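For context, the traceback shows `gen_config.py` calling `fast_tokenizer.backend_tokenizer.save(...)`, which only works for fast (Rust-backed) tokenizers; `LlamaTokenizerWrapper` appears to be a slow tokenizer and has no `backend_tokenizer` attribute. A minimal sketch of the failure mode and a possible guard, using hypothetical stub classes (`SlowTokenizerStub`, `FastTokenizerStub`, `has_saveable_backend` are stand-ins, not the real transformers or mlc_chat types):

```python
class SlowTokenizerStub:
    """Mimics LlamaTokenizerWrapper: a slow tokenizer with no backend_tokenizer."""
    is_fast = False

class FastTokenizerStub:
    """Mimics a fast tokenizer, which exposes a saveable backend_tokenizer."""
    is_fast = True
    backend_tokenizer = object()

def has_saveable_backend(tokenizer) -> bool:
    # gen_config.py accesses tokenizer.backend_tokenizer directly; checking
    # for the attribute first would avoid the AttributeError on slow tokenizers.
    return getattr(tokenizer, "backend_tokenizer", None) is not None

print(has_saveable_backend(SlowTokenizerStub()))   # → False
print(has_saveable_backend(FastTokenizerStub()))   # → True
```

If this diagnosis is right, switching transformers versions would not help; the fix would be either shipping a `tokenizer.json` with the model so the fast-tokenizer path is unnecessary, or guarding the save call as above.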

To Reproduce

Setup

mkdir -p build && cd build
# generate build configuration
python3 ../cmake/gen_cmake_config.py && cd ..
# build `mlc_chat_cli`
cd build && cmake .. && cmake --build . --parallel $(nproc) && cd ..
# install
cd python && pip install -e . && cd ..

Compile the model

MODEL_NAME=MiniCPM-V
QUANTIZATION=q4f16_1
MODEL_TYPE=minicpm_v
mlc_chat convert_weight --model-type ${MODEL_TYPE} ./dist/models/${MODEL_NAME}-hf/ --quantization $QUANTIZATION -o dist/$MODEL_NAME/
mlc_chat gen_config --model-type ${MODEL_TYPE} ./dist/models/${MODEL_NAME}-hf/ --quantization $QUANTIZATION --conv-template LM --sliding-window-size 768 -o dist/${MODEL_NAME}/

Expected behavior

No response

Screenshots

No response

Environment

- OS: [e.g. Ubuntu 20.04]
- Pytorch: [e.g. torch 2.0.0]
- CUDA: [e.g. CUDA 11.8]
- Device: [e.g. A10, RTX3090]

Additional context

No response

ricar0 · Feb 29 '24