Chinese-LLaMA-Alpaca
How to load the quantized model with transformers
After running convert_and_quantize_chinese_alpaca_plus I get the file ggml-model-q8_0.bin.
When I try to load it with transformers, I get an error.
The directory layout is shown below; pytorch_model.bin is ggml-model-q8_0.bin renamed.
How should I change things to make it run?
Running it through the inference_hf script fails the same way, although the sanity check in the convert script produces normal output.
A quantized (ggml) model cannot be loaded with transformers; it can only be run with llama.cpp. If you want to call it from Python, try a binding such as llama-cpp-python.
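As a minimal sketch of that suggestion, the quantized file could be called from Python through llama-cpp-python (`pip install llama-cpp-python`) rather than transformers. The model path and prompt here are assumptions, not from the repo; the import is deferred so the script degrades gracefully when the model file is absent.

```python
import os

# Assumed location of the quantized file produced by
# convert_and_quantize_chinese_alpaca_plus -- adjust to your setup.
MODEL_PATH = "ggml-model-q8_0.bin"

def load_model(path):
    """Return a Llama instance if the quantized file exists, else None."""
    if not os.path.exists(path):
        return None
    # Third-party binding around llama.cpp; not part of transformers.
    from llama_cpp import Llama
    return Llama(model_path=path)

llm = load_model(MODEL_PATH)
if llm is not None:
    # Hypothetical prompt; the call returns an OpenAI-style completion dict.
    result = llm("Please introduce yourself.", max_tokens=128)
    print(result["choices"][0]["text"])
else:
    print("quantized model file not found at", MODEL_PATH)
```

Note that renaming the file to pytorch_model.bin does not help: the ggml format is specific to llama.cpp and is not a PyTorch checkpoint, so transformers cannot parse it regardless of the filename.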