Chinese-LLaMA-Alpaca
How to load the quantized model with transformers
After running convert_and_quantize_chinese_alpaca_plus I get the file ggml-model-q8_0.bin.
When I try to load it with transformers, I get an error.
The directory layout is shown below; pytorch_model.bin is ggml-model-q8_0.bin renamed.
How should I change things to make it run?
Running it through the inference_hf script fails the same way, although the sanity check in the convert script produces normal output.
A quantized (ggml) model cannot be loaded with transformers; it can only be run with llama.cpp. If you want to call it from Python, try a binding such as llama-cpp-python.
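As a minimal sketch of that suggestion, the quantized file could be called from Python through llama-cpp-python (`pip install llama-cpp-python`) rather than transformers. The model path and prompt here are assumptions, not from the repo; the import is deferred so the script degrades gracefully when the model file is absent.

```python
import os

# Assumed location of the quantized file produced by
# convert_and_quantize_chinese_alpaca_plus -- adjust to your setup.
MODEL_PATH = "ggml-model-q8_0.bin"

def load_model(path):
    """Return a Llama instance if the quantized file exists, else None."""
    if not os.path.exists(path):
        return None
    # Third-party binding around llama.cpp; not part of transformers.
    from llama_cpp import Llama
    return Llama(model_path=path)

llm = load_model(MODEL_PATH)
if llm is not None:
    # Hypothetical prompt; the call returns an OpenAI-style completion dict.
    result = llm("Please introduce yourself.", max_tokens=128)
    print(result["choices"][0]["text"])
else:
    print("quantized model file not found at", MODEL_PATH)
```

Note that renaming the file to pytorch_model.bin does not help: the ggml format is specific to llama.cpp and is not a PyTorch checkpoint, so transformers cannot parse it regardless of the filename.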