ChatGLM2-6B
[BUG/Help] <Logic-comprehension problems when running with the default auto-downloaded model>
Is there an existing issue for this?
- [X] I have searched the existing issues
Current Behavior
Running web_demo2 with streamlit. Because GPU memory is insufficient, the model is loaded with quantize(4); the loading code was changed to model = AutoModel.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True).quantize(4).cuda()
Expected Behavior
No response
Steps To Reproduce
Default project configuration with the auto-downloaded model, except that the model-loading line was changed to model = AutoModel.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True).quantize(4).cuda()
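For context, the change above can be sketched as a small loading helper. This is a minimal sketch, assuming the Hugging Face transformers AutoModel/AutoTokenizer API, the custom quantize() method shipped with the ChatGLM2-6B remote code, and a CUDA-capable GPU; the helper function name is hypothetical and not part of the repo.

```python
def load_quantized_chatglm2(model_path: str = "THUDM/chatglm2-6b"):
    """Load ChatGLM2-6B with INT4 weight quantization (sketch, not the repo's code)."""
    # Imported inside the function so this sketch can be read/imported
    # without transformers installed.
    from transformers import AutoModel, AutoTokenizer

    # trust_remote_code=True is required because ChatGLM2 ships its own
    # modeling code (including the quantize() method) in the model repo.
    tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

    # quantize(4) converts the weights to INT4 *before* moving them to the GPU,
    # reducing memory use at the cost of some output quality.
    model = AutoModel.from_pretrained(model_path, trust_remote_code=True).quantize(4).cuda()
    model = model.eval()
    return tokenizer, model
```

Note that INT4 quantization is lossy, so some degradation in the model's reasoning compared with the FP16 weights is expected; whether it explains the logic problems seen here would need a comparison against the unquantized model.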
Environment
- OS:
- Python:
- Transformers:
- PyTorch:
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`) :
Anything else?
No response
+1. Which version did you use in your case? Following the README steps, the output of the glm2 model is very poor; perhaps it's not stable.