ChatGLM-6B [BUG/Help] M1 Pro 16G: prompts hang indefinitely

Is there an existing issue for this?

[X] I have searched the existing issues

Current Behavior

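Hello. I'm on an M1 Pro with 16 GB of memory. When I run `python cli_demo.py`, memory usage reaches 14 GB, swap does not keep climbing, and GPU utilization peaks at 30%. Asking a single "你好" (hello) takes roughly seven or eight minutes to get an answer. I followed https://github.com/THUDM/ChatGLM-6B/issues/462 and reinstalled the environment, but the result is the same.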

Expected Behavior

No response

Steps To Reproduce

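Ask a single "你好" (hello); the response takes more than seven or eight minutes. I'd appreciate it if someone could take a look.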

Environment

- OS: macOS 13
- Python: 3.9
- Transformers: 4.26.0
- PyTorch: 1.12.1
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`): false

Anything else?

No response

Apr 26 '23 02:04 TreasureJade

It doesn't seem to work; I've been waiting as well.

Apr 26 '23 07:04 zhaopengme

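The M1 is running pure CPU inference. A few days ago I tried this on a test machine at the office: with pure CPU inference, asking a single "你好" took more than half an hour, and about 20 GB of disk was used as a compute cache. If you really want to deploy this for personal use, I suggest an NVIDIA card with more VRAM. The official recommendation is a 1060 or better; I recommend 8 GB of VRAM or more.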

Apr 26 '23 09:04 ZXiangQAQ

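> The M1 is running pure CPU inference. A few days ago I tried this on a test machine at the office: with pure CPU inference, asking a single "你好" took more than half an hour, and about 20 GB of disk was used as a compute cache. If you really want to deploy this for personal use, I suggest an NVIDIA card with more VRAM. The official recommendation is a 1060 or better; I recommend 8 GB of VRAM or more.

Hmm, I enabled MPS GPU acceleration, but it doesn't seem to have taken effect.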

Apr 28 '23 06:04 TreasureJade
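One way to check the MPS remark above is to ask PyTorch which backend it can actually see before loading the model. The following is a minimal sketch, not part of the ChatGLM-6B repository; the helper names `choose_device` and `detect_device` are hypothetical, and it assumes PyTorch 1.12 or later, where `torch.backends.mps.is_available()` exists.

```python
def choose_device(cuda_ok: bool, mps_ok: bool) -> str:
    """Pick a backend name from what PyTorch reports (hypothetical helper)."""
    if cuda_ok:
        return "cuda"
    if mps_ok:
        return "mps"
    return "cpu"


def detect_device() -> str:
    """Query PyTorch for available backends; report "cpu" if torch is missing."""
    try:
        import torch
    except ImportError:
        return "cpu"
    # torch.backends.mps is absent in PyTorch builds older than 1.12.
    mps_backend = getattr(torch.backends, "mps", None)
    mps_ok = bool(mps_backend is not None and mps_backend.is_available())
    return choose_device(torch.cuda.is_available(), mps_ok)


print(detect_device())
```

If this prints `cpu` on an Apple Silicon machine, the installed PyTorch build probably cannot use MPS at all. If it prints `mps`, inspecting `next(model.parameters()).device` after loading shows whether the weights were actually moved there.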

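After I start it, the page opens, but as soon as I enter "你好" it crashes outright. The log prints `Failed to infer result type(s)` and Python quits unexpectedly. What is going on? Mac M1 Pro, 16 GB memory.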

May 08 '23 07:05 Liu-Shihao

same here

May 09 '23 09:05 zhaozhiming

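> After I start it, the page opens, but as soon as I enter "你好" it crashes outright. The log prints `Failed to infer result type(s)` and Python quits unexpectedly. What is going on? Mac M1 Pro, 16 GB memory.

Same question. I have the same problem and don't know the cause.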

May 15 '23 13:05 AdJIa

In these three files, change `model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).half().cuda()` to `model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).float()`.

Jun 14 '23 07:06 cat-fishing
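The `.float()` change above runs the model on the CPU in 32-bit precision, which doubles the weight footprint compared with `.half()`. The back-of-the-envelope sketch below estimates the memory needed for the weights alone; it assumes about 6.2 billion parameters for ChatGLM-6B and ignores activations, the KV cache, and framework overhead, so real usage will be higher.

```python
# Assumed parameter count for ChatGLM-6B (about 6.2 billion).
PARAMS = 6.2e9

# Bytes needed to store one parameter in each precision.
BYTES_PER_PARAM = {"float32": 4, "float16": 2}


def weight_gib(n_params: float, dtype: str) -> float:
    """Return the size of the weights in GiB for the given dtype."""
    return n_params * BYTES_PER_PARAM[dtype] / 2**30


for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: {weight_gib(PARAMS, dtype):.1f} GiB")
```

Under these assumptions the float32 weights come to roughly 23 GiB, more than the 16 GB of unified memory on the machine in this report, so heavy swapping and very slow responses would be expected. The float16 weights are roughly half that.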

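> After I start it, the page opens, but as soon as I enter "你好" it crashes outright. The log prints `Failed to infer result type(s)` and Python quits unexpectedly. What is going on? Mac M1 Pro, 16 GB memory.

Exactly the same here.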

Jun 16 '23 10:06 CaptainKenPan
