Running webui.py on CPU fails with: Tensor on device cpu is not on the expected device meta!
Running `python webui.py` on CPU does start, but the log ends with `RuntimeError: Tensor on device cpu is not on the expected device meta!`. The web page is reachable, but it reports that the model failed to load.
```
loading model config
llm device: cpu
embedding device: cpu
dir: /home/langchain-ChatGLM
flagging username: 8596d15f59a642ea85b7c0f7239dedd5
Loading THUDM/chatglm-6b-int4...
Warning: self.llm_device is False. This means that no use GPU bring to be load CPU mode
No compiled kernel found. ...
- This IS expected if you are initializing ChatGLMForConditionalGeneration from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing ChatGLMForConditionalGeneration from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Loaded the model in 7.01 seconds.
WARNING 2023-06-14 19:33:32,482-1d: No sentence-transformers model found with name /home/langchain-ChatGLM/text2vec-large-chinese. Creating a new one with MEAN pooling.
WARNING 2023-06-14 19:33:36,233-1d: The dtype of attention mask (torch.int64) is not bool
Tensor on device cpu is not on the expected device meta!
Running on local URL: http://0.0.0.0:7860
To create a public link, set share=True in launch().
```
The web page shows: "The model failed to load. Please go to the 'Model Configuration' tab in the top-left corner, re-select the model, and click the 'Load Model' button." Every time I send a message, the UI shows Error and the backend logs the same `RuntimeError: Tensor on device cpu is not on the expected device meta!`. Which part of the configuration is wrong here?
The int4-quantized version of the model does not support running on CPU.
Then how should I fix this? How do I deploy it to run on a GPU?
If you haven't modified the source, the project automatically detects the available device; since everything is being loaded onto CPU, you presumably have no GPU.
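The auto-detection described above amounts to checking whether CUDA is available and falling back to CPU otherwise. A minimal sketch (`resolve_device` is a hypothetical helper for illustration, not the project's actual function):

```python
import torch

def resolve_device(preferred: str = "cuda") -> str:
    """Pick the device to load the model on, falling back to CPU."""
    if preferred.startswith("cuda") and torch.cuda.is_available():
        return preferred
    return "cpu"

# On a machine without a GPU this always yields "cpu",
# which matches the "llm device: cpu" line in the log above.
```

This is why, with unmodified source, seeing `llm device: cpu` at startup strongly suggests no usable GPU was found.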
That's not right. I just tested it: chatglm-6b-int4 can be loaded on CPU; the output is just very slow and lower quality, but it doesn't error out directly. Did you download the model weights correctly?
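For reference, running the int4 model on CPU requires converting it to float32 (a pattern documented in the upstream ChatGLM README; the repo id below is the one from the log, everything else is an illustrative assumption):

```python
# Sketch of loading chatglm-6b-int4 on CPU -- an assumption based on the
# upstream ChatGLM README's documented pattern, not this project's own code.

def dtype_for(device: str) -> str:
    # Quantized kernels need float32 on CPU; half precision is the GPU default.
    return "float32" if device == "cpu" else "float16"

# Loading itself (not executed here; requires downloaded weights):
# from transformers import AutoModel, AutoTokenizer
# tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b-int4",
#                                           trust_remote_code=True)
# model = AutoModel.from_pretrained("THUDM/chatglm-6b-int4",
#                                   trust_remote_code=True).float()  # CPU: float32
# model = model.eval()
```

If the weights are incomplete or corrupted, loading can fail with device-placement errors like the one above, so verifying the download is a reasonable first step.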
Since this issue has been inactive for a long time, the dev team is closing it. Please retry on the latest code, and feel free to reopen if the problem persists.