WebGLM
Problem running on CPU only: "LayerNormKernelImpl" not implemented for 'Half'
WebGLM Initializing...
WebGLM Loaded
[Enter to Exit] >>> hello
[System] Searching ...
[System] Count of available urls: 10
[System] Fetching ...
[System] Count of available fetch results: 3252593
[System] Extracting ...
[System] Count of paragraphs: 212
[System] Filtering ...
Reference [1](https://dictionary.cambridge.org/dictionary/english/hello): Hello is also used to attract someone’s attention:
Reference [2](https://dictionary.cambridge.org/dictionary/english/hello): Hello is also said at the beginning of a telephone conversation.
Reference [3](https://www.merriam-webster.com/dictionary/hello): They welcomed us with a warm hello. we said our hellos and got right down to business
Reference [4](https://www.bing.com/dict/search?q=Hello%EF%BC%81&mkt=zh-cn): Hello- nice to meet you. Take a lott- I'll be down in a minute.
Reference [5](https://dictionary.cambridge.org/dictionary/english/hello): (Definition of hello from the Cambridge Academic Content Dictionary © Cambridge University Press)
Traceback (most recent call last):
File "/home/me/WebGLM/cli_demo.py", line 21, in <module>
for results in webglm.stream_query(question):
File "/home/me/WebGLM/model/modeling_webglm.py", line 49, in stream_query
outputs = self.model.generate(**inputs, max_length=1024, eos_token_id = self.tokenizer.eop_token_id, pad_token_id=self.tokenizer.eop_token_id)
File "/home/me/p/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/home/me/p/lib/python3.10/site-packages/transformers/generation/utils.py", line 1522, in generate
return self.greedy_search(
File "/home/me/p/lib/python3.10/site-packages/transformers/generation/utils.py", line 2339, in greedy_search
outputs = self(
File "/home/me/p/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1522, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/me/p/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1531, in _call_impl
return forward_call(*args, **kwargs)
File "/home/me/hug/modules/transformers_modules/THUDM/WebGLM-2B/cffa6bde032c129824aca963836ba7a03c422990/modeling_glm.py", line 902, in forward
model_output = self.glm(input_ids, position_ids, attention_mask, mems=mems, **kwargs)
File "/home/me/p/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1522, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/me/p/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1531, in _call_impl
return forward_call(*args, **kwargs)
File "/home/me/hug/modules/transformers_modules/THUDM/WebGLM-2B/cffa6bde032c129824aca963836ba7a03c422990/modeling_glm.py", line 783, in forward
transformer_output = self.transformer(embeddings, position_ids, attention_mask, mems)
File "/home/me/p/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1522, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/me/p/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1531, in _call_impl
return forward_call(*args, **kwargs)
File "/home/me/hug/modules/transformers_modules/THUDM/WebGLM-2B/cffa6bde032c129824aca963836ba7a03c422990/modeling_glm.py", line 595, in forward
hidden_states = layer(*args, mem=mem_i)
File "/home/me/p/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1522, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/me/p/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1531, in _call_impl
return forward_call(*args, **kwargs)
File "/home/me/hug/modules/transformers_modules/THUDM/WebGLM-2B/cffa6bde032c129824aca963836ba7a03c422990/modeling_glm.py", line 417, in forward
layernorm_output = self.input_layernorm(hidden_states)
File "/home/me/p/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1522, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/me/p/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1531, in _call_impl
return forward_call(*args, **kwargs)
File "/home/me/p/lib/python3.10/site-packages/torch/nn/modules/normalization.py", line 190, in forward
return F.layer_norm(
File "/home/me/p/lib/python3.10/site-packages/torch/nn/functional.py", line 2548, in layer_norm
return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)
RuntimeError: "LayerNormKernelImpl" not implemented for 'Half'
Can this run on CPU only?
I ran it with python cli_app.py ... -d cpu
@ZisIsNotZis Hi! The CPU error appears to be related to PyTorch: CPU inference currently supports only fp32. See "LayerNormKernelImpl" not implemented for 'Half' - CPU (https://github.com/pytorch/pytorch/issues/52291)
self.model = self.model.half()
Commenting out this line fixes it.
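Instead of commenting the line out, which also turns off fp16 on GPU, the cast could be made conditional on the device. This is only a rough sketch: `cast_for_device` is a hypothetical helper that is not part of WebGLM, and it assumes the device string passed with `-d` is available at the point where the model is loaded.

```python
def cast_for_device(model, device):
    """Use fp16 on CUDA devices and keep fp32 everywhere else.

    Hypothetical helper, not part of WebGLM. `model` is expected to
    provide .half() and .float(), as torch.nn.Module does.
    """
    if str(device).startswith("cuda"):
        # GPU kernels support half precision, so keep the memory savings.
        return model.half()
    # CPU inference only supports fp32 here, so skip the half() cast.
    return model.float()
```

With this, the line `self.model = self.model.half()` would become something like `self.model = cast_for_device(self.model, device)`, where `device` is assumed to hold the value passed on the command line.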