WebGLM fails when running on CPU only: "LayerNormKernelImpl" not implemented for 'Half'

WebGLM Initializing...
WebGLM Loaded
[Enter to Exit] >>> hello
[System] Searching ...
[System] Count of available urls:  10
[System] Fetching ...
[System] Count of available fetch results:  3252593
[System] Extracting ...
[System] Count of paragraphs:  212
[System] Filtering ...
Reference [1](https://dictionary.cambridge.org/dictionary/english/hello): Hello is also used to attract someone’s attention:
Reference [2](https://dictionary.cambridge.org/dictionary/english/hello): Hello is also said at the beginning of a telephone conversation.
Reference [3](https://www.merriam-webster.com/dictionary/hello): They welcomed us with a warm hello.  we said our hellos and got right down to business
Reference [4](https://www.bing.com/dict/search?q=Hello%EF%BC%81&mkt=zh-cn): Hello- nice to meet you. Take a lott- I'll be down in a minute.
Reference [5](https://dictionary.cambridge.org/dictionary/english/hello): (Definition of hello from the Cambridge Academic Content Dictionary © Cambridge University Press)
Traceback (most recent call last):
  File "/home/me/WebGLM/cli_demo.py", line 21, in <module>
    for results in webglm.stream_query(question):
  File "/home/me/WebGLM/model/modeling_webglm.py", line 49, in stream_query
    outputs = self.model.generate(**inputs, max_length=1024, eos_token_id = self.tokenizer.eop_token_id, pad_token_id=self.tokenizer.eop_token_id)
  File "/home/me/p/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/me/p/lib/python3.10/site-packages/transformers/generation/utils.py", line 1522, in generate
    return self.greedy_search(
  File "/home/me/p/lib/python3.10/site-packages/transformers/generation/utils.py", line 2339, in greedy_search
    outputs = self(
  File "/home/me/p/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1522, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/me/p/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1531, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/me/hug/modules/transformers_modules/THUDM/WebGLM-2B/cffa6bde032c129824aca963836ba7a03c422990/modeling_glm.py", line 902, in forward
    model_output = self.glm(input_ids, position_ids, attention_mask, mems=mems, **kwargs)
  File "/home/me/p/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1522, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/me/p/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1531, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/me/hug/modules/transformers_modules/THUDM/WebGLM-2B/cffa6bde032c129824aca963836ba7a03c422990/modeling_glm.py", line 783, in forward
    transformer_output = self.transformer(embeddings, position_ids, attention_mask, mems)
  File "/home/me/p/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1522, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/me/p/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1531, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/me/hug/modules/transformers_modules/THUDM/WebGLM-2B/cffa6bde032c129824aca963836ba7a03c422990/modeling_glm.py", line 595, in forward
    hidden_states = layer(*args, mem=mem_i)
  File "/home/me/p/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1522, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/me/p/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1531, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/me/hug/modules/transformers_modules/THUDM/WebGLM-2B/cffa6bde032c129824aca963836ba7a03c422990/modeling_glm.py", line 417, in forward
    layernorm_output = self.input_layernorm(hidden_states)
  File "/home/me/p/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1522, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/me/p/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1531, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/me/p/lib/python3.10/site-packages/torch/nn/modules/normalization.py", line 190, in forward
    return F.layer_norm(
  File "/home/me/p/lib/python3.10/site-packages/torch/nn/functional.py", line 2548, in layer_norm
    return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)
RuntimeError: "LayerNormKernelImpl" not implemented for 'Half'

Jul 17 '23 01:07 ZisIsNotZis

Can this run on CPU only?

Jul 17 '23 04:07 fwerkor

I ran it with python cli_app.py ... -d cpu

Jul 17 '23 05:07 ZisIsNotZis

@ZisIsNotZis Hello! The error on CPU appears to come from PyTorch: CPU inference currently supports only fp32. See "LayerNormKernelImpl" not implemented for 'Half' - CPU (https://github.com/pytorch/pytorch/issues/52291)

Jul 24 '23 08:07 hanyullai

Commenting out the line self.model = self.model.half() fixes it.
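
Instead of commenting the line out, the cast could be made conditional on the device, so that GPU runs keep fp16 and CPU runs stay in fp32. Below is a minimal sketch of that idea, not WebGLM's actual code: `prepare_model` is a hypothetical helper, and it assumes the model is a `torch.nn.Module` (it relies only on that class's `half()`, `float()`, and `to()` methods) and that the device is passed as a string such as `"cpu"` or `"cuda:0"`.

```python
def prepare_model(model, device: str):
    """Cast a torch.nn.Module to a dtype suited to the target device.

    Hypothetical helper for illustration; it is not part of WebGLM.
    """
    if device.startswith("cuda"):
        # fp16 halves memory use and is supported by the GPU kernels.
        model = model.half()
    else:
        # On CPU, keep (or restore) fp32, as advised earlier in this thread.
        model = model.float()
    # Move the weights to the requested device after the cast.
    return model.to(device)
```

In modeling_webglm.py this would take the place of the unconditional `self.model = self.model.half()`, for example `self.model = prepare_model(self.model, device)`, where `device` stands for whatever value the `-d` option supplies (the variable name is an assumption).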

Jan 17 '24 10:01 luofan18
