[ONNX] Support exporting to the ONNX format
Is your feature request related to a problem? Please describe.
There is currently no ONNX export solution for deployment.
Solutions
This repo is developed in PyTorch, while some inference infrastructure prefers the ONNX format.
- Provide an ONNX export method (a rough sketch follows this list).
- Provide a demo for ONNX inference.
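A rough sketch of what the export method could look like, assuming the checkpoint loads through the Hugging Face `transformers` API with `trust_remote_code`; the checkpoint name, example input, and opset below are illustrative, and the repo's custom modeling code (rotary embeddings, KV cache) may still need export-specific workarounds:

```python
# Sketch only: exports the plain forward pass (no past_key_values),
# in fp32 to avoid fp16 numeric surprises during tracing.
import torch
from transformers import AutoModel, AutoTokenizer

name = "THUDM/chatglm2-6b"  # illustrative checkpoint name
tokenizer = AutoTokenizer.from_pretrained(name, trust_remote_code=True)
model = AutoModel.from_pretrained(name, trust_remote_code=True).float().eval()
model.config.return_dict = False  # tuple outputs trace more cleanly

input_ids = tokenizer("hello", return_tensors="pt")["input_ids"]

torch.onnx.export(
    model,
    (input_ids,),
    "chatglm2-6b.onnx",
    input_names=["input_ids"],
    output_names=["logits"],
    dynamic_axes={
        "input_ids": {0: "batch", 1: "seq"},
        "logits": {0: "batch", 1: "seq"},
    },
    opset_version=17,
)
```

Autoregressive generation would additionally need the KV cache exposed as explicit past/present inputs and outputs, which is the harder part of a real solution.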
Additional context
No response
Found a useful resource: https://tpumlir.org/en/2023/07/10/chatglm2-6b-jie-xi-yu-tpu-bu-shu.html
Good luck
When exporting a model from PyTorch to ONNX with float16 precision, there is a significant difference in the output of the following operator:

```python
# Attention heads [sq, b, h] --> [sq, b, (np * 3 * hn)]
mixed_x_layer = self.query_key_value(hidden_states)
```

torch version: 2.1.0a0+b5021ba
onnx version: 1.14.0
onnxruntime-gpu version: 1.15.1
opset: 17
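For what it's worth, the divergence can be quantified in isolation by exporting a stand-alone projection of the same shape and comparing PyTorch against onnxruntime-gpu directly; the layer sizes below are stand-ins for `query_key_value`, not values taken from the repo:

```python
# Sketch: measure fp16 PyTorch-vs-ONNX Runtime divergence on a lone
# linear layer shaped like query_key_value ([sq, b, h] -> [sq, b, 3h]).
import numpy as np
import torch
import onnxruntime as ort

torch.manual_seed(0)
proj = torch.nn.Linear(4096, 3 * 4096).half().cuda().eval()
x = torch.randn(8, 1, 4096, dtype=torch.float16, device="cuda")

with torch.no_grad():
    ref = proj(x).float().cpu().numpy()

torch.onnx.export(proj, (x,), "qkv.onnx",
                  input_names=["x"], output_names=["y"], opset_version=17)

sess = ort.InferenceSession("qkv.onnx", providers=["CUDAExecutionProvider"])
out = sess.run(None, {"x": x.cpu().numpy()})[0].astype(np.float32)

# A gap well above fp16 resolution (~1e-3 relative) suggests the two
# runtimes accumulate the matmul in different orders/precisions,
# rather than the graph itself being broken.
print("max abs diff:", np.abs(ref - out).max())
```

If the gap is in that range, exporting in fp32 and converting the weights afterwards (or keeping the sensitive matmuls in fp32) is the usual mitigation.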
Any updates on this issue? I still can't export ChatGLM2 to an ONNX model.