export_llama_to_onnx
When converting the Qwen model, I get an atten_mask:5 error.
modeling_qwen.py:1297: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
self._seq_len_cached = max(2 * seqlen, 16)
C:\Users\ML.cache\huggingface\modules\transformers_modules\Qwen\Qwen-7B-Chat\218aa3240fd5a5d1e80bb6c47d5d774361913706\modeling_qwen.py:482: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if key_size > self.seq_length and self.use_logn_attn and not self.training:
C:\Users\ML.cache\huggingface\modules\transformers_modules\Qwen\Qwen-7B-Chat\218aa3240fd5a5d1e80bb6c47d5d774361913706\modeling_qwen.py:502: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if query.size(1) == key_size:
In-place op on output of tensor.shape. See https://pytorch.org/docs/master/onnx.html#avoid-inplace-operations-when-using-tensor-shape-in-tracing-mode
Traceback (most recent call last):
File "c:\work\AI\directml\export_llama_to_onnx\export_qwen_naive.py", line 173, in
I'm running into the same conversion error.
Solved. Change
attention_mask = torch.ones([batch, sumN], dtype=torch.int64).to(args.device)
to
attention_mask = torch.ones([batch, sumN], dtype=torch.bool).to(args.device)
and run the script with the following command:
python3 export_qwen_naive.py -m qwen -o qwen.onnx -d cpu -p float32
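A minimal sketch of the dtype change above, outside the export script. The variable names `batch` and `sumN` are taken from the snippet in this thread, but the concrete values here are placeholders; in export_qwen_naive.py they come from the script's arguments. The only change is building the attention mask as bool instead of int64:

```python
import torch

# Placeholder shapes; the real batch size and total sequence length (sumN)
# come from export_qwen_naive.py's command-line arguments.
batch, sumN = 1, 8

# Original mask from the script (the version that triggered the export error):
attention_mask_int64 = torch.ones([batch, sumN], dtype=torch.int64)

# The fix reported above: construct the mask with a bool dtype so the traced
# model sees a boolean mask rather than integer data.
attention_mask_bool = torch.ones([batch, sumN], dtype=torch.bool)

print(attention_mask_int64.dtype)  # torch.int64
print(attention_mask_bool.dtype)   # torch.bool
```

The `.to(args.device)` call from the original snippet is omitted here since no argument parser is set up; it only moves the tensor to the target device and does not affect the dtype.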