Exporting BERT to ONNX fails

Open nonstopfor opened this issue 3 years ago • 2 comments

System Info

  • transformers version: 4.17.0
  • Platform: Linux-4.15.0-167-generic-x86_64-with-debian-buster-sid
  • Python version: 3.7.6
  • Onnx version: 1.12.0
  • PyTorch version (GPU?): 1.10.1+cu111 (True)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using GPU in script?: <True>
  • Using distributed or parallel set-up in script?: <No>

Who can help?

No response

Information

  • [ ] The official example scripts
  • [X] My own modified scripts

Tasks

  • [ ] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • [X] My own task or dataset (give details below)

Reproduction

Code:

import torch
from transformers import AutoModel

device = torch.device('cuda')
model = AutoModel.from_pretrained('bert-base-chinese')

model.to(device)
model.eval()
batch_size = 32
size = (batch_size, 256)
export_onnx_file = 'save/bert.onnx'

input_ids = torch.zeros(size=size, device=device, dtype=torch.long)
attention_mask = torch.ones(size=size, device=device, dtype=torch.float)

token_type_ids = torch.zeros(size=size, device=device, dtype=torch.long)

inputs = (input_ids, attention_mask, token_type_ids)

torch.onnx.export(model=model, args=inputs, f=export_onnx_file, verbose=False, opset_version=12,
                  do_constant_folding=True,
                  output_names = ['last_hidden_state', 'pooler_output'],
                  input_names=["input_ids", "attention_mask", "token_type_ids"])

Expected behavior

Error info:

/home/zhangzhexin/anaconda3/lib/python3.7/site-packages/torch/onnx/symbolic_helper.py:325: UserWarning: Type cannot be inferred, which might cause exported graph to produce incorrect results.
  warnings.warn("Type cannot be inferred, which might cause exported graph to produce incorrect results.")
[W shape_type_inference.cpp:434] Warning: Constant folding in symbolic shape inference fails: Index is supposed to be an empty tensor or a vector
Exception raised from index_select_out_cuda_impl at /pytorch/aten/src/ATen/native/cuda/Indexing.cu:742 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x42 (0x7ff9245c7d62 in /home/zhangzhexin/anaconda3/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, char const*) + 0x5f (0x7ff9245c475f in /home/zhangzhexin/anaconda3/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #2: void at::native::(anonymous namespace)::index_select_out_cuda_impl<float>(at::Tensor&, at::Tensor const&, long, at::Tensor const&) + 0x190d (0x7ff7a4e601bd in /home/zhangzhexin/anaconda3/lib/python3.7/site-packages/torch/lib/libtorch_cuda_cu.so)
frame #3: at::native::index_select_out_cuda(at::Tensor const&, long, at::Tensor const&, at::Tensor&) + 0x3d3 (0x7ff7a4dce0e3 in /home/zhangzhexin/anaconda3/lib/python3.7/site-packages/torch/lib/libtorch_cuda_cu.so)
frame #4: at::native::index_select_cuda(at::Tensor const&, long, at::Tensor const&) + 0xd0 (0x7ff7a4dce610 in /home/zhangzhexin/anaconda3/lib/python3.7/site-packages/torch/lib/libtorch_cuda_cu.so)
frame #5: <unknown function> + 0x25756d6 (0x7ff7a5d296d6 in /home/zhangzhexin/anaconda3/lib/python3.7/site-packages/torch/lib/libtorch_cuda_cu.so)
frame #6: <unknown function> + 0x2575722 (0x7ff7a5d29722 in /home/zhangzhexin/anaconda3/lib/python3.7/site-packages/torch/lib/libtorch_cuda_cu.so)
frame #7: at::_ops::index_select::redispatch(c10::DispatchKeySet, at::Tensor const&, long, at::Tensor const&) + 0xb9 (0x7ff7f5617649 in /home/zhangzhexin/anaconda3/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #8: <unknown function> + 0x3253be3 (0x7ff7f6f95be3 in /home/zhangzhexin/anaconda3/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #9: <unknown function> + 0x3254215 (0x7ff7f6f96215 in /home/zhangzhexin/anaconda3/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #10: at::_ops::index_select::call(at::Tensor const&, long, at::Tensor const&) + 0x166 (0x7ff7f5697296 in /home/zhangzhexin/anaconda3/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #11: torch::jit::onnx_constant_fold::runTorchBackendForOnnx(torch::jit::Node const*, std::vector<at::Tensor, std::allocator<at::Tensor> >&, int) + 0x1b5f (0x7ff8d8cf023f in /home/zhangzhexin/anaconda3/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #12: <unknown function> + 0xbcea6a (0x7ff8d8d37a6a in /home/zhangzhexin/anaconda3/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #13: torch::jit::ONNXShapeTypeInference(torch::jit::Node*, std::map<std::string, c10::IValue, std::less<std::string>, std::allocator<std::pair<std::string const, c10::IValue> > > const&, int) + 0xa8e (0x7ff8d8d3d30e in /home/zhangzhexin/anaconda3/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #14: <unknown function> + 0xbd5e12 (0x7ff8d8d3ee12 in /home/zhangzhexin/anaconda3/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #15: <unknown function> + 0xb414c0 (0x7ff8d8caa4c0 in /home/zhangzhexin/anaconda3/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #16: <unknown function> + 0x2a5aa8 (0x7ff8d840eaa8 in /home/zhangzhexin/anaconda3/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
<omitting python frames>
frame #45: __libc_start_main + 0xe7 (0x7ff93fb25c87 in /lib/x86_64-linux-gnu/libc.so.6)
 (function ComputeConstantFolding)
Traceback (most recent call last):
  File "onnx_tensorrt.py", line 425, in <module>
    test_bert()
  File "onnx_tensorrt.py", line 316, in test_bert
    input_names=["input_ids", "attention_mask", "token_type_ids"])
  File "/home/zhangzhexin/anaconda3/lib/python3.7/site-packages/torch/onnx/__init__.py", line 320, in export
    custom_opsets, enable_onnx_checker, use_external_data_format)
  File "/home/zhangzhexin/anaconda3/lib/python3.7/site-packages/torch/onnx/utils.py", line 111, in export
    custom_opsets=custom_opsets, use_external_data_format=use_external_data_format)
  File "/home/zhangzhexin/anaconda3/lib/python3.7/site-packages/torch/onnx/utils.py", line 729, in _export
    dynamic_axes=dynamic_axes)
  File "/home/zhangzhexin/anaconda3/lib/python3.7/site-packages/torch/onnx/utils.py", line 545, in _model_to_graph
    _export_onnx_opset_version)
RuntimeError: Index is supposed to be an empty tensor or a vector

However, if I set dynamic_axes, the export succeeds:

torch.onnx.export(model=model, args=inputs, f=export_onnx_file, verbose=False, opset_version=12,
                      do_constant_folding=True,
                      output_names = ['last_hidden_state', 'pooler_output'],
                      input_names=["input_ids", "attention_mask", "token_type_ids"],
                      dynamic_axes={"input_ids": {0: "batch_size"},
                                    "attention_mask": {0: "batch_size"},
                                    "token_type_ids": {0: "batch_size"},
                      })

I need to convert the ONNX model to TensorRT afterwards, and my TensorRT version only supports fixed input shapes, so I don't want to set dynamic_axes. How can I fix this problem without setting dynamic_axes?

nonstopfor avatar Jul 09 '22 10:07 nonstopfor

@nonstopfor, you can export with dynamic_axes and then change the dynamic dimensions to fixed values with the ONNX Python API, like the following:

import onnx

# Load the model that was exported with dynamic_axes.
model = onnx.load("input.onnx")

# Replace every symbolic (named) dimension on the graph inputs with a fixed value.
for tensor in model.graph.input:
    for dim_proto in tensor.type.tensor_type.shape.dim:
        if dim_proto.HasField("dim_param"):  # and dim_proto.dim_param == 'batch_size':
            dim_proto.Clear()
            dim_proto.dim_value = 32  # fixed batch size

# Do the same for the graph outputs.
for tensor in model.graph.output:
    for dim_proto in tensor.type.tensor_type.shape.dim:
        if dim_proto.HasField("dim_param"):
            dim_proto.Clear()
            dim_proto.dim_value = 32  # fixed batch size

onnx.save(model, "output.onnx")
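The loop above sets every symbolic dimension to 32, which is only right when the batch axis is the sole dynamic one. A minimal sketch of a name-based variant follows, in case more than one axis is symbolic. The helper name `pin_symbolic_dims` and its `fixed_sizes` argument are hypothetical, not part of the ONNX API; it only uses the same protobuf fields as the snippet above (`dim_param`, `dim_value`, `HasField`, `Clear`).

```python
def pin_symbolic_dims(model, fixed_sizes):
    """Replace named (symbolic) dims on graph inputs/outputs with fixed sizes.

    model: an onnx.ModelProto, e.g. the result of onnx.load(...).
    fixed_sizes: dict mapping a dim_param name to an int,
        e.g. {"batch_size": 32}. (Hypothetical argument for this sketch.)

    Returns a list of (tensor_name, axis, dim_param) for every symbolic dim
    whose name is not in fixed_sizes; those dims are left untouched.
    """
    leftover = []
    for tensor in list(model.graph.input) + list(model.graph.output):
        dims = tensor.type.tensor_type.shape.dim
        for axis, dim_proto in enumerate(dims):
            if not dim_proto.HasField("dim_param"):
                continue  # no symbolic name on this dim, nothing to pin
            name = dim_proto.dim_param
            if name in fixed_sizes:
                dim_proto.Clear()
                dim_proto.dim_value = fixed_sizes[name]
            else:
                # Report instead of guessing a size for an unknown name.
                leftover.append((tensor.name, axis, name))
    return leftover
```

One way to use it would be `leftover = pin_symbolic_dims(onnx.load("input.onnx"), {"batch_size": 32})` followed by `onnx.save`. If `leftover` is not empty, the exporter wrote symbolic names that were not listed (the names on the outputs may differ from those on the inputs), and those dims still need a fixed value before a fixed-shape TensorRT conversion.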

tianleiwu avatar Jul 10 '22 04:07 tianleiwu

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot] avatar Aug 08 '22 15:08 github-actions[bot]

I am facing the same issue when converting a custom implementation of DETR (a transformer model). @nonstopfor, were you able to fix this?

hrsht-neur avatar Nov 02 '22 15:11 hrsht-neur