
Export model to ONNX

Fred-Erik opened this issue 6 years ago • 13 comments

Is it possible to export a model trained in OpenNMT-py to an ONNX protobuf file? I tried following this PyTorch tutorial, and after reading about some issues with ONNX and RNNs, I compiled PyTorch master from source. I now get this error:

ValueError: Auto nesting doesn't know how to process an input object of type onmt.Models.RNNDecoderState. Accepted types: Variables, or lists/tuples of them

I first load a trained model using fields, model, model_opt = onmt.ModelConstructor.load_test_model(opt, dummy_opt.__dict__), then create a dummy input using the get_batch_image function from test_models.py:

import torch
from torch.autograd import Variable  # pre-0.4 style, as in the original

def get_batch_image(tgt_l=3, bsize=1, h=15, w=17):
    test_src = Variable(torch.ones(bsize, 3, h, w)).float()
    test_tgt = Variable(torch.ones(tgt_l, bsize, 1)).long()
    test_length = torch.ones(bsize).fill_(136).long()  # the default None value gave an error
    return test_src, test_tgt, test_length

and run the exporter with torch.onnx.export(loaded_model, (test_src, test_tgt, test_length), "im2text_kenteken.proto", verbose=True).

Fred-Erik avatar Mar 26 '18 10:03 Fred-Erik
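[Editor's note: the ValueError above comes from the tracer meeting the RNNDecoderState object among the model's outputs; ONNX tracing only accepts tensors or lists/tuples of them. A minimal, untested sketch of a workaround is to wrap the model so that forward returns tensors only. The forward signature and return values here are assumptions based on the 0.x-era NMTModel, not confirmed in the thread.]

import torch
import torch.nn as nn

class ExportWrapper(nn.Module):
    # Hypothetical wrapper, not part of OpenNMT-py. Assumes the
    # wrapped model returns (decoder_outputs, attns, dec_state).
    def __init__(self, model):
        super(ExportWrapper, self).__init__()
        self.model = model

    def forward(self, src, tgt, lengths):
        outputs, attns, _ = self.model(src, tgt, lengths)
        # Drop the RNNDecoderState object (and the attns dict):
        # returning only tensors avoids the "Auto nesting" check.
        return outputs

# torch.onnx.export(ExportWrapper(loaded_model),
#                   (test_src, test_tgt, test_length),
#                   "im2text_kenteken.proto", verbose=True)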

Unfortunately, I do not think that ONNX supports RNNs at all yet :(

Let us know when they get that functionality.

srush avatar Mar 27 '18 00:03 srush

Good news! The release log for PyTorch 0.4 states: "ONNX Improvements: RNN support".

Operator specification on the ONNX GitHub

Fred-Erik avatar May 07 '18 08:05 Fred-Erik

I've not been able to export my trained RNN (translation, with no modifications to the original code). When I ran torch.onnx.export (using the same process as @Fred-Erik), I got the following error:

TypeError: wrapPyFuncWithSymbolic(): incompatible function arguments. The following argument types are supported:
    1. (self: torch._C.Graph, arg0: function, arg1: List[torch::jit::Value], arg2: int, arg3: function) -> iterator
Invoked with: graph(%0 : Long(3, 1, 1)
      %1 : Long(3, 1, 1)
      %2 : Long(1)
      %3 : Float(11256, 500)
      %4 : Float(2000, 500)
      %5 : Float(2000, 500)
      %6 : Float(2000)
      %7 : Float(2000)
      %8 : Float(2000, 500)
      %9 : Float(2000, 500)
      %10 : Float(2000)
      %11 : Float(2000)
      %12 : Float(21163, 500)
      %13 : Float(2000, 1000)
      %14 : Float(2000, 500)
      %15 : Float(2000)
      %16 : Float(2000)
      %17 : Float(2000, 500)
      %18 : Float(2000, 500)
      %19 : Float(2000)
      %20 : Float(2000)
      %21 : Float(500, 500)
      %22 : Float(500, 1000)
      %23 : Float(21163, 500)
      %24 : Float(21163)) {
  %26 : Long(2, 1, 1) = aten::slice[dim=0, start=0, end=2, step=1](%1), scope: NMTModel
  %25 : Long(2, 1, 1) = aten::as_strided[size=[2, 1, 1], stride=[1, 1, 1], storage_offset=0](%1), scope: NMTModel
  %27 : Long(3, 1, 1) = aten::split[split_size=1, dim=2](%0), scope: NMTModel/RNNEncoder[encoder]/Embeddings[embeddings]/Sequential[make_embedding]/Elementwise[emb_luts]
  %28 : Long(3, 1) = aten::squeeze[dim=2](%27), scope: NMTModel/RNNEncoder[encoder]/Embeddings[embeddings]/Sequential[make_embedding]/Elementwise[emb_luts]
  %29 : Float(3, 1, 500) = aten::embedding[padding_idx=1, scale_grad_by_freq=0, sparse=0](%3, %28), scope: NMTModel/RNNEncoder[encoder]/Embeddings[embeddings]/Sequential[make_embedding]/Elementwise[emb_luts]/Embedding[0]
  %30 : Float(3, 1, 500) = aten::cat[dim=2](%29), scope: NMTModel/RNNEncoder[encoder]/Embeddings[embeddings]/Sequential[make_embedding]/Elementwise[emb_luts]
  %31 : Long(1) = aten::view[size=[-1]](%2), scope: NMTModel/RNNEncoder[encoder]
  %32 : Long(1) = prim::Constant[value={3}](), scope: NMTModel/RNNEncoder[encoder]
  %39 : Float(3, 500), %40 : Long(3), %41 : Handle = ^PackPadded(False)(%30, %32), scope: NMTModel/RNNEncoder[encoder]
  %34 : Long() = aten::select[dim=0, index=0](%32), scope: NMTModel/RNNEncoder[encoder]
  %33 : Long() = aten::as_strided[size=[], stride=[], storage_offset=0](%32), scope: NMTModel/RNNEncoder[encoder]
  %35 : Byte() = aten::le[other={0}](%34), scope: NMTModel/RNNEncoder[encoder]
  %36 : Float(3, 1, 500) = aten::alias(%30), scope: NMTModel/RNNEncoder[encoder]
  %37 : Float(3, 500) = aten::view[size=[-1, 500]](%36), scope: NMTModel/RNNEncoder[encoder]
  %38 : Float(3, 500) = aten::cat[dim=0](%37), scope: NMTModel/RNNEncoder[encoder]
  return ();
}
, <function _symbolic_pack_padded_sequence.<locals>.pack_padded_sequence_trace_wrapper at 0x11c5978c8>, [30 defined in (%30 : Float(3, 1, 500) = aten::cat[dim=2](%29), scope: NMTModel/RNNEncoder[encoder]/Embeddings[embeddings]/Sequential[make_embedding]/Elementwise[emb_luts]
), [3]], 2, <function _symbolic_pack_padded_sequence.<locals>._onnx_symbolic_pack_padded_sequence at 0x11c597840>

My guess is that there's something about the shape of an embedding that can't be handled by the compiler out of the box, but I'm not sure where the best place to start debugging this is.

BrendanJohnson avatar Sep 09 '18 11:09 BrendanJohnson
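[Editor's note: judging from the trace, the export fails while wrapping the ^PackPadded op emitted by pack_padded_sequence inside the encoder, rather than on the embedding itself. A rough, untested sketch of a way to sidestep it is to trace the encoder without passing lengths, so nothing is packed. This assumes the 0.x-era RNNEncoder only packs when lengths is given and returns (hidden, outputs); both are assumptions, not thread facts.]

import torch.nn as nn

class EncoderNoPack(nn.Module):
    # Hypothetical wrapper; skips pack_padded_sequence by not
    # passing lengths, which is harmless for a full-length dummy batch.
    def __init__(self, encoder):
        super(EncoderNoPack, self).__init__()
        self.encoder = encoder

    def forward(self, src):
        hidden, outputs = self.encoder(src, lengths=None)
        return outputs  # return tensors only

# torch.onnx.export(EncoderNoPack(loaded_model.encoder), (test_src,),
#                   "encoder.proto", verbose=True)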

See #1023. ONNX export compatibility requires some preliminary work:

  • remove "object"-like classes such as DecoderState, which is done by this PR
  • change the way attentions flow: currently a dictionary, but this does not work; it requires a list or tuple (see the sketch below)
  • some other changes for unsupported operators

vince62s avatar Oct 31 '18 16:10 vince62s
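[Editor's note: to illustrate the second bullet, here is a toy decoder-style module showing the dict-to-tuple change. The names are illustrative, not the actual OpenNMT-py classes.]

import torch
import torch.nn as nn
import torch.nn.functional as F

class TupleAttnDecoder(nn.Module):
    # Toy module: dot-product attention returning a flat tuple.
    def forward(self, query, memory_bank):
        # query: (batch, tgt_len, dim), memory_bank: (batch, src_len, dim)
        scores = torch.bmm(query, memory_bank.transpose(1, 2))
        std_attn = F.softmax(scores, dim=-1)
        output = torch.bmm(std_attn, memory_bank)
        # A dict like {"std": std_attn} breaks the ONNX trace;
        # a tuple of tensors can be flattened and exported.
        return output, std_attn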

As of now, just to export the transformer encoder, we need PyTorch to support the following operators: expand_as, masked_fill. Then we need to change the way attention is returned from the decoder: ONNX does not accept a dict, so we'll need to use a tuple or simply tensors. To be continued.

vince62s avatar Nov 13 '18 19:11 vince62s
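[Editor's note: for context, both operators show up in the transformer's attention masking. A minimal sketch of the pattern, with illustrative shapes:]

import torch

# (batch, heads, query_len, key_len) attention scores
scores = torch.randn(1, 8, 5, 5)
# uint8 mask marking padded key positions (bool masks came later)
mask = torch.zeros(1, 1, 5, 5, dtype=torch.uint8)
mask[..., 3:] = 1
# expand_as broadcasts the mask over heads; masked_fill hides
# the padded positions before the softmax
scores = scores.masked_fill(mask.expand_as(scores), float("-inf"))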

@vince62s I want to export a PyTorch model to an ONNX model like this (MT task using a transformer, with no modifications to the original code):

def main(opt):
    dummy_parser = configargparse.ArgumentParser(description='train.py')
    opts.model_opts(dummy_parser)
    dummy_opt = dummy_parser.parse_known_args([])[0]
    load_test_model = onmt.decoders.ensemble.load_test_model \
        if len(opt.models) > 1 else onmt.model_builder.load_test_model
    fields, model, model_opt = load_test_model(opt, dummy_opt.__dict__)
    m = torch.load("model-transformer_step_120000.pt")
    # numpy.random and torch.random are modules, not callables:
    # these two lines are the source of the TypeError below
    x = Variable(torch.from_numpy(numpy.random((9, 1, 1))))
    y = Variable(torch.random(9, 1, 1))
    torch.onnx._export(model, (x, y, None), "Trans", export_params=True)

After I run the code, I get this: TypeError: 'module' object is not callable

it seems the encoder and decoder are not one and the same.

520jefferson avatar Feb 01 '19 07:02 520jefferson

With the current code it won't work. Also, if you want to try some preliminary steps, you need to export the encoder, the decoder, and the generator separately (see the sketch below). But again, given some code structure and ONNX requirements, it won't work with the attention being a tuple.

vince62s avatar Feb 01 '19 07:02 vince62s
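[Editor's note: for the record, "export separately" would look roughly like the outline below, even though, as stated above, it does not work yet. The attribute names follow OpenNMT-py's NMTModel, but the dummy inputs and the assumption that each part traces cleanly are hypothetical.]

import torch

# dummy_src, dummy_tgt, dummy_memory_bank and dummy_dec_out are
# hypothetical placeholder tensors with model-appropriate shapes.
torch.onnx.export(model.encoder, (dummy_src,), "encoder.onnx")
torch.onnx.export(model.decoder, (dummy_tgt, dummy_memory_bank),
                  "decoder.onnx")
torch.onnx.export(model.generator, (dummy_dec_out,), "generator.onnx")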

@vince62s You said "change the way attentions flow, currently a dictionary but this does not work, requires list or tuple" (https://github.com/OpenNMT/OpenNMT-py/issues/638#issuecomment-434765232), but now you say "it won't work with the attention being a tuple". I'm a little confused after contrasting the two statements.

520jefferson avatar Feb 01 '19 08:02 520jefferson

I mixed up my comment, sorry.

vince62s avatar Feb 01 '19 08:02 vince62s

@vince62s ONNX is native to PyTorch 1.0 as a model output format; could the model simply be output in ONNX format?

520jefferson avatar Feb 01 '19 08:02 520jefferson

It is not as simple as that; there are several levels of complexity:

  1. We need to use only operations that are ONNX compatible. Until recently, some operations like expand_as and masked_fill were not exportable.
  2. There might be some requirements to add in the code to activate the ONNX trace. I have not tested this, but it might require some changes.
  3. The transformer architecture works on a time-step basis, which would still require coding some things so that TensorRT can work with it (meaning the decoder export and the encoder export; see the sketch below). There might be some workaround (cf. what fairseq did), but I have not tested it.

vince62s avatar Feb 01 '19 10:02 vince62s
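[Editor's note: as a sketch of the time-step issue in point 3, the fairseq-style workaround is to export a module that runs exactly one decoder step, which the runtime then calls in a loop. Everything below (signature, shapes, names) is hypothetical, not OpenNMT-py API.]

import torch.nn as nn

class DecoderStep(nn.Module):
    # Hypothetical single-step wrapper, in the spirit of the
    # fairseq workaround mentioned above.
    def __init__(self, decoder, generator):
        super(DecoderStep, self).__init__()
        self.decoder = decoder
        self.generator = generator

    def forward(self, prev_token, memory_bank):
        # One time step: last predicted token in, next-token
        # log-probabilities out. A runtime such as TensorRT would
        # loop over this graph instead of the full sequence.
        dec_out, attn = self.decoder(prev_token, memory_bank)
        return self.generator(dec_out)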

@vince62s Any update on this?

kalyangvs avatar Oct 02 '19 05:10 kalyangvs

Maybe this needs to be reviewed again, as several PyTorch (and ONNX) versions have been released in the meantime.

If it is still difficult to represent the full graph in ONNX, we could export the encoder and one decoder step separately. Then I will be interested in gluing everything together in CTranslate2. https://github.com/OpenNMT/CTranslate2/issues/5

guillaumekln avatar Jan 30 '20 12:01 guillaumekln
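[Editor's note: assuming the two-part export guillaumekln describes (an encoder.onnx plus a single-step decoder_step.onnx), the glue is a plain greedy loop. A sketch with ONNX Runtime; the file names, input names, and BOS/EOS ids are all hypothetical.]

import numpy as np
import onnxruntime as ort

enc = ort.InferenceSession("encoder.onnx")
dec = ort.InferenceSession("decoder_step.onnx")

src_ids = np.array([[4, 17, 9]], dtype=np.int64)  # dummy source ids
memory_bank = enc.run(None, {"src": src_ids})[0]  # run encoder once

tokens = [2]  # hypothetical BOS id
for _ in range(50):  # max output length
    out = dec.run(None, {"prev_token": np.array([[tokens[-1]]], dtype=np.int64),
                         "memory_bank": memory_bank})[0]
    next_id = int(out.argmax())
    tokens.append(next_id)
    if next_id == 3:  # hypothetical EOS id
        break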

Less and less trendy. PyTorch 2.0 might help to get a compiled and speedy version.

vince62s avatar Jan 18 '23 12:01 vince62s