OpenNMT-py
Export model to ONNX
Is it possible to export a model trained in OpenNMT-py to an ONNX protobuf file? I tried following this PyTorch tutorial, and after reading about some issues with ONNX and RNNs I compiled PyTorch master from source. I now get this error:
ValueError: Auto nesting doesn't know how to process an input object of type onmt.Models.RNNDecoderState. Accepted types: Variables, or lists/tuples of them
I first load a trained model using
fields, model, model_opt = onmt.ModelConstructor.load_test_model(opt, dummy_opt.__dict__)
then create a dummy input using the get_batch_image function from test_models.py
def get_batch_image(tgt_l=3, bsize=1, h=15, w=17):
    test_src = Variable(torch.ones(bsize, 3, h, w)).float()
    test_tgt = Variable(torch.ones(tgt_l, bsize, 1)).long()
    test_length = torch.ones(bsize).fill_(136).long()  # the default None value gave an error
    return test_src, test_tgt, test_length
and run the exporter with
torch.onnx.export(loaded_model, (test_src, test_tgt, test_length), "im2text_kenteken.proto", verbose=True)
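The ValueError above comes from the tracer only accepting tensors (Variables) or nested lists/tuples of them at the model's boundary; a custom state object like onmt.Models.RNNDecoderState cannot cross it. One workaround is to flatten any structured state into a plain tuple in a fixed order before export and rebuild it afterwards. A minimal sketch of that idea, using plain Python lists as stand-ins for tensors (the DecoderState class here is a hypothetical stand-in, not the OpenNMT-py one):

```python
# Sketch: flatten a structured state object into a tuple so that an
# ONNX-style tracer, which only accepts tensors and nested lists/tuples
# of tensors, can handle it. Plain lists stand in for tensors here.

class DecoderState:
    """Hypothetical stand-in for onmt.Models.RNNDecoderState."""
    def __init__(self, hidden, input_feed):
        self.hidden = hidden          # e.g. the RNN hidden-state tensor
        self.input_feed = input_feed  # e.g. the input-feed tensor

def flatten_state(state):
    # Expose only tensor-like fields, in a fixed, documented order.
    return (state.hidden, state.input_feed)

def unflatten_state(tensors):
    # Rebuild the object on the other side of the export boundary.
    hidden, input_feed = tensors
    return DecoderState(hidden, input_feed)

state = DecoderState(hidden=[1.0, 2.0], input_feed=[0.5])
flat = flatten_state(state)           # a tuple the tracer can accept
rebuilt = unflatten_state(flat)
assert rebuilt.hidden == state.hidden
```

The fixed field order matters: both the exporting wrapper and whatever consumes the exported graph must agree on it.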
Unfortunately I do not think that ONNX supports RNNs at all yet :(
Let us know when they get that functionality.
Good news! The release log for PyTorch 0.4 states: ONNX Improvements: RNN support
Operator specification on the ONNX GitHub
I've not been able to export my trained RNN (translation, with no modifications to the original code). When I ran torch.onnx.export (using the same process as @Fred-Erik) I got the following error:
TypeError: wrapPyFuncWithSymbolic(): incompatible function arguments. The following argument types are supported:
1. (self: torch._C.Graph, arg0: function, arg1: List[torch::jit::Value], arg2: int, arg3: function) -> iterator
Invoked with: graph(%0 : Long(3, 1, 1)
%1 : Long(3, 1, 1)
%2 : Long(1)
%3 : Float(11256, 500)
%4 : Float(2000, 500)
%5 : Float(2000, 500)
%6 : Float(2000)
%7 : Float(2000)
%8 : Float(2000, 500)
%9 : Float(2000, 500)
%10 : Float(2000)
%11 : Float(2000)
%12 : Float(21163, 500)
%13 : Float(2000, 1000)
%14 : Float(2000, 500)
%15 : Float(2000)
%16 : Float(2000)
%17 : Float(2000, 500)
%18 : Float(2000, 500)
%19 : Float(2000)
%20 : Float(2000)
%21 : Float(500, 500)
%22 : Float(500, 1000)
%23 : Float(21163, 500)
%24 : Float(21163)) {
%26 : Long(2, 1, 1) = aten::slice[dim=0, start=0, end=2, step=1](%1), scope: NMTModel
%25 : Long(2, 1, 1) = aten::as_strided[size=[2, 1, 1], stride=[1, 1, 1], storage_offset=0](%1), scope: NMTModel
%27 : Long(3, 1, 1) = aten::split[split_size=1, dim=2](%0), scope: NMTModel/RNNEncoder[encoder]/Embeddings[embeddings]/Sequential[make_embedding]/Elementwise[emb_luts]
%28 : Long(3, 1) = aten::squeeze[dim=2](%27), scope: NMTModel/RNNEncoder[encoder]/Embeddings[embeddings]/Sequential[make_embedding]/Elementwise[emb_luts]
%29 : Float(3, 1, 500) = aten::embedding[padding_idx=1, scale_grad_by_freq=0, sparse=0](%3, %28), scope: NMTModel/RNNEncoder[encoder]/Embeddings[embeddings]/Sequential[make_embedding]/Elementwise[emb_luts]/Embedding[0]
%30 : Float(3, 1, 500) = aten::cat[dim=2](%29), scope: NMTModel/RNNEncoder[encoder]/Embeddings[embeddings]/Sequential[make_embedding]/Elementwise[emb_luts]
%31 : Long(1) = aten::view[size=[-1]](%2), scope: NMTModel/RNNEncoder[encoder]
%32 : Long(1) = prim::Constant[value={3}](), scope: NMTModel/RNNEncoder[encoder]
%39 : Float(3, 500), %40 : Long(3), %41 : Handle = ^PackPadded(False)(%30, %32), scope: NMTModel/RNNEncoder[encoder]
%34 : Long() = aten::select[dim=0, index=0](%32), scope: NMTModel/RNNEncoder[encoder]
%33 : Long() = aten::as_strided[size=[], stride=[], storage_offset=0](%32), scope: NMTModel/RNNEncoder[encoder]
%35 : Byte() = aten::le[other={0}](%34), scope: NMTModel/RNNEncoder[encoder]
%36 : Float(3, 1, 500) = aten::alias(%30), scope: NMTModel/RNNEncoder[encoder]
%37 : Float(3, 500) = aten::view[size=[-1, 500]](%36), scope: NMTModel/RNNEncoder[encoder]
%38 : Float(3, 500) = aten::cat[dim=0](%37), scope: NMTModel/RNNEncoder[encoder]
return ();
}
, <function _symbolic_pack_padded_sequence.<locals>.pack_padded_sequence_trace_wrapper at 0x11c5978c8>, [30 defined in (%30 : Float(3, 1, 500) = aten::cat[dim=2](%29), scope: NMTModel/RNNEncoder[encoder]/Embeddings[embeddings]/Sequential[make_embedding]/Elementwise[emb_luts]
), [3]], 2, <function _symbolic_pack_padded_sequence.<locals>._onnx_symbolic_pack_padded_sequence at 0x11c597840>
My guess is that there's something about the shape of an embedding which can't be handled by the compiler out of the box, but I'm not sure where the best place to start debugging this issue is.
See #1023. ONNX export compatibility requires some preliminary work:
- remove "object"-like state classes such as DecoderState, which is done by this PR
- change the way attentions flow: it is currently a dictionary, which does not work; a list or tuple is required
- some other changes for unsupported operators
As of now, just to export the transformer encoder, we need PyTorch to support the following operators: expand_as, masked_fill. Then we need to change the way attention is returned from the decoder: ONNX does not accept a dict, so we'll need to use a tuple or simply tensors. To be continued.
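The dict-to-tuple change described above can be sketched as follows: the decoder's attentions, currently keyed by name in a dictionary, would instead be passed positionally in a fixed order agreed on in advance. The key names below are illustrative, not the actual OpenNMT-py ones:

```python
# Sketch: replace the decoder's attention dict (which the ONNX tracer
# cannot handle) with a fixed-order tuple. Key names are illustrative.
ATTN_KEYS = ("std", "copy")  # actual keys depend on the model config

def attns_to_tuple(attns):
    # Emit values in a fixed, documented order instead of a dict.
    return tuple(attns[k] for k in ATTN_KEYS)

def attns_from_tuple(values):
    # Rebuild the dict for code paths that still expect one.
    return dict(zip(ATTN_KEYS, values))

attns = {"std": [0.9, 0.1], "copy": [0.2, 0.8]}
as_tuple = attns_to_tuple(attns)
assert as_tuple == ([0.9, 0.1], [0.2, 0.8])
assert attns_from_tuple(as_tuple) == attns
```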
@vince62s
I want to export a PyTorch model to an ONNX model like this (MT task using a Transformer, with no modifications to the original code):
def main(opt):
    dummy_parser = configargparse.ArgumentParser(description='train.py')
    opts.model_opts(dummy_parser)
    dummy_opt = dummy_parser.parse_known_args([])[0]
    load_test_model = onmt.decoders.ensemble.load_test_model \
        if len(opt.models) > 1 else onmt.model_builder.load_test_model
    fields, model, model_opt = load_test_model(opt, dummy_opt.__dict__)
    m = torch.load("model-transformer_step_120000.pt")
    x = Variable(torch.from_numpy(numpy.random((9, 1, 1))))
    y = Variable(torch.random(9, 1, 1))
    torch.onnx._export(model, (x, y, None), "Trans", export_params=True)
After I run the code, I get this: TypeError: 'module' object is not callable. It seems the encoder and decoder are not one and the same.
With the current code it won't work. Also, if you want to try some preliminary steps, you need to export the encoder, the decoder and the generator separately. But again, given some code structure and ONNX requirements, it won't work with the attention being a tuple.
@vince62s You said "change the way attentions flow, currently a dictionary but this does not work, requires list or tuple" (https://github.com/OpenNMT/OpenNMT-py/issues/638#issuecomment-434765232), but now you say "it won't work with the attention being a tuple". I'm a little confused after contrasting the two statements.
I mixed up my comment sorry.
@vince62s ONNX is natively supported as a model output format in PyTorch 1.0. Could the model be exported in the ONNX format?
It is not as simple as that; there are several levels of complexity:
- We need to use only operations that are ONNX compatible. Until recently, operations like expand_as and masked_fill were not exportable.
- There might be some requirements to add in the code to activate the ONNX trace. I have not tested this, but it might require some changes.
- The Transformer architecture works on a time-step basis, which would still require coding some things so that TensorRT can work with it (it being the decoder export and the encoder export). There might be a workaround (cf. what fairseq did), but I have not tested it.
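Exporting the pieces separately, as discussed above, leaves the time-step loop outside the exported graphs. A rough sketch of how a greedy decoding loop could glue an exported encoder and an exported single-step decoder together; the run_encoder / run_decoder_step callables are hypothetical stand-ins for whatever runtime executes the exported graphs, and simple Python functions are used so the loop itself can be demonstrated:

```python
# Sketch: greedy decoding driving an exported encoder and an exported
# single-step decoder. The two callables are hypothetical stand-ins for
# an ONNX runtime session.

BOS, EOS = 0, 1

def greedy_decode(run_encoder, run_decoder_step, src, max_len=10):
    memory = run_encoder(src)          # one call to the exported encoder
    tokens, state = [BOS], None
    for _ in range(max_len):
        # One call per time step to the exported decoder graph.
        next_tok, state = run_decoder_step(tokens[-1], memory, state)
        if next_tok == EOS:
            break
        tokens.append(next_tok)
    return tokens[1:]                  # strip BOS

# Toy stand-ins: the "decoder" just echoes the source, then emits EOS.
def toy_encoder(src):
    return list(src)

def toy_decoder_step(prev_tok, memory, state):
    i = 0 if state is None else state
    return (memory[i], i + 1) if i < len(memory) else (EOS, i)

assert greedy_decode(toy_encoder, toy_decoder_step, [7, 8, 9]) == [7, 8, 9]
```

In a real setup, memory would be the encoder's output tensor and state the decoder's incremental cache, both flattened into tensors as ONNX requires.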
@vince62s Any update on this?
Maybe this needs to be reviewed again, as several PyTorch (and ONNX) versions were released in the meantime.
If it is still difficult to represent the full graph in ONNX, we could export the encoder and one decoder step separately. Then I will be interested in gluing everything together in CTranslate2. https://github.com/OpenNMT/CTranslate2/issues/5
Less and less trendy. PyTorch 2.0 might help to get a compiled and speedy version.