onnxruntime
Create script to export BART encoder and decoder for use with custom beam search op
Is your feature request related to a problem? Please describe. Under https://github.com/microsoft/onnxruntime/tree/master/onnxruntime/python/tools/transformers/models, provide a set of tools that enable users to export the BART encoder and decoder to ONNX, for use with the custom beam search op.
ORT folks contributed an example of this in the transformers repository, in case it is useful: https://github.com/huggingface/transformers/tree/main/examples/research_projects/onnx/summarization
@BowenBao, thanks for raising the feature request. We are working on encoder and decoder support in the beam search op (currently using T5 as an example). After that is ready, we will work on BART integration.
@tianleiwu, thanks for following up. I have created a simple initial version to help Ye with the implementation, and will open a PR afterwards.
PR in progress: https://github.com/microsoft/onnxruntime/pull/11629
8/11 Update:
- The model has been exported successfully.
- The full model's results still mismatch between PyTorch and ONNX.
- The encoder and decoder parts have been validated individually, and their results match PyTorch.
- The ONNX Runtime team is now checking the beam search op's results.