
Exporting and Inference on ONNX models


To improve model performance during CPU inference we can export the models to ONNX and then run them through onnxruntime, when it is available, at inference time.
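A minimal sketch of that flow, assuming a torchvision ResNet-50 and a 512x512 input; the model, file name, and tensor names here are placeholders, not the actual marie-ai export code:

```python
# Sketch: export a ResNet-50 to ONNX, then prefer onnxruntime at inference
# time if it is installed, falling back to the PyTorch model otherwise.
import torch
import torchvision

model = torchvision.models.resnet50(pretrained=True).eval()
dummy = torch.randn(1, 3, 512, 512)

# Export once, ahead of inference time.
torch.onnx.export(
    model,
    dummy,
    "resnet50.onnx",
    input_names=["input"],
    output_names=["output"],
    dynamic_axes={"input": {0: "batch"}, "output": {0: "batch"}},
    opset_version=13,
)

# At inference time, use onnxruntime if available.
try:
    import onnxruntime as ort

    session = ort.InferenceSession("resnet50.onnx", providers=["CPUExecutionProvider"])

    def infer(x: torch.Tensor):
        return session.run(None, {"input": x.numpy()})[0]
except ImportError:
    def infer(x: torch.Tensor):
        with torch.no_grad():
            return model(x).numpy()
```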

The following script, check_onnx_runtime.py, can be used to benchmark the models (see the sketch below).
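A rough benchmarking sketch in the spirit of check_onnx_runtime.py (the actual script may differ); it times repeated forward passes for both backends on the same random input, assuming the ONNX file exported above:

```python
# Time PyTorch vs ONNX Runtime CPU inference on a ResNet-50.
import time

import numpy as np
import torch
import torchvision
import onnxruntime as ort

SIZE = 512   # also run with 1200 and 2400 to reproduce the table below
RUNS = 10

model = torchvision.models.resnet50(pretrained=True).eval()
session = ort.InferenceSession("resnet50.onnx", providers=["CPUExecutionProvider"])

x = torch.randn(1, 3, SIZE, SIZE)

# PyTorch CPU timing (average over RUNS passes)
start = time.perf_counter()
with torch.no_grad():
    for _ in range(RUNS):
        model(x)
pytorch_time = (time.perf_counter() - start) / RUNS

# ONNX Runtime timing (average over RUNS passes)
x_np = x.numpy().astype(np.float32)
start = time.perf_counter()
for _ in range(RUNS):
    session.run(None, {"input": x_np})
onnx_time = (time.perf_counter() - start) / RUNS

print(f"{SIZE}x{SIZE} PyTorch {pytorch_time} VS ONNX {onnx_time}")
```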

Inference time results (ResNet-50, PyTorch vs ONNX Runtime):

| Input size | PyTorch | ONNX |
|------------|---------|------|
| 2400x2400 | 3.6160961884500464 | 2.131322395749976 |
| 1200x1200 | 0.8162189463499999 | 0.35815778665000836 |
| 512x512 | 0.12735954449999554 | 0.08733407934996648 |

The following references provide good implementations and guidance that we can base our work on:

  • https://github.com/ultralytics/yolov5/blob/master/export.py

  • https://pytorch.org/tutorials/beginner/deploy_seq2seq_hybrid_frontend_tutorial.html

  • https://facilecode.com/speed-pytorch-vs-onnx/

  • https://cloudblogs.microsoft.com/opensource/2022/04/19/scaling-up-pytorch-inference-serving-billions-of-daily-nlp-inferences-with-onnx-runtime/

  • https://github.com/microsoft/onnxruntime/blob/master/onnxruntime/python/tools/transformers/notebooks/PyTorch_Bert-Squad_OnnxRuntime_GPU.ipynb

gregbugaj, Mar 21 '22