inference
Fixes #1249
This PR enables running the BERT reference implementation with the onnxruntime backend using custom model, dataset, and log paths. It also adds support for Nvidia GPUs with onnxruntime version >= 1.9.
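For context, a minimal sketch of how a backend might gate GPU use on the onnxruntime version, as described above. The provider strings are onnxruntime's public provider names; the helper name and version-parsing logic are illustrative, not the PR's actual code:

```python
# Illustrative helper: pick onnxruntime execution providers.
# "CUDAExecutionProvider" is only requested when the installed
# onnxruntime is >= 1.9, matching the GPU support gate in this PR.

def select_providers(ort_version: str, use_gpu: bool) -> list:
    major, minor = (int(x) for x in ort_version.split(".")[:2])
    providers = []
    if use_gpu and (major, minor) >= (1, 9):
        providers.append("CUDAExecutionProvider")
    providers.append("CPUExecutionProvider")  # always keep a CPU fallback
    return providers
```

The resulting list would typically be passed to `onnxruntime.InferenceSession(model_path, providers=...)`.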
MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅
Can the new behaviours be documented somewhere please?
The related issue is linked here.
@rnaidu02 Please merge this
@psyhtest Please review the change. The original comment was confusing, as all optimizations are enabled by default in onnxruntime. I'm now disabling the highest optimization level only for the aarch64 architecture, due to an accuracy issue.
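A sketch of the architecture-specific workaround described above, assuming the standard `onnxruntime.GraphOptimizationLevel` enum names (returned here as strings so the example is self-contained; the helper name is illustrative):

```python
import platform

# Illustrative sketch: drop below the highest graph optimization level
# only on aarch64, where an accuracy issue was observed; keep full
# optimization everywhere else. The strings mirror the enum members of
# onnxruntime.GraphOptimizationLevel.

def pick_optimization_level(machine: str = "") -> str:
    machine = machine or platform.machine()
    if machine == "aarch64":
        return "ORT_ENABLE_EXTENDED"  # one step below ORT_ENABLE_ALL
    return "ORT_ENABLE_ALL"          # default: all optimizations enabled
```

In real backend code, the chosen level would be set on `onnxruntime.SessionOptions().graph_optimization_level` before creating the session.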
@arjunsuresh It seems all the changes of this PR have been merged. Can we close this PR?
Thank you @pgmpablo157321