Add InferenceSession options and Provider to ORTModel
ORTModel.load_model was not accepting custom session options, making it hard, for example, to pin an ORTModel to a given number of CPU cores for inference. Indeed, onnxruntime obeys neither psutil.cpu_affinity nor taskset, so session options are the only reliable way to control threading.
This may close https://github.com/huggingface/optimum/issues/262
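A minimal sketch of the usage this unblocks: limiting onnxruntime to a fixed number of CPU threads via SessionOptions. The SessionOptions attributes are standard onnxruntime API; the `session_options`/`provider` keyword names and the model id are assumptions for illustration, not the final interface.

```python
import onnxruntime
from optimum.onnxruntime import ORTModelForSequenceClassification

# Standard onnxruntime threading controls.
sess_options = onnxruntime.SessionOptions()
sess_options.intra_op_num_threads = 4  # threads used within an operator
sess_options.inter_op_num_threads = 1  # threads used across operators

# Hypothetical call shape: the keyword names below are assumptions about
# the interface this PR adds, and the model id is a placeholder.
model = ORTModelForSequenceClassification.from_pretrained(
    "some-org/some-onnx-model",
    session_options=sess_options,
    provider="CPUExecutionProvider",
)
```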
Before submitting
- [x] Did you make sure to update the documentation with your changes?
Great! In any case, even if only CUDAExecutionProvider is passed, the execution providers reported by onnxruntime are still a list, ["CUDAExecutionProvider", "CPUExecutionProvider"], so to me it still makes sense to expose a list.
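A quick sketch illustrating this behavior with plain onnxruntime (the model path is a placeholder):

```python
import onnxruntime

# Even when only CUDAExecutionProvider is requested, onnxruntime keeps
# CPUExecutionProvider as a fallback, so the session reports a list.
session = onnxruntime.InferenceSession(
    "model.onnx",  # placeholder path to an exported ONNX model
    providers=["CUDAExecutionProvider"],
)
print(session.get_providers())
# e.g. ['CUDAExecutionProvider', 'CPUExecutionProvider']
```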