Add InferenceSession options and Provider to ORTModel
ORTModel.load_model was not accepting custom session options, making it hard, for example, to pin an ORTModel to a given number of CPU cores for inference. Indeed, onnxruntime obeys neither psutil.cpu_affinity nor taskset, so session options are the only reliable way to control threading.
This may close https://github.com/huggingface/optimum/issues/262
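A minimal sketch of the usage this unblocks: limiting onnxruntime to a fixed number of CPU threads via SessionOptions. The SessionOptions attributes are standard onnxruntime API; the `session_options`/`provider` keyword names and the model id are assumptions for illustration, not the final interface.

```python
import onnxruntime
from optimum.onnxruntime import ORTModelForSequenceClassification

# Standard onnxruntime threading controls.
sess_options = onnxruntime.SessionOptions()
sess_options.intra_op_num_threads = 4  # threads used within an operator
sess_options.inter_op_num_threads = 1  # threads used across operators

# Hypothetical call shape: the keyword names below are assumptions about
# the interface this PR adds, and the model id is a placeholder.
model = ORTModelForSequenceClassification.from_pretrained(
    "some-org/some-onnx-model",
    session_options=sess_options,
    provider="CPUExecutionProvider",
)
```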
Before submitting
- [x] Did you make sure to update the documentation with your changes?
Great! In any case, even if only CUDAExecutionProvider is passed, the execution providers reported by onnxruntime are still a list, ["CUDAExecutionProvider", "CPUExecutionProvider"], so to me it still makes sense to expose a list.
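A quick sketch illustrating this behavior with plain onnxruntime (the model path is a placeholder):

```python
import onnxruntime

# Even when only CUDAExecutionProvider is requested, onnxruntime keeps
# CPUExecutionProvider as a fallback, so the session reports a list.
session = onnxruntime.InferenceSession(
    "model.onnx",  # placeholder path to an exported ONNX model
    providers=["CUDAExecutionProvider"],
)
print(session.get_providers())
# e.g. ['CUDAExecutionProvider', 'CPUExecutionProvider']
```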