onnxruntime_backend
Add option to enable CUDA Graphs in CUDA EP
ONNX Runtime has added "preview mode" support for CUDA Graphs in the CUDA Execution Provider. It would be useful to expose this option in the ONNX Runtime Triton backend as well, to help reduce CPU utilization.
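
For reference, here is a minimal sketch of how the option is switched on when using ONNX Runtime directly from Python, via the CUDA EP's `enable_cuda_graph` provider option. The helper names `cuda_graph_providers` and `create_session` are hypothetical and only for illustration; how the Triton backend would surface this setting is not shown here, since that is what this issue is asking for.

```python
def cuda_graph_providers(device_id: int = 0):
    """Build a providers list asking the CUDA EP to use CUDA Graphs.

    Hypothetical helper for illustration; only the provider name and the
    "enable_cuda_graph" / "device_id" option keys come from ONNX Runtime.
    """
    cuda_options = {
        "device_id": str(device_id),
        "enable_cuda_graph": "1",  # preview feature in the CUDA EP
    }
    return [("CUDAExecutionProvider", cuda_options)]


def create_session(model_path: str, device_id: int = 0):
    """Create an InferenceSession with CUDA Graphs requested (hypothetical helper)."""
    # Imported lazily so the helper above can be used without onnxruntime installed.
    import onnxruntime as ort

    return ort.InferenceSession(model_path, providers=cuda_graph_providers(device_id))
```

As far as I understand the feature, graph capture and replay also expects inputs and outputs to be bound to fixed device buffers (IOBinding) with unchanging shapes, so a backend option would probably need to account for that.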