tensorrt_backend
The Triton backend for TensorRT.
- Deprecated binding index-based APIs were replaced with tensor name-based APIs
- Removed support for implicit batch models
- Added support for INT64 data type
- Shape output data type was...
Add CUDA context sharing support to the TensorRT backend to reduce context-switching overhead when a graphics workload is running in parallel
The changes in the PR support 2 main items: 1. the GPU memory is allocated based on the selected TensorRT profile and not based on the profile that consumes max...
This PR enabled CUDA context sharing with CiG streams for Windows gaming applications
Updating "github/actions" to mitigate a conflict between `pre-commit` and `setup-python`
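
The memory-allocation change described above (sizing GPU memory for the selected TensorRT profile rather than for the most demanding one) can be illustrated with a minimal sketch. This is not the backend's code: `ProfileInfo`, `bytes_to_allocate`, the profile names, and the byte counts are all hypothetical and exist only to show the difference between the two sizing policies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProfileInfo:
    """Hypothetical record of one optimization profile's memory need."""
    name: str
    device_memory_bytes: int  # made-up figure, for illustration only


def bytes_to_allocate(profiles, selected, per_profile=True):
    """Return how many bytes to reserve for the selected profile.

    per_profile=True  -> size the allocation for the selected profile only.
    per_profile=False -> size it for the most demanding profile (the
                         behavior the PR summary says was replaced).
    """
    by_name = {p.name: p for p in profiles}
    if selected not in by_name:
        raise KeyError(f"unknown profile: {selected}")
    if per_profile:
        return by_name[selected].device_memory_bytes
    return max(p.device_memory_bytes for p in profiles)


# Two hypothetical profiles with very different footprints.
profiles = [
    ProfileInfo("small", 64 * 1024 * 1024),
    ProfileInfo("large", 512 * 1024 * 1024),
]

print(bytes_to_allocate(profiles, "small"))
print(bytes_to_allocate(profiles, "small", per_profile=False))
```

Under these assumptions, selecting the small profile reserves only its own footprint, whereas the max-based policy would reserve the large profile's footprint regardless of which profile is in use.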