Suyog Gupta

Results: 4 issues by Suyog Gupta

1. This MR enables the integration of TRTLLM-bench with AutoDeploy.
2. Adds a feature to the AutoDeploy inference optimizer to inflate the KV caches to the available GPU memory. This helps improve...

Add `auto_deploy` namespace to uniquely identify all the custom ops defined in auto_deploy/custom_ops. This could avoid potential namespace conflicts for ops defined in the manual workflow.

AutoDeploy
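To illustrate why a dedicated namespace helps, here is a minimal sketch in plain Python. It does not use the real registration mechanism from `auto_deploy/custom_ops`; the `OpRegistry` class and the op name `fused_add` are hypothetical and exist only to show that two ops sharing a short name can coexist once each is qualified by its namespace.

```python
class OpRegistry:
    """Toy registry keyed by fully qualified "namespace::name" strings (hypothetical)."""

    def __init__(self):
        self._ops = {}

    def register(self, namespace, name):
        qualified = f"{namespace}::{name}"

        def decorator(fn):
            # Reject a second registration under the same qualified name.
            if qualified in self._ops:
                raise ValueError(f"op already registered: {qualified}")
            self._ops[qualified] = fn
            return fn

        return decorator

    def get(self, qualified):
        return self._ops[qualified]


registry = OpRegistry()


@registry.register("auto_deploy", "fused_add")  # hypothetical op name
def auto_deploy_fused_add(a, b):
    return a + b


@registry.register("manual", "fused_add")  # same short name, different namespace
def manual_fused_add(a, b):
    return float(a + b)
```

Both ops are reachable by their qualified names (`auto_deploy::fused_add` and `manual::fused_add`), whereas registering either name twice raises an error instead of silently overwriting the earlier definition.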

## Description

1. Custom ops that wrap `nvtx.start_range` and `nvtx.end_range` markers
2. An annotation pass that inserts the markers in the graph

Example: markers inserted for all ops in the...

AutoDeploy
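The idea of an annotation pass that brackets ops with range markers can be sketched in plain Python. This is an assumption-laden illustration, not the pass from the PR: it works on a flat list of op names rather than a real graph, and `annotate_with_ranges` and the `("start_range", ...)` / `("end_range", ...)` tuples are hypothetical stand-ins for the custom ops that wrap `nvtx.start_range` and `nvtx.end_range`.

```python
def annotate_with_ranges(ops, should_mark=lambda op: True):
    """Return a new op sequence with start/end markers around selected ops.

    `ops` is a flat list of op names (a stand-in for graph nodes);
    `should_mark` is a hypothetical filter choosing which ops get markers.
    """
    annotated = []
    for op in ops:
        if should_mark(op):
            annotated.append(("start_range", op))  # marker placed before the op
            annotated.append(("call", op))
            annotated.append(("end_range", op))    # marker placed after the op
        else:
            annotated.append(("call", op))
    return annotated


# Mark only matmul ops; other ops pass through unchanged.
program = annotate_with_ranges(
    ["matmul", "relu", "matmul"],
    should_mark=lambda op: op == "matmul",
)
```

A filter argument keeps the sketch general: passing no filter marks every op, while a predicate restricts the markers to the ops of interest.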

## Summary by CodeRabbit

## Release Notes

* **Chores**
  * Added new compile-stage transform configuration option (disabled by default) to expand optimization capabilities while maintaining backward compatibility.