TensorRT-LLM
Adding debug options to trtllm-build to visualize the TRT Network before Engine build
Overview
This PR adds two debugging flags to trtllm-build:
--visualize-network
Dumps the finalized TRT Network as SVG files for visual inspection.
--dry-run
Runs every step except the Engine build and serialization, which are typically the most expensive operations.
When the two flags are used together, the TRT Network for llama2-70B is dumped in about 10 seconds.
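
To make the intended interaction of the two flags concrete, here is a minimal sketch of the control flow. It is not the code in this PR: the parser is a standalone stand-in for trtllm-build's argument handling, and the step names (construct_network, dump_network_svg, build_engine, serialize_engine) are hypothetical labels chosen for illustration.

```python
import argparse


def parse_debug_flags(argv):
    """Parse only the two debug flags described above; other arguments are ignored."""
    parser = argparse.ArgumentParser(prog="trtllm-build-sketch")
    parser.add_argument("--visualize-network", action="store_true",
                        help="dump the finalized network as SVG files")
    parser.add_argument("--dry-run", action="store_true",
                        help="skip the engine build and serialization")
    # parse_known_args lets the sketch tolerate the tool's other options.
    args, _unknown = parser.parse_known_args(argv)
    return args


def plan_steps(args):
    """Return the ordered steps a build would run for the given flags.

    Step names are hypothetical labels, not functions from TensorRT-LLM.
    """
    steps = ["construct_network"]
    if args.visualize_network:
        # Visualization happens on the finalized network, before any engine build.
        steps.append("dump_network_svg")
    if not args.dry_run:
        # The expensive part, skipped entirely under --dry-run.
        steps += ["build_engine", "serialize_engine"]
    return steps


if __name__ == "__main__":
    flags = parse_debug_flags(["--visualize-network", "--dry-run"])
    print(plan_steps(flags))
```

With both flags set, the plan stops after the SVG dump, which is why the combination is fast: the engine build and serialization never run.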
Testing
These changes have been manually tested on a few llama2 configurations.
Unit Tests
I can add more unit tests to this PR and fix existing unit tests if this basic design is acceptable.