TensorRT-LLM
Adding debug options to trtllm-build to visualize the TRT Network before Engine build
Overview
This PR adds 2 new flags to trtllm-build to support debugging.
--visualize-network
dumps the finalized TRT Network as SVG files for visual analysis.
--dry-run
runs through all the steps except the Engine build and serialization, which are typically the most expensive operations.
When the two flags are used together, the TRT Network is dumped in about 10 seconds for llama2-70B.
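The interaction between the two flags can be sketched as follows. This is only an illustrative model, not the code in this PR: it assumes an argparse-style CLI, and the program name `build-sketch`, the helpers `parse_args` and `plan_steps`, and the step names are all hypothetical. Only the flag spellings are taken from the description above.

```python
import argparse


def parse_args(argv):
    """Parse the two debug flags (spelling taken from the PR description)."""
    parser = argparse.ArgumentParser(prog="build-sketch")  # hypothetical name
    parser.add_argument("--visualize-network", action="store_true",
                        help="dump the finalized network as SVG files")
    parser.add_argument("--dry-run", action="store_true",
                        help="skip the engine build and serialization")
    return parser.parse_args(argv)


def plan_steps(args):
    """Return the ordered steps that would run for the given flags."""
    # Step names are placeholders for illustration only.
    steps = ["load_checkpoint", "construct_network"]
    if args.visualize_network:
        # The dump happens before the engine build, so it works with --dry-run.
        steps.append("dump_network_svg")
    if not args.dry_run:
        # These are the expensive steps that --dry-run skips.
        steps += ["build_engine", "serialize_engine"]
    return steps


print(plan_steps(parse_args(["--visualize-network", "--dry-run"])))
# prints ['load_checkpoint', 'construct_network', 'dump_network_svg']
```

With both flags set, the sketch stops after the network dump, which matches the short turnaround described above, since the engine build and serialization are never reached.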
Testing
These changes have been manually tested for a few configurations of llama2.
Unit Tests
I can add more unit tests to this PR and fix existing unit tests if this basic design is acceptable.
Hi @Lokiiiiii ,
Sorry for the late response. Thanks for submitting the MR; we really appreciate your contributions to TensorRT-LLM. Could you please rebase the MR onto the latest main branch?
@QiJune Could you please review this again?
It LGTM now. We plan to integrate your contributions as part of our refinement work. When that work lands on GitHub, we will add you as a co-author and acknowledge your efforts.
@QiJune I noticed that this change did not land in the TRT-LLM 0.9.0 release tag. Can you provide an ETA?
Hi @Lokiiiiii, thanks a lot for your contribution and support! We've merged your changes into the internal codebase. They will be included in this week's update to the GitHub main branch and will land in the next stable release.
Closing this since we've merged the changes.