
[examples/bert/build.py]: Load weights for BertModel and RobertaModel if `--model_dir` is provided

tkhanipov opened this pull request 1 year ago

Currently the example TensorRT-LLM engine builder for BERT models simply ignores any model weights present in the model directory; it reads only the config.json file, which makes it essentially impossible to generate a working engine from a pretrained model. This change fixes that by loading the weights for BertModel and RobertaModel when `--model_dir` is provided.
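
A minimal sketch of the intended behavior (only the `transformers` calls are real; the weight-copy helper below is a hypothetical placeholder for build.py's own name-mapping logic, not an existing TensorRT-LLM API):

```python
# Sketch of the proposed fix for examples/bert/build.py: when --model_dir
# points at a pretrained HF checkpoint, load the weights instead of only
# reading config.json.
import argparse

import transformers


def copy_weight_to_trtllm(name: str, tensor) -> None:
    # Hypothetical stand-in for the name mapping that would assign the tensor
    # to the matching tensorrt_llm.models.BertModel parameter.
    print(f"would set {name}: {tuple(tensor.shape)}")


parser = argparse.ArgumentParser()
parser.add_argument("--model_dir", default=None,
                    help="Directory with a pretrained BERT/RoBERTa checkpoint")
args = parser.parse_args()

if args.model_dir is None:
    # Previous behavior: architecture only, weights stay randomly initialized.
    config = transformers.AutoConfig.from_pretrained("bert-base-uncased")
else:
    # Proposed behavior: read config.json *and* the checkpoint weights.
    hf_model = transformers.AutoModel.from_pretrained(args.model_dir)
    config = hf_model.config
    for name, param in hf_model.named_parameters():
        copy_weight_to_trtllm(name, param.detach().cpu())
```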

tkhanipov avatar Sep 03 '24 12:09 tkhanipov

Created a bug: #2197

tkhanipov avatar Sep 05 '24 14:09 tkhanipov

@symphonylyh Could you please take a look at this PR? Thanks~

lfr-0531 avatar Sep 08 '24 10:09 lfr-0531

Hi @tkhanipov , thanks for the PR! We're currently doing a refinement of the BERT workflow, will address this problem and merge your PR 👍

symphonylyh avatar Sep 27 '24 06:09 symphonylyh

Hi @symphonylyh! Thank you for the response. Just curious: isn't the refinement of the BERT workflow you are talking about related to supporting BERT in the Executor API? So far, AFAIU (please correct me if I am wrong), encoder-only models are not supported there.

tkhanipov avatar Oct 08 '24 09:10 tkhanipov

> Hi @tkhanipov , thanks for the PR! We're currently doing a refinement of the BERT workflow, will address this problem and merge your PR 👍

You did quite an extensive refinement, and my PR is now obsolete, so I am closing it. Thank you!

tkhanipov avatar Apr 13 '25 10:04 tkhanipov

@tkhanipov I also want to point out that we now have BERT supported in the so-called PyTorch workflow (which doesn't involve building a TRT engine but still benefits from the optimized kernels, and offers a more Pythonic interface). You can try it out: https://github.com/NVIDIA/TensorRT-LLM/tree/main/examples/pytorch#supported-models

symphonylyh avatar Apr 13 '25 19:04 symphonylyh