[examples/bert/build.py]: Load weights for BertModel and RobertaModel if `--model_dir` is provided
Currently, the example TensorRT-LLM engine builder for BERT models ignores model weights even when they are present in the model directory: it reads only the config.json file, which makes it essentially impossible to generate a working engine from a pretrained model. This change fixes that.
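The gist of the fix can be sketched as follows. This is a minimal, dependency-free illustration rather than the actual TensorRT-LLM builder code: the checkpoint file names follow Hugging Face `save_pretrained` conventions, and `build_inputs` is a hypothetical helper standing in for the builder's setup step.

```python
import json
import os

# Weight checkpoint file names conventionally produced by Hugging Face
# `save_pretrained`; a real builder may also need to handle sharded checkpoints.
WEIGHT_FILES = ("pytorch_model.bin", "model.safetensors")

def find_checkpoint(model_dir):
    """Return the path of a weight checkpoint in model_dir, or None."""
    for name in WEIGHT_FILES:
        path = os.path.join(model_dir, name)
        if os.path.isfile(path):
            return path
    return None

def read_config(model_dir):
    """Read the config.json that describes the model architecture."""
    with open(os.path.join(model_dir, "config.json")) as f:
        return json.load(f)

# Hypothetical build flow: the original builder effectively used only
# read_config(), leaving the engine with uninitialized weights; the fix
# additionally locates and loads the pretrained checkpoint (the actual
# tensor loading would go through torch.load or safetensors, omitted
# here to keep the sketch self-contained).
def build_inputs(model_dir):
    config = read_config(model_dir)
    ckpt = find_checkpoint(model_dir)  # None => engine gets random weights
    return config, ckpt
```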
Created a bug: #2197
@symphonylyh Could you please take a look at this PR? Thanks~
Hi @tkhanipov , thanks for the PR! We're currently doing a refinement of the BERT workflow, will address this problem and merge your PR 👍
Hi @symphonylyh! Thank you for the response. Just curious: isn't the refinement of the BERT workflow you mention related to supporting BERT in the Executor API? So far, AFAIU (please correct me if I am wrong), encoder-only models are not supported there.
You did quite an extensive refinement and my PR is now obsolete, thus closing it. Thank you!
@tkhanipov also want to point out that we now have BERT supported in the so-called PyTorch workflow (which doesn't involve building a TRT engine but still benefits from the optimized kernels, with a more Pythonic interface). You can try it out: https://github.com/NVIDIA/TensorRT-LLM/tree/main/examples/pytorch#supported-models