FasterTransformer

Converting t5-3b and larger models to TensorRT

chessgecko opened this issue on Jun 16 '22 • 4 comments

Description

Hello,

Everything works as expected for t5-large, but t5-3b OOMs while building the engine (using the code in the README).

The network itself only consumes 6448 MiB of the 15109 MiB available.

I tried setting the workspace size really low and freeing the memory from the original plugins, but it didn't seem to help. Any advice would be appreciated.
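
For reference, a minimal sketch of how the builder workspace can be capped through the TensorRT Python API (assuming TensorRT 8.x, as shipped in the 22.05-py3 image; the network-building step is a placeholder, not the actual testT5Plugin.py code):

import tensorrt as trt

# Minimal sketch: cap the builder workspace and enable fp16 before building.
logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
# ... the T5 encoder/decoding plugin layers would be added to `network` here ...
config = builder.create_builder_config()
config.max_workspace_size = 1 << 30       # 1 GiB scratch space (TensorRT 8.x API)
config.set_flag(trt.BuilderFlag.FP16)     # match --data_type fp16
serialized_engine = builder.build_serialized_network(network, config)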

Reproduced Steps

docker image: 22.05-py3

python ../examples/tensorrt/t5/extractT5ModelToBIN.py # get the T5 model weights for the test (needs Internet access)

CUDA_VISIBLE_DEVICES=0 python ../examples/tensorrt/t5/testT5Plugin.py \
        --batch_size 1 \
        --beam_width 4 \
        --max_seq_len 16 \
        --data_type fp16 \
        --sampling_topk 1 \
        --model t5-3b

chessgecko avatar Jun 16 '22 00:06 chessgecko

I see you set data_type to fp32, which requires about 12 GB just to store the model weights. In that case, batch size 32 + beam width 4 + sequence length 128 may be too large.
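
For context, a back-of-the-envelope check on those numbers, taking the parameter count as roughly 3 billion:

# Rough weight-storage estimate for a ~3B-parameter model
# (ignores activations, KV cache, and the TensorRT workspace).
params = 3e9
print(f"fp32: {params * 4 / 2**30:.1f} GiB")  # ~11.2 GiB, matching the ~12 GB figure above
print(f"fp16: {params * 2 / 2**30:.1f} GiB")  # ~5.6 GiB, in line with the 6448 MiB observed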

byshiue avatar Jun 16 '22 00:06 byshiue

Sorry, that was pasted from the README; I did use fp16. I've updated the issue.

chessgecko avatar Jun 16 '22 00:06 chessgecko

This is caused by loading the weights in the plugin constructor. Because TensorRT clones the plugin multiple times while building the engine, the weights are currently loaded multiple times.

We will fix this bug in the next release.
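
To illustrate the pattern (the real plugins are C++ IPluginV2 classes; the Python classes and names below are a hypothetical stand-in, not FasterTransformer code): if clone() goes back through the weight-loading constructor, every clone TensorRT creates during the build holds its own copy of the weights, whereas clones that share a single weight buffer do not.

import numpy as np

class LeakyPlugin:
    """Simplified stand-in: the constructor loads the weights, so every
    clone() made during engine build allocates another full copy."""
    def __init__(self, weight_path):
        self.weight_path = weight_path
        self.weights = np.fromfile(weight_path, dtype=np.float16)

    def clone(self):
        return LeakyPlugin(self.weight_path)   # re-loads the weights each time

class SharedWeightPlugin:
    """Alternative pattern: clones reuse one weight buffer instead of re-loading it."""
    def __init__(self, weight_path, weights=None):
        self.weight_path = weight_path
        self.weights = weights if weights is not None else np.fromfile(
            weight_path, dtype=np.float16)

    def clone(self):
        return SharedWeightPlugin(self.weight_path, weights=self.weights)

Presumably the fix amounts to something like the second pattern: loading the weights once and letting the clones share them, rather than re-reading them in each constructor.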

byshiue avatar Jun 16 '22 03:06 byshiue

Appreciate the update.

Also can't thank you enough for this repo, it's already added so much value for us.

chessgecko avatar Jun 16 '22 18:06 chessgecko

@chessgecko The issue is fixed in the latest release. Thank you for the feedback.

byshiue avatar Aug 16 '22 03:08 byshiue

Closing this issue because it is inactive. Feel free to re-open it if you still have any problem.

byshiue avatar Sep 08 '22 07:09 byshiue