Converted model saved_model.pb is too large
We tested TF-TRT conversion following the blog post "Leveraging TensorFlow-TensorRT integration for Low latency Inference" and got a very large saved model.
Environment
- TensorFlow: 2.4.1
- TensorRT: 6.0.1
- CUDA: 10.1
- cuDNN: 7.6
Size before and after conversion
We converted a model fine-tuned with BERT:
```python
from tensorflow.python.compiler.tensorrt import trt_convert as trt

# Convert the SavedModel with TF-TRT at FP32 precision
conversion_params = trt.TrtConversionParams(precision_mode=trt.TrtPrecisionMode.FP32)
input_saved_model_dir = 'xxx'
output_saved_model_dir = 'xxx'
converter = trt.TrtGraphConverterV2(input_saved_model_dir=input_saved_model_dir,
                                    conversion_params=conversion_params)
converter.convert()
converter.save(output_saved_model_dir)
```
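As a side note, `TrtGraphConverterV2.build()` can be called between `convert()` and `save()` to pre-build the TensorRT engines; it takes a generator yielding representative input batches. A minimal sketch of such an `input_fn` — the input names, shapes, and order below are assumptions for a BERT-style model, not taken from the original signature:

```python
import numpy as np

# Assumed batch size and sequence length for a BERT-style model;
# adjust to match your SavedModel's serving signature.
BATCH, SEQ_LEN = 1, 128

def input_fn():
    # Yield one or more representative batches; TF-TRT builds an
    # engine for each distinct input shape it sees.
    for _ in range(1):
        input_ids = np.zeros((BATCH, SEQ_LEN), dtype=np.int32)
        attention_mask = np.ones((BATCH, SEQ_LEN), dtype=np.int32)
        token_type_ids = np.zeros((BATCH, SEQ_LEN), dtype=np.int32)
        yield input_ids, attention_mask, token_type_ids

# Usage (with `converter` from the snippet above):
#   converter.convert()
#   converter.build(input_fn=input_fn)
#   converter.save(output_saved_model_dir)
```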
```
# before conversion
4.0K ./bert_finetune_20210303/assets
9.5M ./bert_finetune_20210303/saved_model.pb
387M ./bert_finetune_20210303/variables
397M ./bert_finetune_20210303

# after FP16 conversion
4.0K ./bert_finetune_20210303_fp16/assets
1.1G ./bert_finetune_20210303_fp16/saved_model.pb
387M ./bert_finetune_20210303_fp16/variables
1.5G ./bert_finetune_20210303_fp16

# after FP32 conversion
4.0K ./bert_finetune_20210303_fp32/assets
1.1G ./bert_finetune_20210303_fp32/saved_model.pb
387M ./bert_finetune_20210303_fp32/variables
1.5G ./bert_finetune_20210303_fp32
```
Could anyone help me? Thanks a lot.