paddlepaddle_backend
Paddle TensorRT configuration error
Without TensorRT inference, the configuration file below works and inference runs normally.
name: "test"
backend: "paddle"
input [
{
name: "input"
data_type: TYPE_FP32
dims: [ 3, 896, 896 ]
}
]
output [
{
name: "conv2d_59.tmp_1"
data_type: TYPE_FP32
dims: [ 3, 896, 896 ]
}
]
instance_group [
{
count: 1
kind: KIND_GPU
}
]
dynamic_batching {
preferred_batch_size: [ 2, 4 ]
max_queue_delay_microseconds: 0
}
With TensorRT inference configured, the server fails to start. The configuration file and the error are as follows:
name: "test"
backend: "paddle"
input [
{
name: "input"
data_type: TYPE_FP32
dims: [ 3, 896, 896 ]
}
]
output [
{
name: "conv2d_59.tmp_1"
data_type: TYPE_FP32
dims: [ 3, 896, 896 ]
}
]
instance_group [
{
count: 1
kind: KIND_GPU
}
]
dynamic_batching {
preferred_batch_size: [ 2, 4 ]
max_queue_delay_microseconds: 0
}
optimization {
execution_accelerators {
gpu_execution_accelerator : [
{
name : "tensorrt"
parameters { key: "precision" value: "trt_fp16" }
parameters { key: "min_graph_size" value: "4" }
parameters { key: "workspace_size" value: "1073741824" }
parameters { key: "enable_tensorrt_oss" value: "0" }
parameters { key: "is_dynamic" value: "1" }
},
{
name : "min_shape"
parameters { key: "input" value: "1 3 896 896" }
},
{
name : "max_shape"
parameters { key: "input" value: "2 3 896 896" }
},
{
name : "opt_shape"
parameters { key: "input" value: "1 3 896 896" }
}
]
}
}
Error message:
WARNING: Logging before InitGoogleLogging() is written to STDERR
I0306 08:51:06.968530 2126 analysis_config.cc:1336] In CollectShapeInfo mode, we will disable optimizations and collect the shape information of all intermediate tensors in the compute graph and calculate the min_shape, max_shape and opt_shape.
Segmentation fault (core dumped)
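As a side check on the dynamic-shape settings above, the following minimal Python sketch verifies that min_shape, opt_shape and max_shape are mutually consistent and that every preferred_batch_size fits inside the declared batch range. It is not part of the backend, and nothing here establishes that a shape mismatch causes the segmentation fault. The helper names (parse_shape, check_dynamic_shapes) are hypothetical, and the sketch assumes the first value of each shape string is the batch dimension.

```python
from typing import Dict, List


def parse_shape(value: str) -> List[int]:
    """Turn a space-separated shape string such as "1 3 896 896" into ints."""
    return [int(token) for token in value.split()]


def check_dynamic_shapes(
    min_shape: Dict[str, str],
    opt_shape: Dict[str, str],
    max_shape: Dict[str, str],
    preferred_batch_sizes: List[int],
) -> List[str]:
    """Return human-readable problems; an empty list means none were found.

    Assumption: all three dicts use the same input names, and axis 0 of
    every shape is the batch dimension.
    """
    problems = []
    for name in min_shape:
        lo = parse_shape(min_shape[name])
        opt = parse_shape(opt_shape[name])
        hi = parse_shape(max_shape[name])
        if not (len(lo) == len(opt) == len(hi)):
            problems.append(f"{name}: min/opt/max shapes have different ranks")
            continue
        # Each axis must satisfy min <= opt <= max.
        for axis, (a, b, c) in enumerate(zip(lo, opt, hi)):
            if not (a <= b <= c):
                problems.append(
                    f"{name}: axis {axis} violates min <= opt <= max ({a}, {b}, {c})"
                )
        # Batches formed by the dynamic batcher should fit the batch range.
        for batch in preferred_batch_sizes:
            if not (lo[0] <= batch <= hi[0]):
                problems.append(
                    f"{name}: preferred batch size {batch} is outside "
                    f"the batch range [{lo[0]}, {hi[0]}]"
                )
    return problems


# Values copied from the configuration in this post.
problems = check_dynamic_shapes(
    min_shape={"input": "1 3 896 896"},
    opt_shape={"input": "1 3 896 896"},
    max_shape={"input": "2 3 896 896"},
    preferred_batch_sizes=[2, 4],
)
for problem in problems:
    print(problem)
```

With the values from this post, the sketch flags only that preferred batch size 4 exceeds the batch dimension of max_shape (2). That may be worth aligning, but it is a separate question from the crash during shape collection.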
Testing the ERNIE model from the examples produces the same error:
WARNING: Logging before InitGoogleLogging() is written to STDERR
I0306 08:51:06.968530 2126 analysis_config.cc:1336] In CollectShapeInfo mode, we will disable optimizations and collect the shape information of all intermediate tensors in the compute graph and calculate the min_shape, max_shape and opt_shape.
Segmentation fault (core dumped)
What is the cause of this? @ZeyuChen @jeng1220
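@qiu-pinggaizi Which Paddle version are you using? Are you using the image from the documentation, or one you built yourself?
@heliqi I built it myself. The Triton image is nvcr.io/nvidia/tritonserver:22.07-py3, and the Paddle version is release/2.4.
Do newer versions not support TRT acceleration? @heliqi
They do. The interface has probably changed in the newer version. Could you try 2.3?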
ok