Paddle3D
SMOKE Python deployment issue (TensorRT-accelerated inference)
Environment: OS: Ubuntu 18.04; Paddle: 2.4.0; Python 3.6; CUDA 10.2; cuDNN 7.6.5; TensorRT 7.0.0.11.
In the TRT deployment step of the Python deployment, I load the model's dynamic shape information from the file given by --dynamic_shape_file and run prediction at FP32 precision. After adding the flags --use_gpu --use_trt, I get the following errors (excerpt):
I1212 15:26:12.890170 18103 tensorrt_subgraph_pass.cc:244] --- detect a sub-graph with 4 nodes
I1212 15:26:12.895742 18103 tensorrt_subgraph_pass.cc:560] Prepare TRT engine (Optimize model structure, Select OP kernel etc). This process may cost a lot of time.
I1212 15:26:12.901537 18103 engine.cc:199] Run Paddle-TRT Dynamic Shape mode.
E1212 15:26:12.901681 18103 helper.h:114] Parameter check failed at: optimizationProfile.cpp::setDimensions::119, condition: std::all_of(dims.d, dims.d + dims.nbDims, [](int x) { return x > 0; })
The part of the error marked in bold is repeated for over a hundred lines. The model is the pretrained model downloaded from the README. The last few lines of the error output are:
I1212 15:26:24.284873 18103 engine.cc:684] Inspector needs TensorRT version 8.2 and after.
I1212 15:26:24.286538 18103 tensorrt_subgraph_pass.cc:244] --- detect a sub-graph with 64 nodes
I1212 15:26:24.306820 18103 tensorrt_subgraph_pass.cc:560] Prepare TRT engine (Optimize model structure, Select OP kernel etc). This process may cost a lot of time.
I1212 15:26:24.312911 18103 op_converter.h:160] There is no OpConverter for type gather_nd, now use generic_plugin_creater!
W1212 15:26:24.313999 18103 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 7.5, Driver API Version: 11.8, Runtime API Version: 10.2
W1212 15:26:24.358321 18103 gpu_resources.cc:91] device: 0, cuDNN Version: 7.6.
I1212 15:26:24.358857 18103 op_converter.h:160] There is no OpConverter for type gather_nd, now use generic_plugin_creater!
I1212 15:26:24.359781 18103 op_converter.h:160] There is no OpConverter for type gather_nd, now use generic_plugin_creater!
I1212 15:26:24.361145 18103 op_converter.h:160] There is no OpConverter for type gather_nd, now use generic_plugin_creater!
E1212 15:26:24.362473 18103 helper.h:114] (Unnamed Layer* 114) [Shuffle]: input and output volume mismatch. input volume is 50 and output volume is 500
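For anyone reading the logs: the two E-level messages state two simple invariants. The first says every dimension passed to the optimization profile must be greater than 0, and the second says a Shuffle (reshape) layer must keep the element count unchanged. The sketch below only restates those two conditions in plain Python. The `profiles` dictionary, its tensor names, and its shapes are made up for illustration and are not read from the actual dynamic shape file, whose format is not shown here.

```python
from functools import reduce
from operator import mul
from typing import Dict, List, Sequence


def volume(shape: Sequence[int]) -> int:
    """Number of elements in a tensor of the given shape."""
    return reduce(mul, shape, 1)


def non_positive_profiles(profiles: Dict[str, Sequence[Sequence[int]]]) -> List[str]:
    """Names of tensors whose min/max/opt shapes contain a dimension <= 0.

    Mirrors the condition in the log: all_of(dims, x > 0).
    """
    return [
        name
        for name, shapes in profiles.items()
        if any(dim <= 0 for shape in shapes for dim in shape)
    ]


def reshape_is_valid(in_shape: Sequence[int], out_shape: Sequence[int]) -> bool:
    """A reshape is only valid when input and output volumes match."""
    return volume(in_shape) == volume(out_shape)


# Hypothetical (min, max, opt) shapes; names and values are illustrative only.
profiles = {
    "images": [(1, 3, 384, 1280), (1, 3, 384, 1280), (1, 3, 384, 1280)],
    "tmp_0": [(0, 2), (50, 2), (50, 2)],
}

print(non_positive_profiles(profiles))  # ['tmp_0']
print(reshape_is_valid((50,), (500,)))  # False, as in the Shuffle error
```

If the collected shape information contains a zero-sized dimension for some intermediate tensor, a check like this would flag it before the engine is built. Whether that is the actual cause here is an assumption, not something the logs confirm.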
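I have not modified the source code. Also, none of the TRT deployment commands in the README include --use_gpu, but in my testing it is required: without it GPU memory usage does not change and inference runs on the CPU. In addition, in the SMOKE Python TRT deployment instructions, the flag --collect_shape_info should be --collect_dynamic_shape_info. I hope these can be fixed.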
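As a rough illustration of the flag behavior described above, here is a minimal argparse sketch. It is a hypothetical stand-in, not Paddle3D's actual deployment script: only the four flag names come from this report, while `build_parser`, `parse_flags`, and the error message are assumptions. It shows one way a script could reject --use_trt without --use_gpu instead of silently running on the CPU.

```python
import argparse
from typing import List


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser using the flag names mentioned in this report."""
    parser = argparse.ArgumentParser(description="TRT deployment flags (sketch)")
    parser.add_argument("--use_gpu", action="store_true")
    parser.add_argument("--use_trt", action="store_true")
    parser.add_argument("--collect_dynamic_shape_info", action="store_true")
    parser.add_argument("--dynamic_shape_file", type=str, default="")
    return parser


def parse_flags(argv: List[str]) -> argparse.Namespace:
    """Parse flags and fail loudly when TRT is requested without the GPU."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.use_trt and not args.use_gpu:
        # parser.error prints a usage message and exits with status 2.
        parser.error("--use_trt requires --use_gpu")
    return args


args = parse_flags(
    ["--use_gpu", "--use_trt", "--dynamic_shape_file", "shape_info.txt"]
)
print(args.use_gpu, args.use_trt, args.dynamic_shape_file)
```

With a parser like this, passing the misspelled --collect_shape_info would also be rejected as an unrecognized argument rather than ignored.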