FastDeploy OCR: deploying the GPU server with FastDeploy fails with "failed to load all models"

Environment

[FastDeploy version]: docker image: registry.baidubce.com/paddlepaddle/fastdeploy:1.0.4-gpu-cuda11.4-trt8.5-21.10
[System platform]: Linux x64 (Ubuntu 18.04)
[Hardware]: Nvidia GPU, CUDA 12.0
Reference: https://github.com/PaddlePaddle/FastDeploy/tree/develop/examples/vision/ocr/PP-OCR/serving/fastdeploy_serving

Problem log and steps to reproduce

Starting the service with fastdeployserver --model-repository=/ocr_serving/models fails. The log is as follows:

Error: Failed to initialize NVML
W0406 06:06:25.695839 97 metrics.cc:221] DCGM unable to start: DCGM initialization error
W0406 06:06:25.696071 97 pinned_memory_manager.cc:236] Unable to allocate pinned system memory, pinned memory pool will not be available: CUDA driver version is insufficient for CUDA runtime version
I0406 06:06:25.696095 97 cuda_memory_manager.cc:115] CUDA memory pool disabled
E0406 06:06:25.697896 97 model_repository_manager.cc:1890] Poll failed for model directory 'cls_runtime': instance group cls_runtime_0 of model cls_runtime specifies invalid or unsupported gpu id 0. GPUs with at least the minimum required CUDA compute compatibility of 6.000000 are:
E0406 06:06:25.699168 97 model_repository_manager.cc:1890] Poll failed for model directory 'det_runtime': instance group det_runtime_0 of model det_runtime specifies invalid or unsupported gpu id 0. GPUs with at least the minimum required CUDA compute compatibility of 6.000000 are:
E0406 06:06:25.700564 97 model_repository_manager.cc:1890] Poll failed for model directory 'rec_runtime': instance group rec_runtime_0 of model rec_runtime specifies invalid or unsupported gpu id 0. GPUs with at least the minimum required CUDA compute compatibility of 6.000000 are:
E0406 06:06:25.700604 97 model_repository_manager.cc:1375] Invalid argument: ensemble rec_pp contains models that are not available: rec_runtime
E0406 06:06:25.700613 97 model_repository_manager.cc:1375] Invalid argument: ensemble pp_ocr contains models that are not available: det_runtime
E0406 06:06:25.700619 97 model_repository_manager.cc:1375] Invalid argument: ensemble cls_pp contains models that are not available: cls_runtime
I0406 06:06:25.700740 97 model_repository_manager.cc:1022] loading: cls_postprocess:1
I0406 06:06:25.802928 97 model_repository_manager.cc:1022] loading: det_postprocess:1
I0406 06:06:25.812457 97 python.cc:1875] TRITONBACKEND_ModelInstanceInitialize: cls_postprocess_0 (CPU device 0)
I0406 06:06:25.903565 97 model_repository_manager.cc:1022] loading: det_preprocess:1
I0406 06:06:26.004031 97 model_repository_manager.cc:1022] loading: rec_postprocess:1
model_config: {'name': 'cls_postprocess', 'platform': '', 'backend': 'python', 'version_policy': {'latest': {'num_versions': 1}}, 'max_batch_size': 128, 'input': [{'name': 'POST_INPUT_0', 'data_type': 'TYPE_FP32', 'format': 'FORMAT_NONE', 'dims': [2], 'is_shape_tensor': False, 'allow_ragged_batch': False}], 'output': [{'name': 'POST_OUTPUT_0', 'data_type': 'TYPE_INT32', 'dims': [1], 'label_filename': '', 'is_shape_tensor': False}, {'name': 'POST_OUTPUT_1', 'data_type': 'TYPE_FP32', 'dims': [1], 'label_filename': '', 'is_shape_tensor': False}], 'batch_input': [], 'batch_output': [], 'optimization': {'priority': 'PRIORITY_DEFAULT', 'input_pinned_memory': {'enable': True}, 'output_pinned_memory': {'enable': True}, 'gather_kernel_buffer_threshold': 0, 'eager_batching': False}, 'instance_group': [{'name': 'cls_postprocess_0', 'kind': 'KIND_CPU', 'count': 1, 'gpus': [], 'secondary_devices': [], 'profile': [], 'passive': False, 'host_policy': ''}], 'default_model_filename': '', 'cc_model_filenames': {}, 'metric_tags': {}, 'parameters': {}, 'model_warmup': []}
postprocess input names: ['POST_INPUT_0']
postprocess output names: ['POST_OUTPUT_0', 'POST_OUTPUT_1']
I0406 06:06:26.082822 97 model_repository_manager.cc:1183] successfully loaded 'cls_postprocess' version 1
I0406 06:06:26.083348 97 python.cc:1875] TRITONBACKEND_ModelInstanceInitialize: det_postprocess_0 (CPU device 0)
model_config: {'name': 'det_postprocess', 'platform': '', 'backend': 'python', 'version_policy': {'latest': {'num_versions': 1}}, 'max_batch_size': 128, 'input': [{'name': 'POST_INPUT_0', 'data_type': 'TYPE_FP32', 'format': 'FORMAT_NONE', 'dims': [1, -1, -1], 'is_shape_tensor': False, 'allow_ragged_batch': False}, {'name': 'POST_INPUT_1', 'data_type': 'TYPE_INT32', 'format': 'FORMAT_NONE', 'dims': [4], 'is_shape_tensor': False, 'allow_ragged_batch': False}, {'name': 'ORI_IMG', 'data_type': 'TYPE_UINT8', 'format': 'FORMAT_NONE', 'dims': [-1, -1, 3], 'is_shape_tensor': False, 'allow_ragged_batch': False}], 'output': [{'name': 'POST_OUTPUT_0', 'data_type': 'TYPE_STRING', 'dims': [-1, 1], 'label_filename': '', 'is_shape_tensor': False}, {'name': 'POST_OUTPUT_1', 'data_type': 'TYPE_FP32', 'dims': [-1, 1], 'label_filename': '', 'is_shape_tensor': False}, {'name': 'POST_OUTPUT_2', 'data_type': 'TYPE_FP32', 'dims': [-1, -1, 1], 'label_filename': '', 'is_shape_tensor': False}], 'batch_input': [], 'batch_output': [], 'optimization': {'priority': 'PRIORITY_DEFAULT', 'input_pinned_memory': {'enable': True}, 'output_pinned_memory': {'enable': True}, 'gather_kernel_buffer_threshold': 0, 'eager_batching': False}, 'instance_group': [{'name': 'det_postprocess_0', 'kind': 'KIND_CPU', 'count': 1, 'gpus': [], 'secondary_devices': [], 'profile': [], 'passive': False, 'host_policy': ''}], 'default_model_filename': '', 'cc_model_filenames': {}, 'metric_tags': {}, 'parameters': {}, 'model_warmup': []}
postprocess input names: ['POST_INPUT_0', 'POST_INPUT_1', 'ORI_IMG']
postprocess output names: ['POST_OUTPUT_0', 'POST_OUTPUT_1', 'POST_OUTPUT_2']
I0406 06:06:26.329773 97 python.cc:1875] TRITONBACKEND_ModelInstanceInitialize: det_preprocess_0 (CPU device 0)
I0406 06:06:26.329891 97 model_repository_manager.cc:1183] successfully loaded 'det_postprocess' version 1
model_config: {'name': 'det_preprocess', 'platform': '', 'backend': 'python', 'version_policy': {'latest': {'num_versions': 1}}, 'max_batch_size': 1, 'input': [{'name': 'INPUT_0', 'data_type': 'TYPE_UINT8', 'format': 'FORMAT_NONE', 'dims': [-1, -1, 3], 'is_shape_tensor': False, 'allow_ragged_batch': False}], 'output': [{'name': 'OUTPUT_0', 'data_type': 'TYPE_FP32', 'dims': [3, -1, -1], 'label_filename': '', 'is_shape_tensor': False}, {'name': 'OUTPUT_1', 'data_type': 'TYPE_INT32', 'dims': [4], 'label_filename': '', 'is_shape_tensor': False}], 'batch_input': [], 'batch_output': [], 'optimization': {'priority': 'PRIORITY_DEFAULT', 'input_pinned_memory': {'enable': True}, 'output_pinned_memory': {'enable': True}, 'gather_kernel_buffer_threshold': 0, 'eager_batching': False}, 'instance_group': [{'name': 'det_preprocess_0', 'kind': 'KIND_CPU', 'count': 1, 'gpus': [], 'secondary_devices': [], 'profile': [], 'passive': False, 'host_policy': ''}], 'default_model_filename': '', 'cc_model_filenames': {}, 'metric_tags': {}, 'parameters': {}, 'model_warmup': []}
preprocess input names: ['INPUT_0']
preprocess output names: ['OUTPUT_0', 'OUTPUT_1']
I0406 06:06:26.601789 97 model_repository_manager.cc:1183] successfully loaded 'det_preprocess' version 1
I0406 06:06:26.602037 97 python.cc:1875] TRITONBACKEND_ModelInstanceInitialize: rec_postprocess_0 (CPU device 0)
model_config: {'name': 'rec_postprocess', 'platform': '', 'backend': 'python', 'version_policy': {'latest': {'num_versions': 1}}, 'max_batch_size': 128, 'input': [{'name': 'POST_INPUT_0', 'data_type': 'TYPE_FP32', 'format': 'FORMAT_NONE', 'dims': [-1, 6625], 'is_shape_tensor': False, 'allow_ragged_batch': False}], 'output': [{'name': 'POST_OUTPUT_0', 'data_type': 'TYPE_STRING', 'dims': [1], 'label_filename': '', 'is_shape_tensor': False}, {'name': 'POST_OUTPUT_1', 'data_type': 'TYPE_FP32', 'dims': [1], 'label_filename': '', 'is_shape_tensor': False}], 'batch_input': [], 'batch_output': [], 'optimization': {'priority': 'PRIORITY_DEFAULT', 'input_pinned_memory': {'enable': True}, 'output_pinned_memory': {'enable': True}, 'gather_kernel_buffer_threshold': 0, 'eager_batching': False}, 'instance_group': [{'name': 'rec_postprocess_0', 'kind': 'KIND_CPU', 'count': 1, 'gpus': [], 'secondary_devices': [], 'profile': [], 'passive': False, 'host_policy': ''}], 'default_model_filename': '', 'cc_model_filenames': {}, 'metric_tags': {}, 'parameters': {}, 'model_warmup': []}
postprocess input names: ['POST_INPUT_0']
postprocess output names: ['POST_OUTPUT_0', 'POST_OUTPUT_1']
I0406 06:06:26.882847 97 model_repository_manager.cc:1183] successfully loaded 'rec_postprocess' version 1
I0406 06:06:26.882963 97 server.cc:522]
+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+

I0406 06:06:26.883003 97 server.cc:549]
+---------+-------------------------------------------------------+--------+
| Backend | Path                                                  | Config |
+---------+-------------------------------------------------------+--------+
| python  | /opt/tritonserver/backends/python/libtriton_python.so | {}     |
+---------+-------------------------------------------------------+--------+

I0406 06:06:26.883193 97 server.cc:252] Waiting for in-flight requests to complete.
I0406 06:06:26.883202 97 model_repository_manager.cc:1055] unloading: rec_postprocess:1
I0406 06:06:26.883252 97 model_repository_manager.cc:1055] unloading: det_preprocess:1
I0406 06:06:26.883298 97 model_repository_manager.cc:1055] unloading: det_postprocess:1
I0406 06:06:26.883367 97 model_repository_manager.cc:1055] unloading: cls_postprocess:1
I0406 06:06:26.883393 97 server.cc:267] Timeout 30: Found 4 live models and 0 in-flight non-inference requests
I0406 06:06:27.883475 97 server.cc:267] Timeout 29: Found 4 live models and 0 in-flight non-inference requests
Cleaning up...
Cleaning up...
Cleaning up...
Cleaning up...
I0406 06:06:27.944011 97 model_repository_manager.cc:1166] successfully unloaded 'det_postprocess' version 1
I0406 06:06:27.945687 97 model_repository_manager.cc:1166] successfully unloaded 'cls_postprocess' version 1
I0406 06:06:27.945913 97 model_repository_manager.cc:1166] successfully unloaded 'rec_postprocess' version 1
I0406 06:06:27.946568 97 model_repository_manager.cc:1166] successfully unloaded 'det_preprocess' version 1
I0406 06:06:28.883892 97 server.cc:267] Timeout 28: Found 0 live models and 0 in-flight non-inference requests
error: creating server: Internal - failed to load all models
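
Because the log is long, it can help to pull out only the warning and error entries. The following is a minimal sketch, not part of FastDeploy or Triton: it assumes the entries use the prefix format visible in this log (a level letter I/W/E/F, a four-digit date, a time, a process id, and source-file:line followed by "]"). The names split_entries, problems, and SAMPLE are hypothetical helpers introduced for illustration, and SAMPLE is a three-entry excerpt copied from the log above. The script only filters entries; it does not diagnose the cause.

```python
import re

# Prefix format as it appears in the log above, for example:
# "E0406 06:06:25.697896 97 model_repository_manager.cc:1890] "
PREFIX = re.compile(
    r"(?P<level>[IWEF])(?P<date>\d{4}) (?P<time>\d{2}:\d{2}:\d{2}\.\d{6}) "
    r"(?P<pid>\d+) (?P<src>[\w.]+):(?P<line>\d+)\] "
)

# Three entries copied from the log in this report, joined on one line
# the same way the pasted log is.
SAMPLE = (
    "W0406 06:06:25.696071 97 pinned_memory_manager.cc:236] Unable to allocate "
    "pinned system memory, pinned memory pool will not be available: CUDA driver "
    "version is insufficient for CUDA runtime version "
    "I0406 06:06:25.696095 97 cuda_memory_manager.cc:115] CUDA memory pool disabled "
    "E0406 06:06:25.700604 97 model_repository_manager.cc:1375] Invalid argument: "
    "ensemble rec_pp contains models that are not available: rec_runtime"
)


def split_entries(log_text):
    """Yield (level, source, message) for each prefixed entry in log_text.

    Works whether entries are one per line or run together on one line,
    because each message is taken to end where the next prefix starts.
    """
    matches = list(PREFIX.finditer(log_text))
    for i, m in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(log_text)
        # Collapse internal whitespace and newlines in the message text.
        message = " ".join(log_text[m.end():end].split())
        yield m.group("level"), f"{m.group('src')}:{m.group('line')}", message


def problems(log_text):
    """Return only warning (W), error (E), and fatal (F) entries."""
    return [entry for entry in split_entries(log_text) if entry[0] in "WEF"]


if __name__ == "__main__":
    for level, source, message in problems(SAMPLE):
        print(level, source, message)
```

Lines without a prefix (such as "Error: Failed to initialize NVML" or the model_config dumps) are either skipped or folded into the preceding entry's message, so they should still be checked by hand.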