Serving
When deploying with Paddle Serving, starting the HTTP client reports an error.
In my PP-OCR setup, the detection model was trained by myself and the recognition model is the official one; I combined the two for service deployment. The server side starts normally, but when the client Pipeline_http_client is started it fails. The log shows:
ERROR 2022-04-24 09:01:22,719 [error_catch.py:125]
Log_id: 0
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/paddle_serving_server/pipeline/error_catch.py", line 97, in wrapper
res = func(*args, **kw)
File "/usr/local/lib/python3.6/site-packages/paddle_serving_server/pipeline/operator.py", line 1156, in postprocess_help
midped_data, data_id, logid_dict.get(data_id))
File "web_service.py", line 95, in postprocess
det_out = fetch_dict["save_infer_model/scale_0.tmp_1"]
KeyError: 'save_infer_model/scale_0.tmp_1'
Classname: Op._run_postprocess.
The serving_server_conf.prototxt of my detection model is:

$ cat serving_server_conf.prototxt
feed_var {
  name: "x"
  alias_name: "x"
  is_lod_tensor: false
  feed_type: 1
  shape: 3
}
fetch_var {
  name: "sigmoid_0.tmp_0"
  alias_name: "sigmoid_0.tmp_0"
  is_lod_tensor: false
  fetch_type: 1
  shape: 1
}
The serving_server_conf.prototxt of the recognition model is:

$ cat serving_server_conf.prototxt
feed_var {
  name: "x"
  alias_name: "x"
  is_lod_tensor: false
  feed_type: 1
  shape: 3
  shape: 32
  shape: 100
}
fetch_var {
  name: "save_infer_model/scale_0.tmp_1"
  alias_name: "save_infer_model/scale_0.tmp_1"
  is_lod_tensor: false
  fetch_type: 1
  shape: 25
  shape: 6625
}
I deployed it with Docker. Could anyone tell me how to solve this problem?
#rpc port; rpc_port and http_port must not both be empty. When rpc_port is empty and http_port is not, rpc_port is automatically set to http_port + 1
rpc_port: 18090

#http port; rpc_port and http_port must not both be empty. When rpc_port is usable and http_port is empty, no http_port is generated automatically
http_port: 9999

#worker_num, maximum concurrency. When build_dag_each_worker=True, the framework creates worker_num processes, each building its own gRPC server and DAG
##When build_dag_each_worker=False, the framework sets max_workers=worker_num for the gRPC thread pool of the main thread
worker_num: 20

#build_dag_each_worker: False, the framework builds one DAG inside the process; True, the framework builds multiple independent DAGs per process
build_dag_each_worker: false

dag:
    #op resource type: True for the thread model, False for the process model
    is_thread_op: False

    #number of retries
    retry: 1

    #profiling: True generates Timeline performance data (with some impact on performance); False disables it
    use_profile: false
    tracer:
        interval_s: 10

op:
    det:
        #concurrency: thread concurrency when is_thread_op=True, otherwise process concurrency
        concurrency: 6

        #when the op has no server_endpoints, the local service config is read from local_service_conf
        local_service_conf:
            #client type: brpc, grpc or local_predictor; local_predictor does not start a Serving service and predicts in-process
            client_type: local_predictor

            #det model path
            model_config: ./myPipelineServingConvertModel/ppocrv2_det_serving

            #fetch result list; uses the alias_name of fetch_var in client_config
            fetch_list: ["sigmoid_0.tmp_0"]

            #device_type, 0=cpu, 1=gpu, 2=tensorRT, 3=arm cpu, 4=kunlun xpu
            device_type: 0

            #compute device IDs; "" or unset means CPU prediction; "0" or "0,1,2" means GPU prediction on the listed GPU cards
            devices: ""

            #use_mkldnn: True

            thread_num: 2

            ir_optim: True

            #minimum number of nodes a subgraph must contain to be optimized when TensorRT is enabled
            #min_subgraph_size: 13

    rec:
        #concurrency: thread concurrency when is_thread_op=True, otherwise process concurrency
        concurrency: 3

        #timeout in ms
        timeout: -1

        #number of Serving interaction retries; no retry by default
        retry: 1

        #when the op has no server_endpoints, the local service config is read from local_service_conf
        local_service_conf:
            #client type: brpc, grpc or local_predictor; local_predictor does not start a Serving service and predicts in-process
            client_type: local_predictor

            #rec model path
            model_config: ./myPipelineServingConvertModel/ppocrv2_rec_serving

            #fetch result list; uses the alias_name of fetch_var in client_config
            fetch_list: ["save_infer_model/scale_0.tmp_1"]

            #device_type, 0=cpu, 1=gpu, 2=tensorRT, 3=arm cpu, 4=kunlun xpu
            device_type: 0

            #compute device IDs; "" or unset means CPU prediction; "0" or "0,1,2" means GPU prediction on the listed GPU cards
            devices: ""

            #use_mkldnn: True

            thread_num: 2

            ir_optim: True

            #minimum number of nodes a subgraph must contain to be optimized when TensorRT is enabled
            #min_subgraph_size: 3
The above is my config.yml.
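For reference, here is a minimal sketch of how a pipeline service using this config could be called over HTTP. It assumes http_port 9999 from the config above and the service name "ocr" used by the official PaddleOCR pdserving web_service.py example; the image path is a placeholder to adapt to your setup.

# Sketch only: send one image to the pipeline HTTP endpoint and print the reply.
# Assumes the service is registered under the name "ocr" and listens on
# http_port 9999 as configured above; test.jpg is a placeholder image path.
import base64
import json
import requests

url = "http://127.0.0.1:9999/ocr/prediction"
with open("test.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

payload = {"key": ["image"], "value": [image_b64]}
resp = requests.post(url, data=json.dumps(payload))
print(resp.json())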
Hi, the error message means the fetch result cannot be found:
det_out = fetch_dict["save_infer_model/scale_0.tmp_1"]
KeyError: 'save_infer_model/scale_0.tmp_1'
You can print fetch_dict to see what the output is and whether the error occurs during inference. If the error happens during inference, check whether the data that preprocess passes into process is correct.
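For example, a minimal sketch of such a print, assuming the det postprocess in web_service.py follows the signature shown in the traceback above (adapt it to your own code):

# Sketch only: log what the det model actually returns before the lookup that fails.
# fetch_dict is keyed by the alias_name values from serving_server_conf.prototxt.
def postprocess(self, input_dicts, fetch_dict, data_id, log_id):
    print("det fetch_dict keys:", list(fetch_dict.keys()))
    print("det fetch_dict:", fetch_dict)
    det_out = fetch_dict["save_infer_model/scale_0.tmp_1"]  # the line that currently raises KeyError
    # ... rest of the original postprocess unchanged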
{'sigmoid_0.tmp_0': array([[[[2.6263534e-09, 1.2464998e-10, 2.7525771e-10, ..., 4.7626424e-11, 4.1451881e-10, 5.5798965e-10], [7.4597697e-09, 1.5542181e-09, 2.0814959e-09, ..., 7.1834309e-11, 3.2807965e-10, 4.7681823e-09], [1.0438104e-10, 5.1240331e-11, 1.0613148e-09, ..., 1.0933918e-12, 1.4041504e-10, 2.4115937e-10], ..., [8.7470253e-09, 4.8530628e-09, 2.1917150e-09, ..., 2.1170598e-27, 1.4422103e-22, 2.5260843e-22], [1.6273503e-09, 8.1889040e-10, 3.4213756e-09, ..., 1.6775331e-19, 1.1194289e-17, 7.4356507e-16], [4.3284736e-09, 3.7965120e-09, 6.5396843e-09, ..., 9.0908091e-15, 1.0334395e-18, 3.6553197e-17]]]], dtype=float32)}
Thanks for the answer; the above is the printed fetch_dict.
The returned fetch_var name is sigmoid_0.tmp_0, which does not match the fetch_var in the model prototxt:
fetch_var {
  name: "save_infer_model/scale_0.tmp_1"
  alias_name: "save_infer_model/scale_0.tmp_1"
  is_lod_tensor: false
  fetch_type: 1
  shape: 25
  shape: 6625
}
This is the problem of a self-trained model whose outputs differ from those of the example model, so you cannot simply swap in the model files. You need to re-save the model parameters with the model-saving method: install paddle_serving_client and run the following command:
python -m paddle_serving_client.convert --dirname . --model_filename dygraph_model.pdmodel --params_filename dygraph_model.pdiparams --serving_server serving_server --serving_client serving_client
Check whether the names in the generated serving_server_conf.prototxt include sigmoid_0.tmp_0, then update the model path in config.yml and start the service.
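As a quick sanity check, something along these lines can confirm that the re-exported prototxt actually exposes the name referenced in config.yml and web_service.py. The path and the expected alias below are assumptions; adjust them to your layout.

# Sketch only: verify that the expected fetch alias appears in the re-exported
# serving_server_conf.prototxt. Path and alias are placeholders.
conf_path = "serving_server/serving_server_conf.prototxt"
expected_alias = "sigmoid_0.tmp_0"

with open(conf_path) as f:
    conf_text = f.read()

if expected_alias in conf_text:
    print(f"OK: {expected_alias} found in {conf_path}")
else:
    print(f"{expected_alias} not found in {conf_path}; make sure fetch_list in "
          "config.yml and the key used in postprocess match the alias_name "
          "listed in the prototxt.")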
OK, thanks, it is fixed now. But the CPU deployment is really slow: simulating the pipeline_http_client request with Postman, predicting a single image takes about 33 s.
CPU inference is slow by nature.
CPU inference performance is indeed relatively slow. This week we wrote a document on low-precision inference deployment, which can speed up inference; you can try modifying your setup following the test method described there:
https://github.com/PaddlePaddle/Serving/blob/develop/doc/Offical_Docs/7-2_Python_Pipeline_Senior_CN.md