PaddleNLP
[Question]: ERNIE-Layout inference code raises InvalidArgumentError
I am running inference with the DocPrompt model, using code based on PaddleNLP-develop\model_zoo\ernie-layout\deploy\python\infer.py, and I get the following error:
Traceback (most recent call last):
File "/home/wpredict.py", line 57, in <module>
main()
File "/home/predict.py", line 50, in main
outputs = predictor.predict(docs)
File "/home/predictor.py", line 875, in predict
output = self.infer(input_dict)
File "/home/predictor.py", line 846, in infer
return self.inference_backend.infer(data)
File "/home/predictor.py", line 58, in infer
self.predictor.run()
ValueError: In user code:
File "models/save_infermodel.py", line 32, in <module>
phase="infer")
File "./src/multi_task_models/extract_model.py", line 154, in create_model
body_feats_feature = model.net(image)
File "./src/model/image_model/resnet.py", line 69, in net
data_format=data_format)
File "./src/model/image_model/resnet.py", line 145, in conv_bn_layer
data_format=data_format)
File "/home/work/zhangdan20/opensource/DocPrompt/py37/lib/python3.7/site-packages/paddle/fluid/layers/nn.py", line 1714, in conv2d
"data_format": data_format,
File "/home/work/zhangdan20/opensource/DocPrompt/py37/lib/python3.7/site-packages/paddle/fluid/layer_helper.py", line 44, in append_op
return self.main_program.current_block().append_op(*args, **kwargs)
File "/home/work/zhangdan20/opensource/DocPrompt/py37/lib/python3.7/site-packages/paddle/fluid/framework.py", line 3582, in append_op
attrs=kwargs.get("attrs", None))
File "/home/work/zhangdan20/opensource/DocPrompt/py37/lib/python3.7/site-packages/paddle/fluid/framework.py", line 2596, in __init__
for frame in traceback.extract_stack():
InvalidArgumentError: input and filter data type should be consistent, but received input data type is int64_t and filter type is float
[Hint: Expected input_data_type == filter_data_type, but received input_data_type:3 != filter_data_type:5.] (at /paddle/paddle/fluid/operators/conv_op.cc:209)
[operator < conv2d > error]
Hi, you can refer to how the predictor is constructed in the Taskflow DocPrompt task:
- Getting the static-graph parameters
https://github.com/PaddlePaddle/PaddleNLP/blob/c8bc4405fc6ed7026887c7c97d6ce1afa32e300f/paddlenlp/taskflow/document_intelligence.py#L78-L80
- Building the predictor
https://github.com/PaddlePaddle/PaddleNLP/blob/c8bc4405fc6ed7026887c7c97d6ce1afa32e300f/paddlenlp/taskflow/task.py#L249-L254
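For reference, the general shape of building a Paddle Inference predictor from a pair of static-graph files can be sketched as follows. This is not a copy of the Taskflow code linked above. `static_model_paths` and `build_predictor` are hypothetical helper names, and the `inference` file prefix is an assumption that should be checked against the files actually on disk.

```python
import os


def static_model_paths(model_dir, prefix="inference"):
    """Return (model_file, params_file) for a static-graph export.

    The "inference" prefix is an assumption; use whatever prefix the
    downloaded files actually have.
    """
    model_file = os.path.join(model_dir, prefix + ".pdmodel")
    params_file = os.path.join(model_dir, prefix + ".pdiparams")
    return model_file, params_file


def build_predictor(model_dir, prefix="inference", use_gpu=False):
    """Create a Paddle Inference predictor and return it with its input names."""
    # Imported here so the path helper above works without Paddle installed.
    import paddle.inference as paddle_infer

    model_file, params_file = static_model_paths(model_dir, prefix)
    for path in (model_file, params_file):
        if not os.path.isfile(path):
            raise FileNotFoundError(path)

    config = paddle_infer.Config(model_file, params_file)
    if use_gpu:
        # 100 MB initial GPU memory pool on device 0 (illustrative values).
        config.enable_use_gpu(100, 0)
    else:
        config.disable_gpu()

    predictor = paddle_infer.create_predictor(config)
    # The input names come from the graph itself, so inspect them before feeding.
    return predictor, predictor.get_input_names()
```

Printing the returned input names is a quick way to see which fields a given static graph expects before any data is copied into it.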
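I reused the predictor construction from https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/ernie-layout/deploy/python/predictor.py, but it raises the error above.
Judging from the error message, the input data type and the filter data type do not match. Could you share code that reproduces it so we can narrow the problem down?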
@linjieccc
The code I use is exactly https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/ernie-layout/deploy/python/predictor.py and https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/ernie-layout/deploy/python/infer.py
In predictor.py, https://github.com/PaddlePaddle/PaddleNLP/blob/c8bc4405fc6ed7026887c7c97d6ce1afa32e300f/model_zoo/ernie-layout/deploy/python/predictor.py#L38-L39 is changed to:
config = paddle.inference.Config(
os.path.join(model_path_prefix, "inference.pdmodel"),
os.path.join(model_path_prefix, "inference.pdiparams"),
)
Nothing else was changed.
The model is the docprompt model downloaded through Taskflow; the model directory contains:
@linjieccc Has this issue been located?
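@wjddd The model in ernie-layout is saved as a static-graph model via jit.save, and the input shapes and types are defined here: https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/ernie-layout/export_model.py#L42
Please check whether the error comes from DocPrompt's inputs differing from these.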
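One way to run that check is to compare the dtype of each array about to be fed against the dtype the graph expects. The sketch below is framework-agnostic: it only relies on each value having a `.dtype` attribute, as NumPy arrays do. `find_dtype_mismatches` is a hypothetical helper, and the `expected` mapping is something the caller has to fill in from the export code of the model actually being loaded.

```python
def find_dtype_mismatches(feed, expected):
    """Return {name: (actual, wanted)} for every feed entry with the wrong dtype.

    feed:     maps input name -> array-like with a .dtype attribute
    expected: maps input name -> dtype name such as "int64" or "float32"
    Names missing from feed are reported with actual set to None.
    """
    mismatches = {}
    for name, wanted in expected.items():
        if name not in feed:
            mismatches[name] = (None, wanted)
            continue
        actual = str(feed[name].dtype)
        if actual != wanted:
            mismatches[name] = (actual, wanted)
    return mismatches
```

In the failing case from this thread, the conv2d input is int64 while the filter is float, so a call such as `find_dtype_mismatches(input_dict, {"image": "float32"})` would flag `image` if it is fed as an int64 NumPy array. The `float32` expectation is inferred from the error message, not read from the DocPrompt export code, so it should be confirmed before casting with `astype("float32")`.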
That must be the issue, thanks~
I ran into the same problem. How did you solve it? I have traced the cause to this method:
def infer(self, input_dict: dict):
    # self.input_names: ['src_ids', 'sent_ids', 'pos_ids', '2d_pos_ids', 'segment_ids', 'task_ids', 'input_mask_mat', 'super_rel_pos', 'unique_id', 'image']
    # In input_dict, every one of these fields except image is an empty list [].
    # Keys of input_dict: ['id', 'question_id', 'questions', 'tokens', 'input_ids', 'attention_mask', 'token_type_ids', 'bbox', 'position_ids', 'image', 'token_is_max_context', 'token_to_orig_map', 'src_ids', 'sent_ids', 'pos_ids', '2d_pos_ids', 'segment_ids', 'task_ids']
    for idx, input_name in enumerate(self.input_names):
        self.input_handles[idx].copy_from_cpu(input_dict[input_name])
    self.predictor.run()
    outputs = [output_handle.copy_to_cpu() for output_handle in self.output_handles]
    return outputs
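With paddlenlp.taskflow it runs fine, and the input data I printed there is completely different from the above. The keys and tokens in the input_dict above look wrong, and the fields being fetched look wrong too.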
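A small guard in front of the copy loop makes this kind of mismatch visible before `predictor.run()` fails deep inside the graph. The sketch below is an illustration only: `check_feed` is a hypothetical helper, not part of predictor.py, and it assumes `input_names` is the list reported by the loaded model.

```python
def check_feed(input_names, input_dict):
    """Split the model's input names into (missing, empty) lists.

    missing: names the model expects that are not keys of input_dict
    empty:   names present in input_dict whose value has length 0
    """
    missing, empty = [], []
    for name in input_names:
        if name not in input_dict:
            missing.append(name)
            continue
        value = input_dict[name]
        # len() works for lists and NumPy arrays; unsized values count as non-empty.
        try:
            is_empty = len(value) == 0
        except TypeError:
            is_empty = False
        if is_empty:
            empty.append(name)
    return missing, empty
```

Calling `missing, empty = check_feed(self.input_names, input_dict)` at the top of `infer` and raising when either list is non-empty would point directly at the fields that the preprocessing did not fill for this model.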
Same problem here: I loaded the downloaded static graph as the model for infer and got this error. How did you solve it in the end?