
Model inference error, asking for help: ValueError: (InvalidArgument) Deserialize to tensor failed

Open lovychen opened this issue 7 months ago • 2 comments

Getting an inference error; asking what causes it.

Using paddlepaddle-gpu 2.6.1, I converted a model via torch2paddle and got the following model directory:

├── inference_model
│   ├── model.pdiparams
│   ├── model.pdiparams.info
│   └── model.pdmodel
├── layer_model
│   ├── __model__ -> ../inference_model/model.pdmodel
│   ├── __params__ -> ../inference_model/model.pdiparams
│   ├── sentence_transformer_0.w_0
│   └── sentence_transformer_0.w_1
├── model.pdparams
├── __pycache__
│   └── x2paddle_code.cpython-311.pyc
└── x2paddle_code.py

Currently, inference done this way works and gives correct results:

import argparse
import numpy as np
from transformers import AutoTokenizer
# import the Paddle Inference library
import paddle.inference as paddle_infer
def get_embedding():
    loaded_vector = np.load('query_doc_test.npy')
    return loaded_vector[0],loaded_vector[1]
def main():
    args = parse_args()
    # create the config
    config = paddle_infer.Config(args.model_file, args.params_file)
    # create the predictor from the config
    predictor = paddle_infer.create_predictor(config)
    # get the input names
    input_names = predictor.get_input_names()
    input_handle_query = predictor.get_input_handle(input_names[0])
    input_handle_doc = predictor.get_input_handle(input_names[1])
    print("input_names",input_names)
    # load the input embeddings
    fake_query_emb,fake_doc_emb = get_embedding()
    # set the inputs (the random inputs below are unused; the saved embeddings are copied in instead)
    fake_input1 = np.random.randn(args.batch_size,768).astype("float32")
    fake_input2 = np.random.randn(args.batch_size,768).astype("float32")
    input_handle_query.reshape([args.batch_size,768])
    input_handle_doc.reshape([args.batch_size,768])
    #input_handle_query.copy_from_cpu(fake_input1)
    #input_handle_doc.copy_from_cpu(fake_input2)
    input_handle_query.copy_from_cpu(fake_query_emb)
    input_handle_doc.copy_from_cpu(fake_doc_emb)
    # run the predictor
    predictor.run()
    # get the output
    output_names = predictor.get_output_names()
    output_handle = predictor.get_output_handle(output_names[0])
    output_data = output_handle.copy_to_cpu()  # numpy.ndarray
    print("Output data is {}".format(output_data))
    print("Output data size is {}".format(output_data.size))
    print("Output data shape is {}".format(output_data.shape))
def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument("--model_file", type=str, help="model filename")
    parser.add_argument("--params_file", type=str, help="parameter filename")
    parser.add_argument("--batch_size", type=int, default=1, help="batch size")
    return parser.parse_args()
if __name__ == "__main__":
    main()
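
For reference, a usage sketch of the script above (the file name infer.py is hypothetical; the paths follow the directory tree):

python infer.py --model_file ./inference_model/model.pdmodel --params_file ./inference_model/model.pdiparams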

However, the online service still runs the old inference stack, which needs model.pdiparams split into a per-layer layout, i.e.:

├── __model__ -> ../inference_model/model.pdmodel  (symlinked from the original)
├── sentence_transformer_0.w_0
└── sentence_transformer_0.w_1

After splitting, I load it with the code below; the loader automatically looks for __model__, then matches the parameter names against the file names in the directory and reads each tensor:

import paddle
import sys
# import paddle.fluid
import paddle.inference as paddle_infer
model_file_path = "./layer_model"
config = paddle_infer.Config(model_file_path)

This fails with:

File "paddle_print.py", line 46, in <module>
    predictor = paddle_infer.create_predictor(config)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: (InvalidArgument) Deserialize to tensor failed, maybe the loaded file is not a paddle model(expected file format: 0, but 2929001600 found).
  [Hint: Expected version == 0U, but received version:2929001600 != 0U:0.] (at /paddle/paddle/fluid/framework/lod_tensor.cc:301)
  [operator < load > error]
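
One check of my own: 2929001600 read as a little-endian uint32 has 0x80 as its lowest byte, which is the pickle protocol marker, so the header of a split file can be inspected directly (file name from the tree above):

with open("./layer_model/sentence_transformer_0.w_0", "rb") as f:
    head = f.read(4)
# paddle.save writes a Python pickle, whose first byte is b'\x80'; the legacy
# `load` op instead expects the first uint32 to be the tensor format version 0
# (the "expected file format: 0" in the error above)
print(head)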

The other way works, with no exception:

import paddle
import sys
# import paddle.fluid
import paddle.inference as paddle_infer
model_file = "./layer_model/__model__"
params_file=".//layer_model/__params__"
config = paddle_infer.Config(model_file, params_file)
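
Separately, to see which parameter file names the program actually expects, a minimal sketch of my own using the static-graph API (assuming the 2.x convention that the prefix resolves to model.pdmodel / model.pdiparams):

import paddle

paddle.enable_static()
exe = paddle.static.Executor(paddle.CPUPlace())
# loads ./inference_model/model.pdmodel and ./inference_model/model.pdiparams
program, _, _ = paddle.static.load_inference_model("./inference_model/model", exe)
for var in program.list_vars():
    if var.persistable and var.name not in ("feed", "fetch"):
        print(var.name)  # e.g. sentence_transformer_0.w_0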

For completeness, the splitting code:

import paddle
from transformers import BertTokenizer, BertModel
model_path_main = "./pd_model_0710_with_softmax_nn/"
model = paddle.load(model_path_main + "/inference_model/model.pdiparams.info")
param_dict = {}
param_dict_revert = {}
param_name_list = []
for each in model:  # map parameter file names to/from structured names
    # print(each, model[each]["structured_name"])
    param_dict[each] = model[each]["structured_name"]
    param_dict_revert[model[each]["structured_name"]] = each
    param_name_list.append(each)
#print(model)

print(len(param_name_list))
run_pdparams = True
if run_pdparams:  # save each tensor to a file named after the original parameter
    model_2 = paddle.load(model_path_main + "/model.pdparams")
    value_all = []
    for each in model_2:
        new_dict = {}
        new_name = param_dict_revert[each]
        print(each," -> ",new_name)
        value = model_2[each].numpy()
        new_dict[new_name] = value
        value_all.append(value)
        paddle.save(new_dict,model_path_main + "/layer_model/"+new_name)
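
An alternative to the hand-split that I am considering: re-export each parameter with paddle.static.save_vars, on the assumption that the old serving stack expects per-variable files written by the save op (the legacy fluid.io.save_vars format) rather than paddle.save pickles; I have not verified this against the online service:

import paddle

paddle.enable_static()
exe = paddle.static.Executor(paddle.CPUPlace())
program, _, _ = paddle.static.load_inference_model("./inference_model/model", exe)
# writes each persistable variable to ./layer_model/<var_name>, one raw
# tensor file per parameter, in the format the legacy `load` op expects
paddle.static.save_vars(
    exe,
    dirname="./layer_model",
    main_program=program,
    predicate=lambda var: var.persistable and var.name not in ("feed", "fetch"),
)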

My questions:

1. Is the model saved by older Paddle versions equivalent to the new model.pdmodel, and can it be used simply by renaming it as above?
2. Is this way of splitting correct? I have tried it several times, with and without converting to numpy, and saving as a dict; none of it works and I always get the same error.
3. Is the error caused by a model version mismatch, or by the split tensors being wrong?

lovychen · Jul 15 '24 04:07