
ERNIE 3.0 quantization fails with: Hint: Expected dtype() == paddle::experimental::CppTypeToDataType<T>::Type()

Open Fmaj7 opened this issue 3 years ago • 26 comments

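Thank you for reporting a PaddleNLP usage problem, and thank you for contributing to PaddleNLP! When filing your issue, please also provide the following information:

  • Version and environment information 1) PaddleNLP and PaddlePaddle versions: PaddleNLP 2.3.4, paddlepaddle-gpu 2.3.1.post116 2) System environment: Windows 10 Enterprise, Python 3.8, CUDA 11.6, cuDNN 8.4
  • Reproduction information: quantizing the ERNIE 3.0 model fails. The last call before the failure appears to go into a compiled C/C++ package, so I cannot debug any further. The error message is as follows:
Traceback (most recent call last):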
  File "compress_msra_ner.py", line 149, in <module>
    main()
  File "compress_msra_ner.py", line 142, in main
    trainer.compress(output_dir,
  File "D:\AI\PaddleNLP-develop\model_zoo\ernie-3.0\compress_trainer.py", line 179, in compress
    self.quant(original_inference_model_dir, output_dir,
  File "D:\AI\PaddleNLP-develop\model_zoo\ernie-3.0\compress_trainer.py", line 201, in quant
    _post_training_quantization_grid_search(eval_dataloader, self.eval_dataset,
  File "D:\AI\PaddleNLP-develop\model_zoo\ernie-3.0\compress_trainer.py", line 623, in _post_training_quantization_grid_search
    _post_training_quantization(algo, batch_size)
    post_training_quantization.quantize()
  File "C:\Program Files\Python38\lib\site-packages\paddle\fluid\contrib\slim\quantization\post_training_quantization.py", line 379, in quantize
    self._executor.run(program=self._program,
  File "C:\Program Files\Python38\lib\site-packages\paddle\fluid\executor.py", line 1300, in run
    six.reraise(*sys.exc_info())
  File "C:\Users\admin\AppData\Roaming\Python\Python38\site-packages\six.py", line 719, in reraise
    raise value
  File "C:\Program Files\Python38\lib\site-packages\paddle\fluid\executor.py", line 1286, in run
    res = self._run_impl(
  File "C:\Program Files\Python38\lib\site-packages\paddle\fluid\executor.py", line 1467, in _run_impl
    return new_exe.run(list(feed.keys()), fetch_list, return_numpy)
  File "C:\Program Files\Python38\lib\site-packages\paddle\fluid\executor.py", line 547, in run
    tensors = self._new_exe.run(feed_names, fetch_list)._move_to_list()
ValueError: (InvalidArgument) The type of data we are trying to retrieve does not match the type of data currently contained in the container.
  [Hint: Expected dtype() == paddle::experimental::CppTypeToDataType<T>::Type(), but received dtype():5 != paddle::experimental::CppTypeToDataType<T>::Type():7.] (at ..\paddle\phi\core\dense_tensor.cc:137)

Fmaj7 avatar Aug 08 '22 09:08 Fmaj7

Looking at the error message, these are the relevant lines; it looks like a data dtype mismatch:

ValueError: (InvalidArgument) The type of data we are trying to retrieve does not match the type of data currently contained in the container.
[Hint: Expected dtype() == paddle::experimental::CppTypeToDataType<T>::Type(), but received dtype():5 != paddle::experimental::CppTypeToDataType<T>::Type():7.] (at ..\paddle\phi\core\dense_tensor.cc:137)

First check whether the dtype that the model's inputs require matches the dtype of the data coming out of the dataset/data_loader; int32 vs. int64 is a common case.
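
A minimal sketch of the check suggested above, assuming the batch is available as NumPy arrays. The helper name `find_dtype_mismatches` and the `expected` mapping are hypothetical and not part of PaddleNLP; the expected dtypes would have to be taken from however the model was exported.

```python
import numpy as np


def find_dtype_mismatches(expected_dtypes, batch):
    """Return {input_name: (actual_dtype, expected_dtype)} for every mismatch."""
    mismatches = {}
    for name, expected in expected_dtypes.items():
        actual = np.asarray(batch[name]).dtype
        if actual != np.dtype(expected):
            mismatches[name] = (str(actual), str(np.dtype(expected)))
    return mismatches


# Assumed: the exported model expects int32 for both inputs.
expected = {"input_ids": "int32", "token_type_ids": "int32"}

# Assumed: one batch as it might come out of a data loader.
batch = {
    "input_ids": np.zeros((2, 8), dtype="int64"),
    "token_type_ids": np.zeros((2, 8), dtype="int32"),
}

mismatches = find_dtype_mismatches(expected, batch)
print(mismatches)
```

An empty result means the fed data and the expected input dtypes agree; any entry names the input that needs a cast or a changed export spec.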

LiuChiachi avatar Aug 08 '22 11:08 LiuChiachi

Thank you so much, changing it to int32 worked!

Fmaj7 avatar Aug 09 '22 00:08 Fmaj7

1. The type coming out of the data_loader:
    {'input_ids': Tensor(shape=[32, 127], dtype=int64, place=Place(gpu:0), stop_gradient=True
2. In the pruning code, dtype is configured as int32:
    elif quantization:
        input_dir = compress_config.quantization_config.input_dir
        if input_dir is None:
            compress_config.quantization_config.input_filename_prefix = "model"
            input_spec = [
                paddle.static.InputSpec(shape=[None, None], dtype="int32"),  # input_ids
                paddle.static.InputSpec(shape=[None, None], dtype="int32")  # segment_ids
            ]
3. Quantization succeeds.
4. With the set_dynamic_shape switch turned on to configure the dynamic shape automatically, a new problem appears. It looks like the same int64 issue:
    python infer_gpu.py --task_name token_cls --model_path ./msra_ner_quant_infer_model/int8 --shape_info_file dynamic_shape_info.txt --set_dynamic_shape
The error is as follows:
Traceback (most recent call last):
  File "./deploy/python/infer_gpu.py", line 94, in <module>
    main()
  File "./deploy/python/infer_gpu.py", line 82, in main
    predictor = ErniePredictor(args)
  File "D:\AI\PaddleNLP-develop\model_zoo\ernie-3.0\deploy\python\ernie_predictor.py", line 296, in __init__
    self.set_dynamic_shape(args.max_seq_length, args.batch_size)
  File "D:\AI\PaddleNLP-develop\model_zoo\ernie-3.0\deploy\python\ernie_predictor.py", line 405, in set_dynamic_shape
    self.inference_backend.infer(batch)
  File "D:\AI\PaddleNLP-develop\model_zoo\ernie-3.0\deploy\python\ernie_predictor.py", line 203, in infer
    self.predictor.run()
RuntimeError: (NotFound) Operator (quantize_linear) does not have kernel for {data_type[int64_t]; data_layout[Undefined(AnyLayout)]; place[Place(gpu:0)]; library_type[PLAIN]}.
  [Hint: Expected kernel_iter != kernels.end(), but received kernel_iter == kernels.end().] (at ..\paddle\fluid\framework\operator.cc:1712)
  [operator < quantize_linear > error]

Setting my own dtype to int32 or int64 does not help either:

    def token_cls_preprocess(self, data: list):
        # tokenizer + pad
        is_split_into_words = False
        if isinstance(data[0], list):
            is_split_into_words = True
        data = self.tokenizer(data,
                              max_length=self.max_seq_length,
                              padding=True,
                              truncation=True,
                              is_split_into_words=is_split_into_words)

        input_ids = data["input_ids"]
        token_type_ids = data["token_type_ids"]
        return {
            "input_ids": np.array(input_ids, dtype="int64"),
            "token_type_ids": np.array(token_type_ids, dtype="int64")
        }
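
To make it easier to switch between int32 and int64 while testing, the dtype in the return statement could be turned into a parameter. The sketch below is standalone: `to_model_inputs` is a hypothetical helper, and the `encoded` dict only stands in for what a tokenizer call with padding=True might return. Note that, per the reports in this thread, changing the fed dtype did not by itself resolve the kernel error.

```python
import numpy as np


def to_model_inputs(encoded, dtype="int32"):
    """Convert tokenizer output (lists of ints) into arrays of a single dtype."""
    return {
        "input_ids": np.array(encoded["input_ids"], dtype=dtype),
        "token_type_ids": np.array(encoded["token_type_ids"], dtype=dtype),
    }


# Stand-in for padded tokenizer output; the token ids are made up.
encoded = {
    "input_ids": [[1, 23, 45, 2], [1, 67, 2, 0]],
    "token_type_ids": [[0, 0, 0, 0], [0, 0, 0, 0]],
}

inputs = to_model_inputs(encoded, dtype="int32")
print({name: str(arr.dtype) for name, arr in inputs.items()})
```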

Fmaj7 avatar Aug 09 '22 01:08 Fmaj7

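Hello, the set_dynamic_shape function uses data that it constructs itself, with dtype int64: https://github.com/PaddlePaddle/PaddleNLP/blob/2a4a2fb69f577d9622bdb51ecb44b98a5b0145da/model_zoo/ernie-3.0/deploy/python/ernie_predictor.py#L379 You can open the link for details. That data is not your input data; you may need to make the dtype consistent in the network definition.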

LiuChiachi avatar Aug 10 '22 06:08 LiuChiachi

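Hello, I am a bit confused. Which input does "not your input data" refer to? And does making the dtype consistent in the network mean making it consistent inside the set_dynamic_shape function? I tried changing the dtype inside set_dynamic_shape earlier, but got the same error.
One more thing: the following warnings appear during quantization, and I do not know whether they matter:
Wed Aug 10 16:04:28-INFO: Collect quantized variable names ...
Wed Aug 10 16:04:28-WARNING: feed is not supported for quantization.
Wed Aug 10 16:04:28-WARNING: feed is not supported for quantization.
Wed Aug 10 16:04:28-WARNING: scale is not supported for quantization.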

Fmaj7 avatar Aug 10 '22 07:08 Fmaj7

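  • Q1: Which input does "not your input data" refer to?

  • A1: set_dynamic_shape constructs its own data. The data used during set_dynamic_shape is unrelated to your input data; from the code, it constructs int64 data: https://github.com/PaddlePaddle/PaddleNLP/blob/2a4a2fb69f577d9622bdb51ecb44b98a5b0145da/model_zoo/ernie-3.0/deploy/python/ernie_predictor.py#L384-L389

  • Q2: Making the dtype consistent in the network

  • A2: You still need to make sure that the input dtype the network expects matches the dtype of the data you actually feed it. If it still fails, send the code and we can look at it together.

  • Q3: The following warnings appear during quantization; do they matter?

  • A3: The warnings should not have any effect.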
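
To illustrate A1, here is a rough sketch of how shape-probing batches with a configurable dtype could be built. This is not the code at the linked lines: the function name `make_shape_probe_batches`, the minimum sequence length of 2 and the "typical" length of 32 are assumptions chosen for illustration.

```python
import numpy as np


def make_shape_probe_batches(batch_size, max_seq_length, dtype="int64",
                             min_seq_len=2, opt_seq_len=32):
    """Build zero-filled batches covering the min, max and a typical input shape."""
    shapes = [
        (1, min_seq_len),              # smallest shape
        (batch_size, max_seq_length),  # largest shape
        (batch_size, opt_seq_len),     # assumed "typical" shape
    ]
    return [
        {
            "input_ids": np.zeros(shape, dtype=dtype),
            "token_type_ids": np.zeros(shape, dtype=dtype),
        }
        for shape in shapes
    ]


batches = make_shape_probe_batches(batch_size=32, max_seq_length=128, dtype="int32")
for batch in batches:
    print(batch["input_ids"].shape, batch["input_ids"].dtype)
```

Keeping the dtype in one parameter means the probing data and the real input data can be switched together, which is the consistency A2 asks for.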

LiuChiachi avatar Aug 10 '22 13:08 LiuChiachi

Model training: run_msra_ner.py
python run_token_cls.py --task_name msra_ner --model_name_or_path ernie-3.0-medium-zh --do_train

Pruning: 1. compress_msra_ner.py 2. compress_trainer.py
python compress_msra_ner.py --dataset "msra_ner" --model_name_or_path best_msra_ner_model --output_dir ./

Quantization: in file 1 from the pruning step, set compress to pruning=False, quantization=True; in file 2, change dtype to int32 (setting dtype to int64 fails):
input_spec = [
    paddle.static.InputSpec(shape=[None, None], dtype="int32"),  # input_ids
    paddle.static.InputSpec(shape=[None, None], dtype="int32")  # segment_ids
]
python compress_msra_ner.py --dataset "msra_ner" --model_name_or_path best_msra_ner_model --output_dir ./

Deployment: ernie_predictor.py
python ./deploy/python/infer_gpu.py --task_name token_cls --model_path ./best_msra_ner_model/compress/hist16/int8 --shape_info_file dynamic_shape_info.txt --set_dynamic_shape

Deployment fails with:
RuntimeError: (NotFound) Operator (quantize_linear) does not have kernel for {data_type[int64_t]; data_layout[Undefined(AnyLayout)]; place[Place(gpu:0)]; library_type[PLAIN]}. [Hint: Expected kernel_iter != kernels.end(), but received kernel_iter == kernels.end().] (at ..\paddle\fluid\framework\operator.cc:1712) [operator < quantize_linear > error]

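There is also a GPU memory problem. When I run the deployment script directly on the pruned model, GPU memory is released after the run finishes:
python infer_gpu.py --task_name token_cls --model_path ./msra_ner_pruned_infer_model/float32
However, if I start a background service (an HTTP service) whose endpoint imports and calls infer_gpu.main, GPU memory is not released after the call returns, and it grows with every call, e.g. 1 GB -> 2 GB ... until memory runs out.

Does ERNIE 3.0 deployment currently support only seq and token?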
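
On the GPU memory question: the traceback earlier in this thread shows main() in infer_gpu.py constructing ErniePredictor(args), so calling main once per HTTP request would build a new predictor each time. One thing that might be worth trying, assuming that is where the growth comes from (nothing in this thread confirms it), is to build the predictor once per process and reuse it. The sketch below only shows the caching pattern; `FakePredictor`, `get_predictor` and `handle_request` are hypothetical stand-ins, not PaddleNLP APIs.

```python
from functools import lru_cache


class FakePredictor:
    """Stand-in for an expensive predictor; counts how often it is constructed."""
    instances = 0

    def __init__(self, model_path):
        FakePredictor.instances += 1
        self.model_path = model_path

    def predict(self, texts):
        # Dummy "inference": return the length of each text.
        return [len(text) for text in texts]


@lru_cache(maxsize=None)
def get_predictor(model_path):
    """Create the predictor on first use, then return the cached instance."""
    return FakePredictor(model_path)


def handle_request(texts, model_path="./model"):
    """What an HTTP handler could call instead of re-running main()."""
    return get_predictor(model_path).predict(texts)


for _ in range(3):
    handle_request(["abc", "de"])
print(FakePredictor.instances)  # 1: the predictor was built only once
```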

Fmaj7 avatar Aug 11 '22 01:08 Fmaj7

Hello, sorry for the slow reply. Please try setting the onnx_format parameter in compress_trainer.py to False; the True case may not be supported yet, and we are investigating.

LiuChiachi avatar Aug 15 '22 03:08 LiuChiachi

It still fails with onnx_format set to False.

Fmaj7 avatar Aug 15 '22 03:08 Fmaj7

@Fmaj7 Could you set onnx_format to False, re-export the quantized model, and post the error message you get at prediction time?

yghstill avatar Aug 15 '22 05:08 yghstill

With onnx_format=False, running quantization fails, as follows: (screenshot)

Fmaj7 avatar Aug 15 '22 06:08 Fmaj7

Looking at the error message, these are the relevant lines; it looks like a data dtype mismatch:

ValueError: (InvalidArgument) The type of data we are trying to retrieve does not match the type of data currently contained in the container.
[Hint: Expected dtype() == paddle::experimental::CppTypeToDataType<T>::Type(), but received dtype():5 != paddle::experimental::CppTypeToDataType<T>::Type():7.] (at ..\paddle\phi\core\dense_tensor.cc:137)

First check whether the dtype that the model's inputs require matches the dtype of the data coming out of the dataset/data_loader; int32 vs. int64 is a common case.

@Fmaj7 The error looks the same as this one. Could you try the change suggested there?

yghstill avatar Aug 15 '22 06:08 yghstill

I already tried that. With dtype set to int32, quantization passes; with int64 it reports the error above. But after quantization completes with int32, running:
python ./deploy/python/infer_gpu.py --task_name token_cls --model_path ./best_msra_ner_model/compress/hist16/int8 --shape_info_file dynamic_shape_info.txt --set_dynamic_shape
produces the following error:
RuntimeError: (NotFound) Operator (quantize_linear) does not have kernel for {data_type[int64_t]; data_layout[Undefined(AnyLayout)]; place[Place(gpu:0)]; library_type[PLAIN]}. [Hint: Expected kernel_iter != kernels.end(), but received kernel_iter == kernels.end().] (at ..\paddle\fluid\framework\operator.cc:1712) [operator < quantize_linear > error]

Fmaj7 avatar Aug 15 '22 06:08 Fmaj7

The quantize_linear operator appears when onnx_format=True. You need to set dtype to int32 and also set onnx_format=False.
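
A quick way to tell which setting an exported model was actually quantized with is to look at the operator name in the error. The sketch below is a heuristic built only on the two operator names reported in this thread; `guess_quant_format` is a hypothetical helper, and how the list of operator types is obtained from a model is left open.

```python
def guess_quant_format(op_types):
    """Guess the onnx_format setting from operator type names (heuristic)."""
    ops = set(op_types)
    # Reported in this thread for models exported with onnx_format=True.
    if "quantize_linear" in ops:
        return "onnx_format=True"
    # Reported in this thread for models exported with onnx_format=False.
    if any(op.startswith("fake_quantize_dequantize") for op in ops):
        return "onnx_format=False"
    return "no known quantization ops found"


print(guess_quant_format(["matmul_v2", "quantize_linear"]))
print(guess_quant_format(["fake_quantize_dequantize_moving_average_abs_max"]))
```

If the error still names quantize_linear after onnx_format was changed to False, the model being loaded was most likely not re-exported with the new setting.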

yghstill avatar Aug 15 '22 12:08 yghstill

Yes, with dtype=int32 and onnx_format=False quantization passes (in fact, in my tests setting only dtype=int32 was enough). But the --set_dynamic_shape step above fails again, as follows:
RuntimeError: (NotFound) Operator (fake_quantize_dequantize_moving_average_abs_max) does not have kernel for {data_type[int64_t]; data_layout[Undefined(AnyLayout)]; place[Place(gpu:0)]; library_type[PLAIN]}. [Hint: Expected kernel_iter != kernels.end(), but received kernel_iter == kernels.end().] (at ..\paddle\fluid\framework\operator.cc:1712) [operator < fake_quantize_dequantize_moving_average_abs_max > error]

Fmaj7 avatar Aug 15 '22 13:08 Fmaj7

For this problem, you can first change every int64 in the set_dynamic_shape method of your ernie_predictor.py to int32 as well; that should work around it.

LiuChiachi avatar Aug 16 '22 02:08 LiuChiachi

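That does not work. I tried it a few days ago and again just now. This problem is really puzzling; is it a platform incompatibility or something else? (screenshot)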

Fmaj7 avatar Aug 16 '22 02:08 Fmaj7

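I switched to WSL for testing. With the quantization parameters dtype=int64, onnx_format=False, quantization passes, but running --set_dynamic_shape still fails. Whether set_dynamic_shape in ernie_predictor.py uses int64 or int32 throughout, it fails with the same error as above.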

Fmaj7 avatar Aug 17 '22 05:08 Fmaj7

Could you also provide the .pdmodel file? For an ERNIE model, the input to the fake_quantize_dequantize_moving_average_abs_max operator really should not be int32.

LiuChiachi avatar Aug 17 '22 05:08 LiuChiachi

int8.zip This is the file produced by quantization. Quantization parameters in compress_trainer.py: dtype=int64, onnx_format=False

Fmaj7 avatar Aug 17 '22 05:08 Fmaj7

Please confirm that onnx_format=False is set; it should be in the initialization of PostTrainingQuantization in compress_trainer.py.

LiuChiachi avatar Aug 22 '22 08:08 LiuChiachi

Hello, has this problem been solved? I have run into the same issue.

Renxs177 avatar Oct 11 '22 06:10 Renxs177

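Hello, has this problem been solved? I have run into the same issue.

Hello, please post a screenshot of the error so we can look at it together.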

LiuChiachi avatar Oct 11 '22 07:10 LiuChiachi

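I am using PaddleSlim's automatic compression, and the compression strategy is post-training (offline) quantization. It reports a similar error. (screenshot)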

Renxs177 avatar Oct 11 '22 08:10 Renxs177

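It does not seem to have been solved. I later switched to in-batch-negative and deployed with Paddle Serving, without compression. Retrieval is still quite fast: the model is trained on GPU, and retrieval on CPU takes about 0.3 s.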

Fmaj7 avatar Oct 11 '22 09:10 Fmaj7

Has this problem been solved? I have run into a similar issue. I quantized a Paddle-trained model directly (onnx_format=True, dtype=int64); after obtaining the quantized model, inference fails with: RuntimeError: (NotFound) Operator (quantize_linear) does not have kernel for {data_type[int64_t]; data_layout[Undefined(AnyLayout)]; place[Place(gpu:0)]; library_type[PLAIN]}. [Hint: Expected kernel_iter != kernels.end(), but received kernel_iter == kernels.end().] (at ..\paddle\fluid\framework\operator.cc:1712) [operator < quantize_linear > error]

After changing to (onnx_format=False, dtype=int64) and obtaining the quantized model, inference fails with: RuntimeError: (NotFound) Operator (fake_quantize_dequantize_moving_average_abs_max) does not have kernel for {data_type[int64_t]; data_layout[Undefined(AnyLayout)]; place[Place(gpu:0)]; library_type[PLAIN]}. [Hint: Expected kernel_iter != kernels.end(), but received kernel_iter == kernels.end().] (at ..\paddle\fluid\framework\operator.cc:1712) [operator < fake_quantize_dequantize_moving_average_abs_max > error]

tianjiahao avatar Nov 23 '22 09:11 tianjiahao

This issue is stale because it has been open for 60 days with no activity.

github-actions[bot] avatar Jan 23 '23 00:01 github-actions[bot]

This issue was closed because it has been inactive for 14 days since being marked as stale.

github-actions[bot] avatar Feb 06 '23 00:02 github-actions[bot]