PaddleSlim: INT8 model quantized with paddleslim fails at inference when deployed with paddle-inference

Open Memoriaaa opened this issue 2 years ago • 5 comments

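Following the offline quantization tutorial https://paddle-inference.readthedocs.io/en/latest/guides/x86_cpu_infer/paddle_x86_cpu_int8.html#id1 , I saved a quantized model with quant_post_static and then converted it to an INT8 model with the script. The tutorial's paddle.models.mobilenetv1 works fine: quantization and deployment are both OK. With UNet, however, inference fails (the INT8 conversion itself never fails), and the error appears to be related to concat.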

InvalidArgumentError: Tensor holds the wrong type, it holds int8_t, but desires to be uint8_t.
  [Hint: Expected valid == true, but received valid:0 != true:1.] (at /home/wangye19/Paddle/paddle/fluid/framework/tensor_impl.h:33)
  [operator < concat > error]

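After repeated experiments I reduced the problem to the small failing model below. What I don't understand is that removing either the relu or the interpolate makes the error go away, and I don't know why.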

import paddle
import paddle.nn as nn
import paddle.nn.functional as F

class FooNet(nn.Layer):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2D(3, 1, 1)
        self.bn = nn.BatchNorm2D(1)
        self.relu = nn.ReLU()
        self.conv2 = nn.Conv2D(2, 1, 1)

    def forward(self, x):
        x = self.conv(x)
        x = self.bn(x)
        x = self.relu(x)
        x2 = F.interpolate(
            x,
            scale_factor=1,
            mode='bilinear',
            align_corners=True)
        y = paddle.concat([x, x2], axis=1)
        out = self.conv2(y)
        return out

My versions are paddle 2.2.1 + paddleslim 2.2.2 + paddle-inference 2.1.1, running inference on CPU.

Jul 11 '22 11:07 Memoriaaa

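One more note: the BiSeNetV2 used in the PaddleSeg quantization tutorial also deploys without problems for me. Only UNet and U2Net, which concat the outputs of ReLU and interpolate, hit this error at inference time.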
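
One possible reading of the error, offered as an assumption rather than a diagnosis confirmed in this thread: in an INT8 graph, a non-negative activation such as the ReLU output may be stored as unsigned 8-bit (uint8_t) while the other branch stays signed (int8_t), so concat receives inputs of two different storage types. The sketch below only imitates that situation in plain Python. `quantize` and `checked_concat` are hypothetical helpers, not Paddle APIs.

```python
def quantize(values, signed):
    """Quantize floats to 8-bit integers with a per-tensor abs-max scale.

    Hypothetical helper. Returns (dtype_name, quantized_values, scale).
    """
    max_abs = max(abs(v) for v in values) or 1.0  # avoid dividing by zero
    if signed:
        qmax, lo, hi, name = 127, -128, 127, "int8_t"
    else:
        qmax, lo, hi, name = 255, 0, 255, "uint8_t"
    scale = qmax / max_abs
    q = [min(hi, max(lo, round(v * scale))) for v in values]
    return name, q, scale


def checked_concat(tensors):
    """Concatenate quantized tensors, requiring one shared storage type.

    Scales are ignored here; a real kernel would also have to reconcile them.
    """
    expected = tensors[0][0]
    for name, _, _ in tensors:
        if name != expected:
            raise TypeError(
                f"Tensor holds the wrong type, it holds {name}, "
                f"but desires to be {expected}.")
    return expected, [q for _, data, _ in tensors for q in data]


# Assumption: the ReLU branch is unsigned and the interpolate branch is signed.
relu_out = quantize([0.0, 0.5, 1.0], signed=False)
interp_out = quantize([0.0, 0.5, 1.0], signed=True)
try:
    checked_concat([relu_out, interp_out])
except TypeError as err:
    print(err)
```

Under that assumption, dropping either relu or interpolate would leave both concat inputs with the same storage type, which would be consistent with the behaviour described above.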

Jul 12 '22 04:07 Memoriaaa

So the FooNet above reproduces the problem, right? I'll reproduce it first and then look into the exact cause.

Jul 12 '22 06:07 yeliang2258

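Hi, please upgrade paddle-inference to 2.3.1 and run again. This issue has already been fixed in the newer version, and it runs correctly on my side. After upgrading to 2.3.1 you no longer need a separate script to convert the quantized model; you can run it directly with a script like the following: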

import paddle.inference

config = paddle.inference.Config("model_path/model.pdmodel", "model_path/model.pdiparams")
config.disable_gpu()
config.switch_ir_optim(True)
# To run INT8 inference with MKLDNN, enable the following two options
config.enable_mkldnn()
config.enable_mkldnn_int8()
# Alternatively, to run INT8 inference with ORT, set onnx_format=True and is_full_quantize=True when quantizing, then enable the following two options
# config.enable_onnxruntime()
# config.enable_ort_optimization()
predictor = paddle.inference.create_predictor(config)
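
Once the predictor has been created, a single inference pass could look roughly like the sketch below. `run_once` is a hypothetical helper name, and the sketch assumes a model with exactly one input and one output, which is the case for the FooNet in this thread.

```python
def run_once(predictor, input_array):
    """Feed one NumPy array to a paddle.inference predictor and return its first output.

    Hypothetical helper; assumes a single-input, single-output model.
    """
    # Look up the (only) input tensor handle by name.
    input_name = predictor.get_input_names()[0]
    input_handle = predictor.get_input_handle(input_name)
    # Tell the handle the concrete input shape, then copy the data in.
    input_handle.reshape(list(input_array.shape))
    input_handle.copy_from_cpu(input_array)

    predictor.run()

    # Fetch the (only) output tensor back as a NumPy array.
    output_name = predictor.get_output_names()[0]
    output_handle = predictor.get_output_handle(output_name)
    return output_handle.copy_to_cpu()
```

It would be called as `out = run_once(predictor, x)`, where `x` is a float32 NumPy array in NCHW layout with 3 channels; the batch size and spatial size are up to the caller.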

Jul 12 '22 07:07 yeliang2258

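Thanks, it works after upgrading. It seems that upgrading paddleslim alone is enough, though; inference 2.1 still works. When you say the new inference no longer needs a conversion, do you mean the INT8 conversion step is not needed? quant_post_static is still required, right?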

Jul 12 '22 09:07 Memoriaaa

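quant_post_static is the step that quantizes the model, so it is still required. What you no longer need is to convert the model a second time with the save_quant_model.py script.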

Jul 12 '22 09:07 yeliang2258
