MaskRCNN: incorrect prediction results when MKLDNN is enabled

Open JayLee15 opened this issue 2 years ago • 8 comments

Search before asking

  • [X] I have searched the issues and found no similar bug report.

Describe the Bug

When MKLDNN is enabled for MaskRCNN inference, the result matrix comes out scrambled. MKLDNN disabled: [image] MKLDNN enabled: [image]

Environment

  • Models: mask_rcnn_r50_vd_fpn_2x (https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.4/configs/mask_rcnn/mask_rcnn_r50_vd_fpn_2x_coco.yml) and cascade_mask_rcnn_r50_fpn_1x (https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.4/configs/cascade_rcnn/cascade_mask_rcnn_r50_fpn_1x_coco.yml)
  • Reproduction code: https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.4/deploy/python/infer.py
  • Steps to reproduce: download the official COCO pretrained models and run inference with infer.py, enabling MKLDNN through the command-line arguments. The results are wrong.
  • Paddle version: 2.2.2 and 2.3 both show the same problem
  • Language: Python
  • OS: Linux (Linux version 3.10.0-1160.42.2.el7.x86_64, ubuntu 16.04) and Windows (Windows Server 2019)
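For anyone trying to reproduce this, here is a minimal sketch of the two runs being compared. It only builds and prints the command lines. The flag names (`--model_dir`, `--image_file`, `--device`, `--enable_mkldnn`) are what I recall from the deploy/python argument parser in release/2.4, so please check them against your checkout. The model directory and image path are hypothetical placeholders, and the model directory is assumed to be an already exported inference model.

```python
import sys


def build_infer_command(model_dir, image_file, enable_mkldnn):
    """Build an argv list for deploy/python/infer.py on CPU.

    Flag names are assumed from the release/2.4 deploy scripts;
    verify them with `python deploy/python/infer.py --help`.
    """
    return [
        sys.executable,
        "deploy/python/infer.py",
        f"--model_dir={model_dir}",
        f"--image_file={image_file}",
        "--device=CPU",
        f"--enable_mkldnn={enable_mkldnn}",
    ]


# Hypothetical paths: substitute your exported model and test image.
MODEL_DIR = "output_inference/mask_rcnn_r50_vd_fpn_2x_coco"
IMAGE_FILE = "demo/000000226111.jpg"

for flag in (False, True):
    # Print the command; pass the list to subprocess.run(...) to execute it.
    print(" ".join(build_infer_command(MODEL_DIR, IMAGE_FILE, flag)))
```

Running both commands on the same image and diffing the printed `boxes` arrays is enough to see the discrepancy reported here.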

Are you willing to submit a PR?

  • [ ] Yes I'd like to help by submitting a PR!

JayLee15, Jun 28 '22 05:06

I tested several more images and the problem described above still occurs.

Result with MKLDNN disabled:

{'boxes': array([[0.0000000e+00, 2.9145604e-01, 2.6589603e+02, 3.6681244e+02, 3.0432248e+02, 4.1769321e+02],
       [1.1000000e+01, 1.1651322e-01, 7.1439152e+00, 3.7374794e+01, 4.2610181e+02, 5.0756027e+02],
       [7.3000000e+01, 3.3812910e-01, 7.8442383e-01, 3.6608387e+01, 4.4896344e+02, 5.0576721e+02],
       [7.4000000e+01, 1.3708681e-01, 4.1523376e+00, 4.0477535e+01, 4.2329196e+02, 4.9252173e+02],
       [7.4000000e+01, 1.4600465e-01, 2.8948755e+01, 2.8948956e+02, 3.4697556e+02, 4.6286768e+02]], dtype=float32),
 'masks': array([[[0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], ..., [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0]],
       [[0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], ..., [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0]],
       [[0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], ..., [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0]],
       [[0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], ..., [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0]],
       [[0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], ..., [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0]]], dtype=int32),
 'boxes_num': array([5], dtype=int32)}

Result with MKLDNN enabled:

{'boxes': array([[0.00000000e+00, 1.10000000e+01, 7.30000000e+01, 7.40000000e+01, 7.40000000e+01, 2.91456193e-01],
       [1.16512574e-01, 3.38128805e-01, 1.37086943e-01, 1.46004632e-01, 2.65896027e+02, 7.14395142e+00],
       [7.84423828e-01, 4.15233755e+00, 2.89487553e+01, 3.66812439e+02, 3.73747940e+01, 3.66083870e+01],
       [4.04775696e+01, 2.89489563e+02, 3.04322479e+02, 4.26101807e+02, 4.48963409e+02, 4.23291962e+02],
       [3.46975494e+02, 4.17693207e+02, 5.07560364e+02, 5.05767151e+02, 4.92521729e+02, 4.62867676e+02]], dtype=float32),
 'masks': array([[[0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], ..., [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0]],
       [[0, 0, 0, ..., 0, 0, 0], [0, 1, 1, ..., 0, 0, 0], [1, 1, 1, ..., 0, 0, 0], ..., [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0]],
       [[0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], ..., [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0]],
       [[0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], ..., [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0]],
       [[0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], ..., [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0]]], dtype=int32),
 'boxes_num': array([5], dtype=int32)}

Image: 000000226111

JayLee15, Jun 28 '22 05:06

MKLDNN disabled:

{'boxes': array([[ 74.       ,   0.9945931 , 228.81963  , 152.98608  , 299.21884  , 223.9967   ],
       [ 74.       ,   0.95573527, 174.15001  , 180.1396   , 201.7896   , 236.1698   ]], dtype=float32),
 'masks': array([[[0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], ..., [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0]],
       [[0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], ..., [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0]]], dtype=int32),
 'boxes_num': array([2], dtype=int32)}

MKLDNN enabled:

{'boxes': array([[ 74.       ,  74.       ,   0.99459285,   0.95573527, 228.81963  , 174.15001  ],
       [152.98608  , 180.1396   , 299.21884  , 201.7896   , 223.9967   , 236.1698   ]], dtype=float32),
 'masks': array([[[0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], ..., [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0]],
       [[0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], ..., [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0]]], dtype=int32),
 'boxes_num': array([2], dtype=int32)}

Image: 000000111179

JayLee15, Jun 28 '22 05:06

MKLDNN disabled:

{'boxes': array([[7.4000000e+01, 7.0395261e-02, 3.8677655e+02, 1.3604433e+02, 4.0381525e+02, 2.9166379e+02],
       [7.4000000e+01, 5.1174235e-02, 3.6764462e+02, 1.8386264e+02, 3.8638199e+02, 2.4731769e+02]], dtype=float32),
 'masks': array([[[0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], ..., [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0]],
       [[0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], ..., [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0]]], dtype=int32),
 'boxes_num': array([2], dtype=int32)}

MKLDNN enabled:

{'boxes': array([[7.4000000e+01, 7.4000000e+01, 7.0395000e-02, 5.1174238e-02, 3.8677655e+02, 3.6764462e+02],
       [1.3604434e+02, 1.8386263e+02, 4.0381525e+02, 3.8638199e+02, 2.9166376e+02, 2.4731766e+02]], dtype=float32),
 'masks': array([[[0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], ..., [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0]],
       [[0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], ..., [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0]]], dtype=int32),
 'boxes_num': array([2], dtype=int32)}

Image: 000000058636

JayLee15, Jun 28 '22 05:06
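Reading the posted `boxes` arrays side by side, the MKLDNN output looks like the same values laid out column by column: all class ids first, then all scores, then all x1 values, and so on, while the array still reports the shape (N, 6). The sketch below checks that reading against the numbers from the 000000111179 report. It is an observation from the posted data, not a confirmed diagnosis of the underlying bug, and the helper names are made up for illustration.

```python
import math


def column_major_relayout(rows):
    """Flatten `rows` column by column, then re-chunk into rows of the same width."""
    width = len(rows[0])
    flat = [row[c] for c in range(width) for row in rows]
    return [flat[i:i + width] for i in range(0, len(flat), width)]


def undo_relayout(scrambled):
    """Inverse: read the flat buffer as (width, n) and transpose back to (n, width)."""
    n, width = len(scrambled), len(scrambled[0])
    flat = [value for row in scrambled for value in row]
    return [[flat[c * n + r] for c in range(width)] for r in range(n)]


def close(a, b, rel_tol=1e-5):
    """Element-wise comparison that tolerates float32 rounding differences."""
    return all(
        math.isclose(x, y, rel_tol=rel_tol)
        for row_a, row_b in zip(a, b)
        for x, y in zip(row_a, row_b)
    )


# Values copied from the 000000111179 report in this thread.
expected = [
    [74.0, 0.9945931, 228.81963, 152.98608, 299.21884, 223.9967],
    [74.0, 0.95573527, 174.15001, 180.1396, 201.7896, 236.1698],
]
with_mkldnn = [
    [74.0, 74.0, 0.99459285, 0.95573527, 228.81963, 174.15001],
    [152.98608, 180.1396, 299.21884, 201.7896, 223.9967, 236.1698],
]

print(close(column_major_relayout(expected), with_mkldnn))  # True
print(close(undo_relayout(with_mkldnn), expected))          # True
```

With NumPy arrays the same two operations are `expected.T.reshape(expected.shape)` and `with_mkldnn.reshape(6, -1).T`. The other two reports in this thread follow the same pattern.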

This is the same issue as https://github.com/PaddlePaddle/Paddle/issues/43883 and https://github.com/PaddlePaddle/Paddle/issues/43538. We'll track all of them here.

yghstill, Jun 28 '22 11:06

Current status: we are investigating and working on a fix.

yghstill, Jun 28 '22 11:06

Is there any progress on this?

JayLee15, Jul 13 '22 06:07

@JayLee15 I tested the images you posted above. With the latest Paddle develop build, the results with MKLDNN enabled and disabled match. Please try the latest Paddle develop build. Installation link: https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/develop/install/pip/linux-pip.html

yeliang2258, Aug 10 '22 06:08

[image] The results are correct with MKLDNN enabled.

yeliang2258, Aug 10 '22 06:08