Why is there a precision difference between the PaddleOCR Python DBPostProcess and fd.vision.ocr.DBDetectorPostprocessor?

Open huotong1212 opened this issue 9 months ago • 1 comments

Environment

[FastDeploy version]: fastdeploy-python 1.0.7

Problem log and steps to reproduce

In the code below, DBPostProcess and create_operators are taken from PaddleOCR 2.5.

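import unittest

import cv2
import fastdeploy as fd
import numpy as np
# Assumes the PaddleOCR 2.5 source tree is on PYTHONPATH.
from ppocr.data import create_operators, transform
from ppocr.postprocess.db_postprocess import DBPostProcess

# SyncGRPCTritonRunner is a Triton gRPC client helper whose definition is not shown in this post.


class DBPostTest(unittest.TestCase):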
    def setUp(self):
        url = "localhost:8008"
        self.det_runner = SyncGRPCTritonRunner(url, "v4_det_runtime", "1")

        pre_process_list = [{'DetResizeForTest': {'limit_side_len': 960, 'limit_type': 'max'}}, {
            'NormalizeImage': {'std': [0.229, 0.224, 0.225], 'mean': [0.485, 0.456, 0.406], 'scale': '1./255.',
                               'order': 'hwc'}}, {'ToCHWImage': None}, {'KeepKeys': {'keep_keys': ['image', 'shape']}}]
        self.det_preprocess_op = create_operators(pre_process_list)
        self.det_post = DBPostProcess(thresh=0.3, box_thresh=0.6, unclip_ratio=1.5)

        self.det_pre_fd = fd.vision.ocr.DBDetectorPreprocessor()
        self.det_pre_fd.max_side_len = 960

        self.det_post_fd = fd.vision.ocr.DBDetectorPostprocessor()
        self.det_post_fd.det_db_thresh = 0.3
        self.det_post_fd.det_db_box_thresh = 0.6
        self.det_post_fd.det_db_unclip_ratio = 1.5

    def test_db_py(self):
        image = cv2.imread("images/d1.jpg")
        output = transform({'image': image}, self.det_preprocess_op)
        img, shape_list = output
        print(f"shape list:{shape_list.tolist()}")
        inputs = np.expand_dims(img, axis=0)
        shape_list = np.expand_dims(shape_list, axis=0)
        outputs = self.det_runner.Run([inputs])
        preds = {
            "maps": list(outputs.values())[0]
        }
        results = self.det_post(preds, shape_list)
        for row in results[0]["points"]:
            print(row.tolist())

    def test_db_c(self):
        image = cv2.imread("images/d1.jpg")
        inputs, shape_list = self.det_pre_fd.run(image[np.newaxis, :, :, :])
        print(f"shape list:{shape_list}")
        inputs = inputs[0].numpy()
        outputs = self.det_runner.Run([inputs])
        preds = list(outputs.values())[0].copy()
        results = self.det_post_fd.run([preds,], shape_list)
        for row in results[0]:
            print(row)

The output is as follows:

python PaddleOCR

shape_list [1439.0, 1528.0, 0.6226546212647672, 0.6282722513089005]

[[564, 1040], [726, 1023], [730, 1062], [568, 1079]]
[[708, 1019], [917, 1008], [919, 1048], [710, 1059]]
[[292, 1001], [537, 994], [538, 1052], [294, 1059]]
[[1122, 1002], [1316, 999], [1317, 1037], [1123, 1041]]
[[296, 940], [535, 933], [537, 989], [297, 996]]
[[959, 942], [1097, 934], [1099, 974], [961, 982]]
[[658, 929], [847, 921], [849, 971], [660, 979]]
[[1126, 919], [1264, 913], [1266, 958], [1128, 964]]
[[314, 891], [532, 874], [536, 926], [318, 942]]
[[561, 877], [744, 870], [745, 909], [563, 916]]
[[759, 858], [1058, 849], [1059, 888], [760, 897]]
[[294, 819], [535, 814], [537, 876], [295, 882]]
[[563, 792], [735, 792], [735, 830], [563, 830]]
[[278, 704], [925, 685], [927, 734], [279, 752]]
[[278, 544], [544, 527], [547, 577], [281, 593]]
[[856, 466], [1185, 457], [1186, 502], [857, 511]]
[[281, 465], [664, 452], [666, 502], [283, 514]]
[[341, 389], [1258, 395], [1257, 450], [340, 443]]
[[1031, 45], [1197, 45], [1197, 191], [1031, 191]]

fastdeploy
shape_list [[1528, 1439, 960, 896]]
(ratios: 960 / 1528 = 0.6282722513089005, 896 / 1439 = 0.6226546212647672)

[563, 1040, 725, 1023, 730, 1061, 568, 1079]
[708, 1018, 916, 1008, 918, 1048, 709, 1058]
[292, 1000, 536, 994, 537, 1051, 294, 1058]
[1122, 1002, 1316, 998, 1316, 1037, 1122, 1040]
[296, 939, 534, 933, 536, 989, 297, 995]
[959, 941, 1096, 934, 1098, 974, 961, 981]
[658, 928, 846, 921, 848, 971, 660, 978]
[1126, 918, 1263, 913, 1265, 958, 1128, 963]
[313, 891, 531, 873, 536, 925, 318, 942]
[561, 876, 743, 870, 744, 909, 563, 915]
[736, 880, 751, 880, 751, 886, 736, 886]
[759, 857, 1058, 849, 1058, 888, 759, 896]
[294, 819, 534, 814, 536, 876, 296, 881]
[563, 791, 735, 791, 735, 830, 563, 830]
[278, 703, 924, 685, 926, 733, 280, 751]
[273, 610, 913, 594, 915, 660, 275, 676]
[276, 544, 544, 526, 547, 576, 280, 594]
[574, 538, 588, 538, 588, 552, 574, 552]
[856, 465, 1184, 457, 1185, 502, 857, 510]
[281, 464, 663, 452, 665, 502, 283, 513]
[340, 388, 1257, 395, 1257, 449, 340, 443]
[1031, 44, 1196, 44, 1196, 191, 1031, 191]
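
The two `shape_list` printouts above appear to carry the same information in different layouts. The following is a minimal sketch of how the numbers relate, assuming a "max"-type resize that caps the longer side at 960 and snaps each side to a multiple of 32. The function name `det_resize_shape` and the layout labels are illustrative assumptions, not taken from either library.

```python
import math


def det_resize_shape(src_h, src_w, limit_side_len=960, multiple=32):
    """Hypothetical helper: cap the longer side, then snap both sides to a multiple of 32."""
    ratio = 1.0
    if max(src_h, src_w) > limit_side_len:
        ratio = float(limit_side_len) / max(src_h, src_w)
    resize_h = max(int(round(int(src_h * ratio) / multiple) * multiple), multiple)
    resize_w = max(int(round(int(src_w * ratio) / multiple) * multiple), multiple)
    return resize_h, resize_w


src_h, src_w = 1439, 1528  # original image height and width from the logs above
resize_h, resize_w = det_resize_shape(src_h, src_w)

# Assumed layout of the Python printout: [src_h, src_w, ratio_h, ratio_w]
paddle_style = [float(src_h), float(src_w), resize_h / src_h, resize_w / src_w]
# Assumed layout of the FastDeploy printout: [src_w, src_h, resized_w, resized_h]
fd_style = [src_w, src_h, resize_w, resize_h]

print(paddle_style)
print(fd_style)
```

If this reading is right, both pre-processors resize the image to the same 896 x 960 input, which is consistent with the inference outputs being identical.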

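After comparing the two, the model inference results are identical, but the post-processing results are not. For example, as shown above, one gives 564 and the other 563. Why is that? Is it caused by precision differences in the C++ implementation? Any help would be appreciated.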
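
To quantify the mismatch described above, the flat 8-value rows printed by FastDeploy can be reshaped into four (x, y) points and subtracted from the PaddleOCR rows. This sketch uses only the first box from each printout; the helper name `to_points` is an illustrative assumption.

```python
import numpy as np

# First box from each printout above.
paddle_box = np.array([[564, 1040], [726, 1023], [730, 1062], [568, 1079]])
fd_box_flat = [563, 1040, 725, 1023, 730, 1061, 568, 1079]


def to_points(flat_box):
    """Reshape [x1, y1, x2, y2, x3, y3, x4, y4] into a 4x2 array of points."""
    return np.asarray(flat_box).reshape(4, 2)


diff = np.abs(paddle_box - to_points(fd_box_flat))
print(diff.tolist())
print(int(diff.max()))
```

For this box the coordinates differ by at most one pixel. Note that the two printouts do not contain the same number of rows (19 versus 22), so the boxes would need to be matched before comparing the full lists this way.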

Apr 06 '25 02:04 huotong1212

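Check whether the post-processing casts float directly to int anywhere, since that discards the fractional part. Some post-processing implementations use round instead, rounding to the nearest value first and then converting to int. I don't think either approach affects the final result.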
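
The difference between the two conversions mentioned above can be illustrated in a few lines. The coordinate values here are hypothetical and are not taken from either library's intermediate results.

```python
def to_int_truncate(x):
    """Cast directly to int: the fractional part is dropped."""
    return int(x)


def to_int_round(x):
    """Round to the nearest integer first, then convert to int."""
    return int(round(x))


# Hypothetical floating-point coordinates before integer conversion.
coords = [563.2, 563.7, 1040.0]

truncated = [to_int_truncate(c) for c in coords]
rounded = [to_int_round(c) for c in coords]

print(truncated)  # [563, 563, 1040]
print(rounded)    # [563, 564, 1040]
```

A value such as 563.7 becomes 563 under truncation and 564 under rounding, which is the same kind of one-pixel gap seen in the outputs in this thread.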

Apr 25 '25 08:04 ChaoII