
Is there an option to get confidence scores and bounding boxes along with the HTML string for PP-Structure?

Open arhamshah opened this issue 2 years ago • 4 comments

I am using the PP-Structure library from PaddleOCR to extract tables from images. [image] For the image above, I get this result:

result = {'type': 'table', 'bbox': [0, 0, 349, 71], 'res': {'cell_bbox': [[2.1059176921844482, 0.7967990636825562, 54.594913482666016, 0.9006235599517822, 53.41596984863281, 16.244869232177734, 1.93153977394104, 15.716910362243652], [62.804359436035156, 0.7160009145736694, 138.24855041503906, 0.786230206489563, 135.5400390625, 13.393481254577637, 60.11547088623047, 13.050923347473145], [98.69075775146484, 0.7997479438781738, 329.5711669921875, 0.8572822213172913, 328.5466613769531, 17.500146865844727, 94.86428833007812, 17.191831588745117], [0.9269500970840454, 17.537534713745117, 49.044742584228516, 17.845354080200195, 48.904449462890625, 41.31473159790039, 0.9105174541473389, 41.0366325378418], [43.86700439453125, 17.503067016601562, 94.20509338378906, 17.736175537109375, 93.4029541015625, 40.03254318237305, 43.05289077758789, 39.972354888916016], [94.61170959472656, 17.709407806396484, 345.9893798828125, 18.13838768005371, 345.9151306152344, 43.062191009521484, 93.54864501953125, 42.82071304321289], [0.9099741578102112, 41.6998405456543, 45.47168731689453, 41.82848358154297, 45.39813995361328, 67.85848999023438, 0.9060577154159546, 67.83340454101562], [44.603145599365234, 41.61592483520508, 92.34540557861328, 41.64745330810547, 91.45874786376953, 66.62190246582031, 43.83229064941406, 66.60901641845703], [92.18399810791016, 41.781700134277344, 344.60528564453125, 41.8816032409668, 344.576171875, 66.88459777832031, 92.13319396972656, 66.8738784790039]], 'boxes': array([[  0.,   2.,  48.,  25.],
       [ 91.,   2., 276.,  24.],
       [  0.,  24.,  49.,  50.],
       [ 53.,  24.,  72.,  46.],
       [ 91.,  28., 347.,  49.],
       [  1.,  50.,  49.,  69.],
       [ 92.,  51., 340.,  69.]]), 'rec_res': [('E3509', 0.8876606822013855), ('Practice Work Mathematics-Std.9', 0.9079822301864624), ('E3518', 0.9605749249458313), ('1', 0.96055477752685547), ('PracticeWorkMathematics(Basic)Std.1', 0.8757174611091614), ('E3519', 0.791429877281189), ('PracticeWorkMathematics(Standard)Std1', 0.8527902364730835)], 'html': '<html><body><table><tr><td>E3509</td><td></td><td>Practice Work Mathematics-Std.9</td></tr><tr><td>E3518</td><td>工</td><td>PracticeWorkMathematics(Basic)Std.1</td></tr><tr><td>E3519</td><td></td><td>PracticeWorkMathematics(Standard)Std1</td></tr></table></body></html>'}, 'img_idx': 0}

I created a DataFrame from the html string using pd.read_html(). [image]

I want to attach the OCR confidence scores and bounding boxes to this DataFrame. I tried appending the lists as columns, but the order of the OCR results does not match the order of the cells in the table. Is there any way to attach them to the DataFrame? Expected output: [image]
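Before attaching anything, it can help to make the cell order of the html string explicit. The following is a minimal sketch using only the standard library: it walks the html and records each <td> with its row and column index. The class name CellLister is hypothetical, and the sketch assumes a simple table without rowspan or colspan.

```python
from html.parser import HTMLParser


class CellLister(HTMLParser):
    """Collect every <td> in document order with its row/column index.

    Hypothetical helper; assumes no rowspan/colspan attributes.
    """

    def __init__(self):
        super().__init__()
        self.cells = []      # one dict per <td>, in document order
        self._row = -1
        self._col = -1
        self._in_td = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row += 1
            self._col = -1
        elif tag == "td":
            self._col += 1
            self._in_td = True
            self.cells.append({"row": self._row, "col": self._col, "text": ""})

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_td = False

    def handle_data(self, data):
        if self._in_td:
            self.cells[-1]["text"] += data


# The 'html' value from the result quoted above
html = ('<html><body><table>'
        '<tr><td>E3509</td><td></td><td>Practice Work Mathematics-Std.9</td></tr>'
        '<tr><td>E3518</td><td>工</td><td>PracticeWorkMathematics(Basic)Std.1</td></tr>'
        '<tr><td>E3519</td><td></td><td>PracticeWorkMathematics(Standard)Std1</td></tr>'
        '</table></body></html>')

parser = CellLister()
parser.feed(html)
parser.close()
for cell in parser.cells:
    print(cell)
```

Each entry keeps the flat position of the cell in the html (its index in parser.cells) as well as its row and column, so values computed per cell can later be placed at the matching DataFrame position.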

arhamshah avatar Sep 14 '22 07:09 arhamshah

@MissPenguin help me on this

arhamshah avatar Sep 14 '22 08:09 arhamshah

You can get the OCR boxes, text, and scores by passing return_ocr_result_in_table=True when calling table_engine, for example:

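import cv2
from paddleocr import PPStructure

table_engine = PPStructure(show_log=True)

save_folder = './output'
img_path = '1.png'
img = cv2.imread(img_path)
# Ask the table engine to also return the OCR boxes, text, and scores
result = table_engine(img, return_ocr_result_in_table=True)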

Then you will see a result like this: [image]
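As a rough sketch of reading those values back out: in the outputs quoted in this thread, 'boxes' and 'rec_res' appear to be parallel lists (the i-th box belongs to the i-th text/score pair), and each box looks like [xmin, ymin, xmax, ymax]. Both points are assumptions drawn from the sample output, not documented guarantees, and the function name ocr_records is hypothetical.

```python
def ocr_records(res):
    """Pair each OCR box with its recognised text and confidence score.

    Assumes res['boxes'] and res['rec_res'] are parallel sequences and
    that each box is [xmin, ymin, xmax, ymax].
    """
    records = []
    for box, (text, score) in zip(res["boxes"], res["rec_res"]):
        x0, y0, x1, y1 = (float(v) for v in box)
        records.append({"text": text, "score": score, "bbox": (x0, y0, x1, y1)})
    return records


# Excerpt of the first result quoted in this thread (first three OCR boxes)
res = {
    "boxes": [[0., 2., 48., 25.], [91., 2., 276., 24.], [0., 24., 49., 50.]],
    "rec_res": [("E3509", 0.8876606822013855),
                ("Practice Work Mathematics-Std.9", 0.9079822301864624),
                ("E3518", 0.9605749249458313)],
}

records = ocr_records(res)
for rec in records:
    print(rec)
```

The same loop works when 'boxes' is a NumPy array, since iterating over it yields one row per box.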

WenmuZhou avatar Sep 15 '22 02:09 WenmuZhou

@WenmuZhou I did the same, but how should I attach the results to the html string? The issue is that 'boxes' and 'rec_res' are not sequential, i.e. they don't follow the table structure, whereas 'html' does, so I need to attach those results to the resulting html string.

arhamshah avatar Sep 15 '22 05:09 arhamshah

@WenmuZhou Please have a careful look at this result:

result = {'type': 'table', 'bbox': [0, 0, 430, 176], 'res': {'cell_bbox': [[2.9711382389068604, 2.527280330657959, 94.23353576660156, 2.7095680236816406, 93.45491027832031, 44.173255920410156, 2.7896084785461426, 43.96155548095703], [81.72722625732422, 2.226247787475586, 191.076171875, 2.3149428367614746, 191.9145050048828, 45.636451721191406, 81.57770538330078, 45.99460983276367], [166.70916748046875, 1.859868049621582, 425.1251525878906, 1.9312458038330078, 425.0083312988281, 42.467620849609375, 166.3208770751953, 42.756195068359375], [4.674074172973633, 41.24868392944336, 87.30780029296875, 41.453792572021484, 88.2344970703125, 84.01764678955078, 4.671993732452393, 83.96307373046875], [79.00847625732422, 42.8923454284668, 174.6012725830078, 42.909637451171875, 175.84304809570312, 84.50270080566406, 79.2442626953125, 84.84281921386719], [171.07443237304688, 43.35968017578125, 424.994873046875, 43.763336181640625, 424.918212890625, 86.53250122070312, 170.34542846679688, 86.52181243896484], [3.6962084770202637, 85.94416809082031, 88.9952163696289, 86.06139373779297, 89.80284881591797, 126.8577651977539, 3.71451473236084, 126.86785888671875], [78.10800170898438, 87.54734802246094, 172.7854766845703, 87.60906982421875, 173.79161071777344, 126.6048583984375, 78.3547592163086, 126.73477172851562], [166.9349365234375, 86.76020050048828, 424.0592346191406, 87.11247253417969, 424.1020202636719, 125.24484252929688, 168.2081298828125, 125.27180480957031], [2.341264247894287, 126.57464599609375, 90.25379943847656, 126.29664611816406, 89.6300048828125, 174.189208984375, 2.30151104927063, 174.1785125732422], [77.24783325195312, 128.11798095703125, 180.08106994628906, 127.7411117553711, 180.03985595703125, 173.86538696289062, 76.86996459960938, 173.87294006347656], [169.75401306152344, 128.0176239013672, 424.7585754394531, 127.8905029296875, 424.78594970703125, 172.87400817871094, 171.3323211669922, 172.8780059814453]], 'boxes': array([[176.,   9., 248.,  34.],
       [  6.,  12.,  81.,  34.],
       [ 99.,  12., 149.,  39.],
       [  6.,  54.,  83.,  79.],
       [ 91.,  52., 144.,  83.],
       [178.,  54., 292.,  75.],
       [  7.,  96.,  82., 122.],
       [ 94.,  91., 141., 122.],
       [179.,  96., 348., 117.],
       [  7., 141.,  83., 162.],
       [ 92., 136., 144., 166.],
       [178., 136., 419., 161.]]), 'rec_res': [('English', 0.889473557472229), ('C3001', 0.9285628199577332), ('55', 0.8031783103942871), ('C 3002', 0.883321225643158), ('55', 0.9964786767959595), ('Mathematics', 0.9479299187660217), ('C 3003', 0.9886908531188965), ('55', 0.9862130880355835), ('General Knowledge', 0.9595227241516113), ('C 3004', 0.974701464176178), ('55', 0.8086638450622559), ('Drawing (For Annual Exam)', 0.9427326917648315)], 'html': '<html><body><table><tr><td>C3001</td><td>55</td><td>English</td></tr><tr><td>C 3002</td><td>55</td><td>Mathematics</td></tr><tr><td>C 3003</td><td>55</td><td>General Knowledge</td></tr><tr><td>C 3004</td><td>55</td><td>Drawing (For Annual Exam)</td></tr></table></body></html>'}, 'img_idx': 0}

When parsing this, the first element of result['res']['rec_res'] is ('English', 0.889473557472229), with the matching first entry in result['res']['boxes'], but the first cell in result['res']['html'] is <td>C3001</td>. Hence I am asking for any way to merge them in such cases.

arhamshah avatar Sep 15 '22 05:09 arhamshah

They cannot currently be matched, so your requirement may not be met.
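The result does not carry this mapping itself, but an unofficial geometric workaround can be sketched. It assumes, based only on the sample outputs in this thread, that 'cell_bbox' lists one 8-value polygon per <td> in the same order as the html, and that each entry of 'boxes' is [xmin, ymin, xmax, ymax] and lines up with 'rec_res' by index. Each OCR box is assigned to the cell it overlaps most. The function names are hypothetical, and the cell values below are rounded from the first table row of the second result.

```python
def quad_to_rect(quad):
    """Convert an 8-value polygon [x1, y1, ..., x4, y4] to (xmin, ymin, xmax, ymax)."""
    xs, ys = quad[0::2], quad[1::2]
    return min(xs), min(ys), max(xs), max(ys)


def overlap_area(a, b):
    """Area of the intersection of two (xmin, ymin, xmax, ymax) rectangles."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(w, 0.0) * max(h, 0.0)


def assign_ocr_to_cells(cell_bbox, boxes, rec_res):
    """Return one list per cell with the (text, score, box) entries overlapping it most."""
    cells = [quad_to_rect(q) for q in cell_bbox]
    per_cell = [[] for _ in cells]
    for box, (text, score) in zip(boxes, rec_res):
        box = [float(v) for v in box]
        areas = [overlap_area(c, box) for c in cells]
        best = max(range(len(cells)), key=areas.__getitem__)
        if areas[best] > 0:  # skip OCR boxes that touch no cell at all
            per_cell[best].append((text, score, box))
    return per_cell


# First table row of the second result (cell coordinates rounded to 2 decimals)
cell_bbox = [
    [2.97, 2.53, 94.23, 2.71, 93.45, 44.17, 2.79, 43.96],
    [81.73, 2.23, 191.08, 2.31, 191.91, 45.64, 81.58, 45.99],
    [166.71, 1.86, 425.13, 1.93, 425.01, 42.47, 166.32, 42.76],
]
boxes = [[176., 9., 248., 34.], [6., 12., 81., 34.], [99., 12., 149., 39.]]
rec_res = [("English", 0.889473557472229),
           ("C3001", 0.9285628199577332),
           ("55", 0.8031783103942871)]

per_cell = assign_ocr_to_cells(cell_bbox, boxes, rec_res)
texts = [" ".join(t for t, _, _ in cell) for cell in per_cell]
print(texts)  # ['C3001', '55', 'English'], i.e. the html order
```

Because per_cell follows the cell order, its scores and boxes can be attached to the html cells by position. Predicted cell polygons overlap each other in these samples, so the largest-overlap rule is a heuristic and may misassign boxes in dense or merged-cell tables.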

WenmuZhou avatar Oct 10 '22 07:10 WenmuZhou

Hey @arhamshah, did you find any workaround for this issue?

parthplc avatar Dec 12 '22 05:12 parthplc