
fix float division by zero

Open tuyenta opened this issue 3 years ago • 1 comments

In some cases, the area of a text bounding box is 0. When that happens, the function text_in_bbox(bbox, text) crashes with a float division by zero, because the check if (bbox_intersection_area(ba, bb) / bbox_area(ba)) > 0.8: divides by bbox_area(ba) = 0.

I therefore changed the condition to if (bbox_area(ba) > 0) and ((bbox_intersection_area(ba, bb) / bbox_area(ba)) > 0.8), which handles this case: the area check short-circuits before the division runs.
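A minimal sketch of the guarded check, assuming boxes are plain (x0, y0, x1, y1) tuples. The two helper bodies are simplified stand-ins written for this example, not camelot's implementation, and mostly_inside is a hypothetical name used only here:

```python
def bbox_area(bb):
    # Stand-in helper (assumption): area of an (x0, y0, x1, y1) box.
    x0, y0, x1, y1 = bb
    return max(0.0, x1 - x0) * max(0.0, y1 - y0)


def bbox_intersection_area(ba, bb):
    # Stand-in helper (assumption): overlap area of two boxes, 0.0 if disjoint.
    x_left = max(ba[0], bb[0])
    y_bottom = max(ba[1], bb[1])
    x_right = min(ba[2], bb[2])
    y_top = min(ba[3], bb[3])
    if x_right <= x_left or y_top <= y_bottom:
        return 0.0
    return (x_right - x_left) * (y_top - y_bottom)


def mostly_inside(ba, bb, threshold=0.8):
    # Hypothetical wrapper around the condition from this issue.
    # The area > 0 test runs first, so the division is never reached
    # for a degenerate (zero-area) box.
    area = bbox_area(ba)
    return area > 0 and bbox_intersection_area(ba, bb) / area > threshold


# A zero-width box no longer raises ZeroDivisionError; it is simply rejected.
print(mostly_inside((0, 0, 0, 5), (0, 0, 10, 10)))
# A normal box fully inside the other one still passes the check.
print(mostly_inside((1, 1, 2, 2), (0, 0, 10, 10)))
```

Checking the area first means zero-area boxes are treated as not overlapping, which matches the behavior of the proposed condition.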

tuyenta avatar Feb 23 '22 10:02 tuyenta

Hey!

As camelot is dead, we are trying to build a maintained fork at pypdf_table_extraction.

Do you want to open the PR against that branch so that we can merge your improvement?

MartinThoma avatar Feb 25 '24 11:02 MartinThoma

Thanks for your contribution. This has been merged in a different PR.

bosd avatar Dec 29 '24 12:12 bosd