if (bbox_intersection_area(ba, bb) / bbox_area(ba)) > 0.8: ZeroDivisionError: float division by zero

Open arjungandeeva opened this issue 1 year ago • 6 comments

I'm encountering ZeroDivisionError: float division by zero in camelot-py, in the check that divides bbox_intersection_area(ba, bb) by bbox_area(ba). It occurs under certain conditions, likely when the area of the bounding box ba is zero.

arjungandeeva avatar Apr 04 '24 05:04 arjungandeeva

I did a quick fix/hack to circumvent the error by skipping over the area check if ba is singular (area is zero):

~/.pyenv/versions/3.11.3/lib/python3.11/site-packages/camelot/utils.py: Line 375:

            if bbox_area(ba) > 0 and bbox_intersect(ba, bb):
                # if the intersection is larger than 80% of ba's size, we keep the longest
                if (bbox_intersection_area(ba, bb) / bbox_area(ba)) > 0.8:
                    if bbox_longer(bb, ba):
                        rest.discard(ba)
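
For anyone who wants to try the guard in isolation, here is a minimal sketch. The helpers below are simplified stand-ins written for this example, not camelot's actual implementations; they assume a box is an (x0, y0, x1, y1) tuple, and overlap_ratio is a hypothetical name that does not exist in camelot.

```python
# Simplified stand-ins for the bbox helpers (assumption: not camelot's real code).
# A box is assumed to be an (x0, y0, x1, y1) tuple of floats.
def bbox_area(bb):
    x0, y0, x1, y1 = bb
    return max(0.0, x1 - x0) * max(0.0, y1 - y0)


def bbox_intersection_area(ba, bb):
    # Width and height of the overlapping region, clamped at zero.
    w = min(ba[2], bb[2]) - max(ba[0], bb[0])
    h = min(ba[3], bb[3]) - max(ba[1], bb[1])
    return max(0.0, w) * max(0.0, h)


def overlap_ratio(ba, bb):
    """Fraction of ba covered by bb (hypothetical helper).

    Returns 0.0 when ba has zero area instead of dividing by zero.
    """
    area = bbox_area(ba)
    if area <= 0:
        return 0.0
    return bbox_intersection_area(ba, bb) / area


flat = (10.0, 20.0, 10.0, 40.0)  # zero width, so zero area
box = (0.0, 0.0, 50.0, 50.0)

print(overlap_ratio(flat, box))                    # 0.0, no exception
print(overlap_ratio((0.0, 0.0, 10.0, 10.0), box))  # 1.0, fully covered
```

Treating a zero-area box as "no overlap" means it never passes the 0.8 threshold, which has the same effect as skipping the check for that box.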

cktse avatar Apr 10 '24 12:04 cktse

Hey!

As discussed in https://github.com/camelot-dev/camelot/issues/343, we are trying to build a maintained fork at pypdf_table_extraction.

Do you want to check that code and open an issue / PR there to include this fix?

bosd avatar Aug 06 '24 12:08 bosd

I just took another look at the branches. It looks like this has already been fixed as part of "Release camelot-fork 0.20.1", which is already included in your fork.

cktse avatar Aug 12 '24 09:08 cktse

Thanks for checking 👍

bosd avatar Aug 12 '24 11:08 bosd

Great to see camelot lives on!

BTW is this fork going to be packaged on pip under a separate name? Think the current package is stale from the main branch.

cktse avatar Aug 12 '24 11:08 cktse

> BTW is this fork going to be packaged on pip under a separate name? Think the current package is stale from the main branch.

Yes, it is published here: https://pypi.org/project/pypdf-table-extraction/

We're currently working on a new release by merging the open PRs from this repo and rebranding the package.

bosd avatar Aug 12 '24 17:08 bosd