Skewed bounding boxes

Open JKrivec opened this issue 1 year ago • 4 comments

Hello,

When supplying the bounding boxes, I noticed that the degraded bounding boxes are not what I really imagined them to be.

image

The red text and the bounding boxes are what I pulled out of the original pdf, before degrading using Augraphy. Shouldn't the degraded boxes be the ones I outlined in blue, so the whole original object is outlined?

This is even more obvious when you look at it at the larger scale:

image

The larger bounding box around the table is supposed to encompass the table, but here we can see that some of the textual boxes are now outside of the actual area of interest.

Is this a feature or a bug?

JKrivec avatar Jul 17 '24 17:07 JKrivec

Hi, as stated in the documentation, only the start point and end point of the box are transformed:

https://augraphy.readthedocs.io/en/latest/doc/source/augmentations/folding.html

So this should be consistent with your observation?

kwcckw avatar Jul 18 '24 01:07 kwcckw

Yes, this is exactly what is stated in the docs, so I guess this is a feature, not a bug :).

However, the second image I uploaded is a mix of Geometric and Folding, and for the larger bounding box this is mostly an "issue" with the rotation. I would say the correct approach is to rotate the whole bounding box, then take the bottom-left and top-right extremes of the rotated shape and use those as the new bounding box.
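To make the suggestion concrete, here is a minimal sketch in plain Python. It assumes boxes are stored as (x0, y0, x1, y1) and that the rotation is a pure rotation by a known angle about a known center; the function names are hypothetical and are not part of Augraphy's API.

```python
import math


def rotate_corners(box, angle_deg, center):
    """Rotate all four corners of an axis-aligned box about `center`.

    `box` is (x0, y0, x1, y1). The box layout and this helper are
    illustrative assumptions, not Augraphy functions.
    """
    x0, y0, x1, y1 = box
    cx, cy = center
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    corners = [(x0, y0), (x1, y0), (x1, y1), (x0, y1)]
    rotated = []
    for x, y in corners:
        dx, dy = x - cx, y - cy
        rotated.append((cx + dx * cos_t - dy * sin_t,
                        cy + dx * sin_t + dy * cos_t))
    return rotated


def rotated_bounding_box(box, angle_deg, center):
    """Axis-aligned box that encloses the rotated original box."""
    points = rotate_corners(box, angle_deg, center)
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))
```

Because all four corners are rotated rather than only the start and end points, the resulting rectangle always contains the whole rotated object, at the cost of including some empty area at the corners.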

If you were to label the table in the second image, where would you put the rectangle? I think you would want it to encompass the whole object. I am not a computer vision specialist, so I'm not sure what the correct way is. Maybe this issue is just opening a debate about how the bounding box computation should be approached.

JKrivec avatar Jul 18 '24 09:07 JKrivec

Right, there should be a better solution to this problem. For example, for clockwise rotation, it should take top-left and bottom-right of the box, while for anticlockwise rotation (your example), it should take top-right and bottom-left of the box.

Thanks for pointing this out. You are welcome to submit a pull request if you are able to create a better alternative that addresses this problem.

kwcckw avatar Jul 18 '24 13:07 kwcckw

Yeah, the bounding box should just be the (min(all_x), min(all_y), max(all_x), max(all_y)) in my opinion. I'm currently very low on time, but I might give it a crack in a few months! Feel free to close this, thanks :)
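As a rough sketch of that rule, the helper below applies an arbitrary point mapping to all four corners and then takes the minimum and maximum. Both the `transform` callable and the function name are hypothetical, chosen only for illustration; for non-affine distortions such as folding, the four corners alone may not bound the warped shape, so sampling extra points along the edges would be safer.

```python
def enclosing_box_after(box, transform):
    """Return (min(all_x), min(all_y), max(all_x), max(all_y)).

    `box` is (x0, y0, x1, y1) and `transform` is any callable mapping
    (x, y) -> (x, y). Both are assumptions made for this example.
    """
    x0, y0, x1, y1 = box
    corners = [(x0, y0), (x1, y0), (x1, y1), (x0, y1)]
    moved = [transform(x, y) for x, y in corners]
    all_x = [x for x, _ in moved]
    all_y = [y for _, y in moved]
    return (min(all_x), min(all_y), max(all_x), max(all_y))


def quarter_turn(x, y):
    """Example mapping: a 90 degree rotation about the origin."""
    return (-y, x)


# Mapping only the start and end points gives (0, 0) and (-1, 2),
# which is not even a valid (min, max) pair; using every corner is.
print(enclosing_box_after((0, 0, 2, 1), quarter_turn))
```

The same helper works unchanged for clockwise and anticlockwise rotations, so no per-direction choice of corners is needed.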

JKrivec avatar Jul 18 '24 14:07 JKrivec