Support for bounding box cropping

Open vanga opened this issue 2 years ago • 4 comments

This was originally initiated as part of https://github.com/rom1504/img2dataset/pull/233. https://github.com/rom1504/img2dataset/pull/241 has been implemented since then.

This issue is just to document the spec and implementation more formally and to get feedback. We will introduce a new flag called --bbox_operation, which defaults to BLUR, and add a new implementation for CROP. The format of the bounding box values will be the same as the existing one, which is [x_min, y_min, x_max, y_max].
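To make the proposal concrete, here is a minimal sketch of how the option and the bounding box format could be modeled. Only the flag name `--bbox_operation`, its values BLUR and CROP, and the `[x_min, y_min, x_max, y_max]` layout come from this issue; the `BBoxOperation` enum and both helper functions are hypothetical names used for illustration and are not part of img2dataset.

```python
from enum import Enum
from typing import Sequence, Tuple


class BBoxOperation(Enum):
    """Hypothetical enum mirroring the proposed --bbox_operation values."""

    BLUR = "BLUR"
    CROP = "CROP"


def parse_bbox_operation(value: str = "BLUR") -> BBoxOperation:
    """Parse the flag value case-insensitively; BLUR is the proposed default."""
    try:
        return BBoxOperation(value.upper())
    except ValueError:
        raise ValueError(f"bbox_operation must be BLUR or CROP, got {value!r}") from None


def validate_bbox(bbox: Sequence[float]) -> Tuple[float, float, float, float]:
    """Check that bbox is [x_min, y_min, x_max, y_max] and encloses a non-empty area."""
    if len(bbox) != 4:
        raise ValueError(f"expected 4 values, got {len(bbox)}")
    x_min, y_min, x_max, y_max = bbox
    if not (x_min < x_max and y_min < y_max):
        raise ValueError(f"min values must be smaller than max values: {list(bbox)}")
    return (x_min, y_min, x_max, y_max)
```

Keeping the operation as an enum rather than a free-form string would let an invalid value fail early, before any image is downloaded.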

Right now, the bbox computation seems to happen in the middle of the resize operations.

For bounding box cropping, it has been recommended by @rom1504 to do the cropping before any resize happens.

So the question I have is: should we leave the existing blurring logic as is and do the CROP bbox processing before the resizing code, i.e. before this line? https://github.com/rom1504/img2dataset/blob/fc3fb2eeb3b4b0c352f4dff622440b5c13100312/img2dataset/resizer.py#L178

Or, should we call the blurring also before any resize operations?

vanga avatar Feb 02 '23 12:02 vanga

cc @GeorgiosSmyrnis

vanga avatar Feb 02 '23 12:02 vanga

@GeorgiosSmyrnis Was it intentional to call blurring between these two transformations? https://github.com/rom1504/img2dataset/blob/fc3fb2eeb3b4b0c352f4dff622440b5c13100312/img2dataset/resizer.py#L182-L186

vanga avatar Feb 02 '23 12:02 vanga

Hello!

The positioning for this was indeed intentional. The code that performs blurring depends on the size of the bounding box in pixels (to properly calculate the size of the kernel) so blurring after resizing is needed. At the same time, cropping/padding the image also alters the bounding box coordinates, so blurring needs to happen before cropping/padding the image.

For supporting a CROP use of bounding boxes, I agree that it's probably best to happen after resizing, in contrast to the BLUR use. The bounding boxes themselves are already available before resizing.

So I think the best way to go about this is to have the two uses of bounding boxes be separate and in different positions in the code.
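The dependence on image size can be illustrated with a small sketch. It assumes bounding boxes are given in normalized (0 to 1) coordinates, which this thread does not state, and it uses a nested list as a stand-in for an image. The helpers `bbox_to_pixels`, `crop` and `box_extent` are hypothetical and do not correspond to functions in resizer.py.

```python
def bbox_to_pixels(bbox, width, height):
    """Scale a normalized [x_min, y_min, x_max, y_max] box to integer pixel coordinates."""
    x_min, y_min, x_max, y_max = bbox
    return (int(x_min * width), int(y_min * height), int(x_max * width), int(y_max * height))


def crop(image, bbox):
    """Crop an image (a list of rows) to bbox, using the image's own dimensions."""
    height, width = len(image), len(image[0])
    x0, y0, x1, y1 = bbox_to_pixels(bbox, width, height)
    return [row[x0:x1] for row in image[y0:y1]]


def box_extent(bbox, width, height):
    """Pixel width and height of the box, the quantity a blur kernel size would depend on."""
    x0, y0, x1, y1 = bbox_to_pixels(bbox, width, height)
    return (x1 - x0, y1 - y0)


# Stand-in for an original image: 8 pixels wide, 4 pixels tall, each pixel stores (x, y).
original = [[(x, y) for x in range(8)] for y in range(4)]
bbox = [0.25, 0.5, 0.75, 1.0]

# CROP needs only the box and the image it is applied to, here the original.
cropped = crop(original, bbox)

# BLUR: the same box covers a different number of pixels before and after a 2x downscale.
extent_before_resize = box_extent(bbox, width=8, height=4)
extent_after_resize = box_extent(bbox, width=4, height=2)
```

In this sketch the box spans 4x2 pixels on the original but only 2x1 after downscaling, which is why a kernel sized from the box has to be computed on the image as it exists at blur time.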

GeorgiosSmyrnis avatar Feb 02 '23 19:02 GeorgiosSmyrnis

Thanks @GeorgiosSmyrnis. We can keep the blurring flow as is. With respect to CROP, @rom1504 is of the opinion that it should happen before any resize, and that the bounding box dimensions should be relative to the original image dimensions.

@rom1504 could you please confirm?

vanga avatar Feb 06 '23 02:02 vanga