Change Mask to use coco-style segmentation by default

Open mayrajeo opened this issue 1 year ago • 4 comments

I suggest that masks be stored as shapely polygons or COCO-style annotations instead of numpy arrays. For large images, such as Sentinel-2 satellite tiles (10980x10980 pixels), every call to Mask.get_shifted_mask() makes sahi allocate a new empty array the same size as the original image. This takes far longer than shifting just the coordinates with

    def get_shifted_mask(self):
        # Confirm full_shape is specified
        if (self.full_shape_height is None) or (self.full_shape_width is None):
            raise ValueError("full_shape is None")
        shifted_segmentation = []
        for s in self.segmentation:
            xs = [min(self.shift_x + s[i], self.full_shape_width) for i in range(0, len(s) - 1, 2)]
            ys = [min(self.shift_y + s[i], self.full_shape_height) for i in range(1, len(s), 2)]
            shifted_segmentation.append([j for i in zip(xs, ys) for j in i])
        return Mask(
            segmentation=shifted_segmentation,
            shift_amount=[0, 0],
            full_shape=self.full_shape,
        )
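The coordinate shift can be tried outside sahi with a small standalone function. This is only a sketch of the same idea as the method above: `shift_coco_segmentation` and its parameters are hypothetical names, not part of sahi, and the segmentation is assumed to be a list of flat `[x0, y0, x1, y1, ...]` polygons.

```python
def shift_coco_segmentation(segmentation, shift_x, shift_y, full_width, full_height):
    """Shift flat [x0, y0, x1, y1, ...] polygons and clamp them to the full image size.

    Hypothetical helper for illustration; it mirrors the clamping with min()
    used in the proposed get_shifted_mask().
    """
    shifted = []
    for polygon in segmentation:
        # Even indices are x coordinates, odd indices are y coordinates.
        xs = [min(x + shift_x, full_width) for x in polygon[0::2]]
        ys = [min(y + shift_y, full_height) for y in polygon[1::2]]
        # Interleave back into the flat COCO layout.
        shifted.append([coord for point in zip(xs, ys) for coord in point])
    return shifted


# A rectangle predicted inside a slice whose top-left corner sits at (256, 512).
slice_polygon = [[10, 10, 50, 10, 50, 40, 10, 40]]
shifted = shift_coco_segmentation(slice_polygon, 256, 512, 10980, 10980)
print(shifted)  # [[266, 522, 306, 522, 306, 552, 266, 552]]
```

Only the polygon vertices are touched, so the cost depends on the number of vertices rather than on the size of the full image.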

For example, in my use case I slice 10980x10980 tiles into 320x320 px slices with an overlap of 0.2, which gives around 1850 slices. Shifting each predicted mask took around one second, and these images can contain several hundred, if not over two thousand, objects of interest, so just shifting the masks takes several times longer than computing the actual predictions.
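The slice count quoted above can be checked with a little arithmetic. The sketch below assumes a plain sliding window whose stride is the slice size minus the overlap in pixels, with a final window that reaches the image border; `slices_per_axis` is a hypothetical helper, and sahi's own slicing code may differ in details.

```python
def slices_per_axis(image_size, slice_size, overlap_ratio):
    """Number of sliding-window positions along one axis (hypothetical helper)."""
    overlap = int(slice_size * overlap_ratio)  # overlap in pixels
    stride = slice_size - overlap
    if image_size <= slice_size:
        return 1
    # Ceiling division: windows needed to cover the part beyond the first slice.
    return -(-(image_size - slice_size) // stride) + 1


per_axis = slices_per_axis(10980, 320, 0.2)
total_slices = per_axis ** 2
print(per_axis, total_slices)  # 43 1849
```

With a stride of 256 px this gives 43 positions per axis, or 1849 slices per tile, which agrees with the estimate of around 1850.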

Merging might need a bit of fixing, as I'm not really happy with how the code looks, but it should behave like the current implementation. I briefly considered using RLE for this, but it requires boolean masks first, and creating them is slow.

A note on the code: .buffer(0) is a quick hack to repair invalid polygons, and the if-else ensures that only polygons (not lines or points) are used to construct the MultiPolygon.

def get_merged_mask(pred1: ObjectPrediction, pred2: ObjectPrediction) -> Mask:
    mask1 = pred1.mask
    mask2 = pred2.mask

    poly1 = get_shapely_multipolygon(mask1.segmentation).buffer(0)
    poly2 = get_shapely_multipolygon(mask2.segmentation).buffer(0)
    union_poly = poly1.union(poly2)
    if not hasattr(union_poly, 'geoms'):
        union_poly = MultiPolygon([union_poly])
    else:
        union_poly = MultiPolygon([g.buffer(0) for g in union_poly.geoms if isinstance(g, Polygon)])
    union = ShapelyAnnotation(multipolygon=union_poly).to_coco_segmentation()
    return Mask(
        segmentation=union,
        full_shape=mask1.full_shape,
        shift_amount=mask1.shift_amount,
    )
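The two tricks mentioned in the note, `.buffer(0)` and keeping only polygon parts, can be seen in isolation in the sketch below. It uses only shapely; `as_multipolygon` is a hypothetical helper name, not a sahi or shapely function.

```python
from shapely.geometry import MultiPolygon, Polygon

# A self-intersecting "bowtie" ring is an invalid polygon.
bowtie = Polygon([(0, 0), (2, 2), (2, 0), (0, 2)])
repaired = bowtie.buffer(0)  # zero-width buffer rebuilds a valid geometry
print(bowtie.is_valid, repaired.is_valid)


def as_multipolygon(geom):
    """Wrap a geometry as a MultiPolygon, dropping lines, points and empty parts."""
    parts = geom.geoms if hasattr(geom, "geoms") else [geom]
    return MultiPolygon([p for p in parts if isinstance(p, Polygon) and not p.is_empty])


square_a = Polygon([(0, 0), (1, 0), (1, 1), (0, 1)])
square_b = Polygon([(5, 5), (6, 5), (6, 6), (5, 6)])
merged = as_multipolygon(square_a.union(square_b))  # disjoint squares stay separate
single = as_multipolygon(square_a)                  # a lone Polygon gets wrapped
print(len(merged.geoms), len(single.geoms))
```

One caveat: `.buffer(0)` can discard part of a self-intersecting shape, so the repaired area is not guaranteed to match the intended one. If that matters, `shapely.validation.make_valid` may be worth evaluating instead.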

I hope I found all the places this change affects. For models, this means constructing ObjectPrediction with segmentation instead of bool_mask. Another option is to add an ObjectPrediction.from_bool_mask method and use that.

mayrajeo, Jun 05 '23 10:06

When I run instance segmentation prediction with this code, it raises an error, although prediction with the official version succeeds:

Traceback (most recent call last):
  File "D:/pytorch1.7.1/mmdetection-3.x/sahi_batch_for_stone.py", line 29, in <module>
    predict(
  File "D:\pytorch1.7.1\mmdetection-3.x\sahi\predict.py", line 518, in predict
    detection_model.load_model()
  File "D:\pytorch1.7.1\mmdetection-3.x\sahi\models\mmdet.py", line 39, in load_model
    self.set_model(model)
  File "D:\pytorch1.7.1\mmdetection-3.x\sahi\models\mmdet.py", line 54, in set_model
    category_mapping = {str(ind): category_name for ind, category_name in enumerate(self.category_names)}
  File "D:\pytorch1.7.1\mmdetection-3.x\sahi\models\mmdet.py", line 103, in category_names
    if type(self.model.CLASSES) == str:
  File "d:\anaconda3\envs\wssis\lib\site-packages\torch\nn\modules\module.py", line 1130, in __getattr__
    raise AttributeError("'{}' object has no attribute '{}'".format(
AttributeError: 'CascadeRCNN' object has no attribute 'CLASSES'

Process finished with exit code 1

W-hary, Nov 04 '23 21:11

Finally got time to work on this. The only breaking change I found is that every call to detection_model.get_prediction() or detection_model.convert_original_predictions() with mask output must now include full_shape, as it is required for mask shifting.

This also adds YOLOv8 segmentation support, mostly similar to #918.

mayrajeo, Mar 19 '24 11:03

This code still uses bounding boxes in post-processing to merge the tiles. This results in cut-off masks, since masks inside a box might be removed. A better approach would be to look at the mask polygons and calculate IoU etc. from those. Here is a paper: https://isprs-annals.copernicus.org/articles/V-2-2022/291/2022/ which also includes some examples of the problems with the current solution.

Preburk, Apr 19 '24 13:04

@Preburk good point, I should probably try to implement NMS for polygons. Using shapely should work because IoU is just

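from shapely.geometry import Polygon

def poly_IoU(poly_1: Polygon, poly_2: Polygon) -> float:
    # IoU = intersection area / union area
    area_intersection = poly_1.intersection(poly_2).area
    area_union = poly_1.union(poly_2).area
    # Guard against empty geometries, where the union area is zero
    iou = area_intersection / area_union if area_union > 0 else 0.0
    return iou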

Though I still have to check more carefully how merging the slices is handled in sahi.
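As a rough idea of what NMS on polygons could look like, here is a minimal greedy sketch built on the same IoU definition. It is not how sahi's post-processing is implemented; `polygon_nms`, its parameters and the 0.5 threshold are assumptions made for illustration.

```python
from shapely.geometry import Polygon


def poly_iou(poly_1, poly_2):
    """Intersection over union of two shapely polygons (0.0 for empty input)."""
    area_union = poly_1.union(poly_2).area
    return poly_1.intersection(poly_2).area / area_union if area_union > 0 else 0.0


def polygon_nms(polygons, scores, iou_threshold=0.5):
    """Greedy NMS: keep the best-scoring polygon, drop others that overlap it too much."""
    order = sorted(range(len(polygons)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep i only if it does not overlap any already kept polygon too strongly.
        if all(poly_iou(polygons[i], polygons[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


squares = [
    Polygon([(0, 0), (10, 0), (10, 10), (0, 10)]),
    Polygon([(1, 0), (11, 0), (11, 10), (1, 10)]),      # IoU with the first is 90/110
    Polygon([(20, 20), (30, 20), (30, 30), (20, 30)]),  # disjoint from both
]
kept = polygon_nms(squares, scores=[0.9, 0.8, 0.7])
print(kept)  # [0, 2]
```

This version compares every candidate against every kept polygon, so it only shows the logic; merging predictions across slices would also need a decision on whether overlapping masks are suppressed or unioned.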

mayrajeo, Apr 23 '24 06:04

Shapely is too slow; one should first use bounding boxes to find all intersecting targets, and only then use Shapely.
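The bounding-box prefilter suggested here can be sketched in plain Python. Boxes are assumed to be `(minx, miny, maxx, maxy)` tuples, which is also the layout of a shapely geometry's `.bounds`; `boxes_intersect` and `candidate_pairs` are hypothetical helper names.

```python
def boxes_intersect(a, b):
    """True if two (minx, miny, maxx, maxy) boxes overlap with non-zero area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def candidate_pairs(boxes):
    """Index pairs whose bounding boxes overlap; only these need an exact polygon test."""
    pairs = []
    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            if boxes_intersect(boxes[i], boxes[j]):
                pairs.append((i, j))
    return pairs


boxes = [(0, 0, 10, 10), (5, 5, 15, 15), (20, 20, 30, 30)]
pairs = candidate_pairs(boxes)
print(pairs)  # [(0, 1)]
```

The exact polygon intersection and union would then be computed only for the returned pairs. The pairwise loop is still quadratic in the number of boxes, so for thousands of objects a spatial index such as shapely's `STRtree` might be a better fit.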

lansfair, May 16 '24 03:05

Hey @mayrajeo, this PR's commit history contains commits with a large uncompressed file, such as https://github.com/obss/sahi/blob/e114ee74a6ff4795ad62ab5a28c7244c49ae1d20/demo/demo_data/prediction_visual.png. Can you please open a new PR without the large image file? I will gladly accept your PR, it looks very promising 💯

fcakyon, May 20 '24 01:05

Continues in #1039

mayrajeo, May 20 '24 10:05