Change Mask to use coco-style segmentation by default
I suggest that instead of numpy arrays, masks are saved either as shapely polygons or COCO-style annotations. The reason is that for large images, such as Sentinel-2 satellite tiles (10980x10980 pixels), each time a mask is shifted and `Mask.get_shifted_mask()` is called, sahi creates a new empty array of the same size as the original image. This takes a lot of time compared to just shifting the coordinates with:
```python
def get_shifted_mask(self):
    # Confirm full_shape is specified
    if (self.full_shape_height is None) or (self.full_shape_width is None):
        raise ValueError("full_shape is None")
    shifted_segmentation = []
    for s in self.segmentation:
        xs = [min(self.shift_x + s[i], self.full_shape_width) for i in range(0, len(s) - 1, 2)]
        ys = [min(self.shift_y + s[i], self.full_shape_height) for i in range(1, len(s), 2)]
        shifted_segmentation.append([j for i in zip(xs, ys) for j in i])
    return Mask(
        segmentation=shifted_segmentation,
        shift_amount=[0, 0],
        full_shape=self.full_shape,
    )
```
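The same shift can be expressed as a standalone function (a hypothetical helper with no sahi dependencies, mirroring the method above) to show that it only touches the polygon vertices, never a full-size pixel array:

```python
# Hypothetical standalone version of the coordinate shift above: moves
# COCO-style [x1, y1, x2, y2, ...] polygons by (shift_x, shift_y),
# clamping coordinates to the full image size. Cost is O(number of
# vertices), independent of image resolution.
def shift_coco_segmentation(segmentation, shift_x, shift_y, full_width, full_height):
    shifted = []
    for s in segmentation:
        xs = [min(shift_x + s[i], full_width) for i in range(0, len(s) - 1, 2)]
        ys = [min(shift_y + s[i], full_height) for i in range(1, len(s), 2)]
        shifted.append([coord for pair in zip(xs, ys) for coord in pair])
    return shifted

# A triangle shifted by (100, 200) inside a 10980x10980 image:
print(shift_coco_segmentation([[0, 0, 10, 0, 10, 10]], 100, 200, 10980, 10980))
# [[100, 200, 110, 200, 110, 210]]
```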
For example, in my use case I'm slicing 10980x10980 tiles into 320x320 px slices with an overlap of 0.2, giving around 1850 slices. Shifting each predicted mask took around one second, and these images can contain several hundred, if not over two thousand, objects of interest, meaning that just shifting the masks takes several times longer than getting the actual predictions.
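The slice count can be sanity-checked with a bit of arithmetic (a sketch assuming slices are laid out at a fixed stride of `slice_size * (1 - overlap)`, which is roughly how slicing grids are computed):

```python
import math

def count_slices(image_size, slice_size, overlap):
    # stride between consecutive slice origins: 320 * (1 - 0.2) = 256
    stride = int(slice_size * (1 - overlap))
    # slice origins needed to cover one axis, last slice snapped to the edge
    per_axis = math.ceil((image_size - slice_size) / stride) + 1
    return per_axis * per_axis

print(count_slices(10980, 320, 0.2))  # 1849, i.e. "around 1850"
```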
Merging might need a bit of fixing, as I'm not really happy with how the code looks, but it should work like the current implementation. I briefly considered using RLE for this, but it requires boolean masks first, and creating them is slow.
Just for the note: `.buffer(0)` is a quick hack to fix somehow-invalid polygons, and the if-else ensures that only polygons (not lines or points) are used to construct the `MultiPolygon`.
```python
from shapely.geometry import MultiPolygon, Polygon

# imports below assume sahi's current module layout
from sahi.annotation import Mask
from sahi.prediction import ObjectPrediction
from sahi.utils.shapely import ShapelyAnnotation, get_shapely_multipolygon


def get_merged_mask(pred1: ObjectPrediction, pred2: ObjectPrediction) -> Mask:
    mask1 = pred1.mask
    mask2 = pred2.mask
    poly1 = get_shapely_multipolygon(mask1.segmentation).buffer(0)
    poly2 = get_shapely_multipolygon(mask2.segmentation).buffer(0)
    union_poly = poly1.union(poly2)
    if not hasattr(union_poly, "geoms"):
        union_poly = MultiPolygon([union_poly])
    else:
        union_poly = MultiPolygon([g.buffer(0) for g in union_poly.geoms if isinstance(g, Polygon)])
    union = ShapelyAnnotation(multipolygon=union_poly).to_coco_segmentation()
    return Mask(
        segmentation=union,
        full_shape=mask1.full_shape,
        shift_amount=mask1.shift_amount,
    )
```
I hope that I found all the places this change affects. For models, this means constructing `ObjectPrediction` with `segmentation` instead of `bool_mask`. Another way is to create a method `ObjectPrediction.from_bool_mask` and use it.
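To illustrate the `from_bool_mask` idea, here is a deliberately oversimplified sketch (hypothetical name and behavior; a real implementation would trace the mask contours, e.g. with `cv2.findContours`, rather than just taking the bounding rectangle of the foreground pixels):

```python
import numpy as np

def bool_mask_to_coco_segmentation(bool_mask: np.ndarray) -> list:
    """Convert a boolean mask to a COCO-style segmentation.

    Oversimplified sketch: emits only the bounding rectangle of the
    foreground pixels, so it is exact only for rectangular masks.
    """
    ys, xs = np.nonzero(bool_mask)
    if len(xs) == 0:
        return []
    x0, x1 = int(xs.min()), int(xs.max()) + 1
    y0, y1 = int(ys.min()), int(ys.max()) + 1
    # one polygon as a flat [x1, y1, x2, y2, ...] list
    return [[x0, y0, x1, y0, x1, y1, x0, y1]]

mask = np.zeros((8, 8), dtype=bool)
mask[2:5, 3:6] = True
print(bool_mask_to_coco_segmentation(mask))  # [[3, 2, 6, 2, 6, 5, 3, 5]]
```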
When I do instance segmentation prediction, the code reports an error, but my prediction using the official version was successful:
```
Traceback (most recent call last):
  File "D:/pytorch1.7.1/mmdetection-3.x/sahi_batch_for_stone.py", line 29, in

Process finished with exit code 1
```
Finally got time to work on this. The only breaking change I found is that each time we use `detection_model.get_prediction()` or `detection_model.convert_original_predictions` with mask output, `full_shape` must be included, as it is required for mask shifting.
This also adds yolov8 segmentation support, mostly similar to that in #918.
This code still uses bounding boxes in the post-processing to merge the tiles, which results in cut-off masks, since masks inside a box might be removed. A better approach would be to look at the mask polygons and calculate IoU etc. from those. Here is a paper: https://isprs-annals.copernicus.org/articles/V-2-2022/291/2022/ which also includes some examples of the current solution's problem.
@Preburk good point, we probably should try to implement NMS for polygons. Using shapely should work, because IoU is just:
```python
def poly_IoU(poly_1: Polygon, poly_2: Polygon) -> float:
    area_intersection = poly_1.intersection(poly_2).area
    area_union = poly_1.union(poly_2).area
    iou = area_intersection / area_union
    return iou
```
Though I still have to check more carefully how merging the slices is handled in sahi.
Shapely is too slow; one should first use bounding boxes to find all intersecting targets, and only then use Shapely on those candidates.
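A minimal pure-Python sketch of that prefilter (hypothetical helper names; the surviving pairs would then be passed to the polygon IoU above):

```python
def boxes_intersect(a, b):
    # boxes as (xmin, ymin, xmax, ymax); open-interval overlap test
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

def candidate_pairs(boxes):
    # naive O(n^2) prefilter; a spatial index (e.g. shapely's STRtree)
    # would scale better for thousands of detections
    return [
        (i, j)
        for i in range(len(boxes))
        for j in range(i + 1, len(boxes))
        if boxes_intersect(boxes[i], boxes[j])
    ]

boxes = [(0, 0, 10, 10), (5, 5, 15, 15), (20, 20, 30, 30)]
print(candidate_pairs(boxes))  # [(0, 1)]
```

Only the pairs whose boxes overlap need an exact polygon intersection, so the expensive Shapely work is done on a small fraction of all pairs.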
Hey @mayrajeo, this PR's commit history contains uncompressed large-file commits such as https://github.com/obss/sahi/blob/e114ee74a6ff4795ad62ab5a28c7244c49ae1d20/demo/demo_data/prediction_visual.png. Can you please open a new PR without the large image file? I will gladly accept your PR; it looks very promising 💯
Continues in #1039