supervision [DetectionDataset] - extend `from_coco` and `as_coco` with support for masks in RLE format

Description

The COCO dataset format allows for the storage of segmentation masks in two ways:

Polygon Masks: These masks use a series of vertices on an x-y plane to represent segmented object areas. The vertices are connected by straight lines to form polygons that approximate the shapes of objects.
Run-Length Encoding (RLE): RLE compresses segments of pixels into counts of consecutive pixels (runs). This method efficiently sequences pixels by reporting the number of pixels that are either foreground or background. For instance, starting from the top left of an image, the encoding might record '5 white pixels, 3 black pixels, 6 white pixels', and so on.
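To make the two representations concrete, here is a small sketch of how the same 4x4 mask could look in each form. It assumes the uncompressed COCO RLE layout, where `size` is `[height, width]` and `counts` alternates background and foreground runs in column-major order, starting with a background run. The polygon coordinates are illustrative only; the exact vertices depend on how the contour is traced.

```python
import numpy as np

# A 4x4 mask with a 2x2 foreground square in the middle.
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True

# Polygon form: one flat [x1, y1, x2, y2, ...] list per polygon.
# Illustrative vertices; a real contour tracer may place them differently.
polygon_segmentation = [[1.0, 1.0, 3.0, 1.0, 3.0, 3.0, 1.0, 3.0]]

# Uncompressed RLE form: runs counted column by column,
# alternating background / foreground, background first.
rle_segmentation = {"size": [4, 4], "counts": [5, 2, 2, 2, 5]}

# Decode the runs to check that they describe the same mask.
counts = rle_segmentation["counts"]
values = np.arange(len(counts)) % 2 == 1  # False, True, False, ...
decoded = np.repeat(values, counts).reshape(rle_segmentation["size"], order="F")
```

The runs sum to 16, the number of pixels in the image, and `decoded` matches `mask`.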

Supervision currently only supports Polygon Masks, but we want to expand support for masks in RLE format. To do this, you will need to make changes in coco_annotations_to_detections and detections_to_coco_annotations.

Links

an official explanation from the COCO dataset repository
old supervision issue providing more context

Additional

Note: Please share a Google Colab with minimal code to test the new feature. We know it's additional work, but it will speed up the review process. The reviewer must test each change. Setting up a local environment to do this is time-consuming. Please ensure that Google Colab can be accessed without any issues (make it public). Thank you! 🙏🏻

Apr 12 '24 15:04 LinasKo

Hi, great job with supervision! I would like to try contributing, but first I have a question about this issue. According to the COCO data format, RLE is used only when iscrowd=1, but I don't see this attribute in the Detections class. Does this mean that the idea is to select one type of segmentation storage when loading or saving annotations? Thanks in advance!

May 01 '24 17:05 David-rn

Hi @David-rn 👋🏻 I'm really glad you want to contribute to supervision. It's true that detections don't have an iscrowd field, and we don't need it, but don't worry about that for now. Let's start with implementing two functions:

mask_to_rle, which would convert a 2D boolean np.array into a valid RLE (Run-Length Encoding) representation.
rle_to_mask, which converts an RLE representation back into a 2D boolean np.array.
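A minimal sketch of what these two functions could look like, assuming the uncompressed COCO convention (column-major order, counts starting with a background run). The signatures here, in particular the `shape` argument given as `(height, width)`, are assumptions for illustration and not necessarily what ends up in supervision.

```python
from typing import List, Tuple

import numpy as np


def mask_to_rle(mask: np.ndarray) -> List[int]:
    """Encode a 2D boolean mask as a list of run lengths."""
    if mask.ndim != 2 or mask.dtype != bool or mask.size == 0:
        raise ValueError("mask must be a non-empty 2D boolean array")
    # Flatten column by column (Fortran order).
    flat = mask.ravel(order="F")
    # Positions where the pixel value differs from the previous pixel.
    changes = np.flatnonzero(flat[1:] != flat[:-1]) + 1
    boundaries = np.concatenate(([0], changes, [flat.size]))
    counts = np.diff(boundaries).tolist()
    # The first run always describes background, so a mask that starts
    # with foreground gets an explicit zero-length background run.
    if flat[0]:
        counts = [0] + counts
    return counts


def rle_to_mask(counts: List[int], shape: Tuple[int, int]) -> np.ndarray:
    """Decode run lengths into a 2D boolean mask of shape (height, width)."""
    height, width = shape
    runs = np.asarray(counts, dtype=np.int64)
    if runs.sum() != height * width:
        raise ValueError("run lengths do not add up to height * width")
    # Even-indexed runs are background, odd-indexed runs are foreground.
    values = np.arange(len(runs)) % 2 == 1
    return np.repeat(values, runs).reshape((height, width), order="F")


square = np.zeros((4, 4), dtype=bool)
square[1:3, 1:3] = True
square_rle = mask_to_rle(square)
restored = rle_to_mask(square_rle, square.shape)
```

Round-tripping a mask through both functions should return the original array, which makes for an easy unit test.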

In the second phase, we'll discuss how to integrate them into methods as_coco and from_coco. Does that sound okay?

May 02 '24 12:05 SkalskiP

Hi @SkalskiP, sure! It does sound okay!

May 02 '24 19:05 David-rn

@David-rn I'm excited. Let me know if you need any help. I'm assigning this issue to you. Good luck!

May 03 '24 09:05 SkalskiP

Hi, I had not noticed that someone was interested in this issue, and I have already done some work (encode/decode functions + modification to coco_annotations_to_detections + tests).

If you have not started implementation @David-rn I would be happy to complete this task.

May 03 '24 11:05 emsko

Hi @emSko, I was about to start, but I'm not sure whether there is a procedure for this situation. @SkalskiP how is this normally handled?

May 03 '24 15:05 David-rn

Hi @David-rn, thanks, and sorry for the mix-up.

SAVING DETECTIONS: @SkalskiP I have opened the PR for this issue. It is still a work in progress, since I am not sure how to handle saving detections to COCO format. An additional flag is certainly needed, and I have added a comment in the place where I think it is appropriate. But where and how should it be passed?

I have two ideas for that; however, they are quite limiting.

Add an optional bool parameter to the as_coco method that determines whether ALL detections should be saved in RLE format or not. The same format will be applied to all images in the dataset.
Add an optional list parameter to the as_coco method that specifies which images (by filename) will have masks saved in RLE format. All annotations for a single image will have the same format (either RLE or polygon).
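As a rough sketch of how the two ideas could be expressed, the hypothetical helper below picks a format per image. The name `select_mask_format` and the parameters `force_rle` and `rle_image_names` are made up for illustration and are not part of supervision.

```python
from typing import Iterable, Optional


def select_mask_format(
    image_name: str,
    force_rle: bool = False,
    rle_image_names: Optional[Iterable[str]] = None,
) -> str:
    """Return "rle" or "polygon" for all annotations of one image."""
    # Idea 1: a single flag switches the whole dataset to RLE.
    if force_rle:
        return "rle"
    # Idea 2: only the listed images are saved in RLE format.
    if rle_image_names is not None and image_name in set(rle_image_names):
        return "rle"
    return "polygon"


formats = {
    name: select_mask_format(name, rle_image_names=["crowd.jpg"])
    for name in ["crowd.jpg", "street.jpg"]
}
```

Both variants decide at the image level, which is exactly the limitation described here: neither can mix formats within one image.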

I am not sure how to specify the mask format on a single annotation/detection level. Do you have any other ideas for this problem?

COLAB NOTEBOOK: Here is the Colab notebook where the encoding/decoding scheme is tested. Is that kind of notebook sufficient to test a new feature (the requirement mentioned in the first comment)? Could you provide an example of what you would expect?

May 05 '24 17:05 emsko

Hi @David-rn and @emSko 👋🏻

I'm very sorry to see such confusion. I've thought about how to resolve this situation, and especially considering that @David-rn hasn't started working on the task yet and @emSko already has a draft PR open, we will allow @emSko to continue working on this task.

@David-rn, I'm very sorry, and I hope you won't hold it against me. 🙏🏻 If you had already started writing your code, I would have given you priority since you were already assigned to the task. Is there another task in our backlog that you might be interested in?

@emSko, standard practice in larger open-source projects is to comment under the GitHub issue that you want to work on the task, which helps avoid such misunderstandings. As for your PR, it's a good start! 🔥 I have already left comments under it.

May 06 '24 11:05 SkalskiP

Hi @SkalskiP 👋. Don't worry about the confusion! I have checked the backlog and, if I'm not wrong, it seems that all tasks are either assigned or already being worked on. So if any new issue arises, I'll be glad to help 👍

May 06 '24 19:05 David-rn

@SkalskiP In the future, I will definitely do so. Lesson learned.

Once again @David-rn I'm really sorry about that situation.

May 06 '24 22:05 emsko

This issue was just implemented via https://github.com/roboflow/supervision/pull/1163. I'm closing the issue.

May 21 '24 11:05 SkalskiP