Description

PR adds support for the RLE format in the COCO dataset. Based on #1114 .

It required implementing RLE encode/decode functions and extending the coco_annotations_to_detections/detections_to_coco_annotations functions.

Type of change

Please delete options that are not relevant.

[x] New feature (non-breaking change which adds functionality)
[x] This change requires a documentation update

How has this change been tested, please provide a testcase or example of how you tested the change?

All functionalities are covered with unit tests + notebook checks the functionalities on larger images.

Any specific deployment considerations

None

Docs

[x] Docs updated? What were the changes: docs/datasets/utils.md

May 05 '24 11:05 emsko

All committers have signed the CLA.

May 05 '24 11:05 CLAassistant

Hi @emSko 👋🏻 Could you accept CLA? Without this, we won't be able to merge your PR.

May 06 '24 09:05 SkalskiP

Let's split the work on this PR into three parts:

[x] Implementation of rle_to_mask and mask_to_rle. Please write valid docstrings in Google format for both methods. For example, take a look at other documented functions in our repository here. The documentation should describe the input and output arguments, as well as provide a simple usage example. Regarding the tests, create several cases: when the image is entirely zeros, when the image is entirely ones, when the image contains a mask with a hole, when the image contains a disjoined mask consisting of multiple elements. Use pytest.mark.parametrize as demonstrated in this test.
- [x] code
- [x] documentation
- [x] unit tests
[ ] Integration of rle_to_mask method into sv.DetectionDataset.from_coco logic.
- [x] code
- [x] documentation updates
[ ] Integration of mask_to_rle method into sv.DetectionDataset.as_coco logic.
- [x] code
- [x] documentation updates

May 06 '24 10:05 SkalskiP

@SkalskiP what do you mean by the documentation? Only docstrings or are there additional files? Where to find them?

May 06 '24 22:05 emsko

Hi @emSko 👋🏻 Thank you very much for dedicating time to this task. In the meantime, I did some light research and found these two videos on YouTube:

https://www.youtube.com/watch?v=tScp754b8iU
https://www.youtube.com/watch?v=2zCZXZOoQJQ

I compared the results of our implementation of encoding and decoding as well as the execution time. The outcomes are identical; however, the execution time, especially for encoding, is significantly slower. Could we utilize similar tricks to accelerate the encoding process? Here's the Colab I used.

https://colab.research.google.com/drive/1Yl3w0B12htaso_VxZ-S2LvvxftiRk-ta?usp=sharing

May 08 '24 11:05 SkalskiP

@SkalskiP what do you mean by the documentation? Only docstrings or are there additional files? Where to find them?

I would like the mask_to_rle and rle_to_mask functions to be included here. To make this happen, you need to add the appropriate changes to this file.

May 08 '24 11:05 SkalskiP

Hi @SkalskiP
Performance: I have changed both rle_to_mask and mask_to_rle functions. I was able to develop even faster solution to encode the mask to RLE. For decoding RLE to the mask, I applied tricks from your code.

You can find the comparison at the bottom of this notebook.

Documentation: In your comment, you have pointed in your comment to detection/utils.md. I placed my functions in the dataset package (utils.py). I figured since Detections are operating on masks (images) and the RLE functionality is used only when reading/writing to a dataset then it would be better to place those functions in the dataset package. Please let me know if you want me to: a) move functions to the detection package and modify detection/utils.md. b) modify detection/utils.md without changing existing code. c) change a different documentation file.

Integration to as_coco: It is one of the last things to do. I have proposed 2 possible solutions here: comment Can you say if any of them sounds ok, and if not what kind of solution do you have in mind?

May 12 '24 22:05 emsko

Hi @emSko 👋🏻, I like your work with rle_to_mask and mask_to_rle. The code, tests, and docks look great. Thank you! 🙏🏻 I just made a few small changes (I didn't want to bother you with those small comments).

I also like the updates you made in coco_annotations_to_detections; now it's time to invest a bit of time into the test_coco_annotations_to_detections function.

please update mock_cock_coco_annotation so that it will be able to take Optional[rle] argument.
use mock_cock_coco_annotation in your test cases

please make your tests easy to follow; for example instead of this:

Detections(
    xyxy=np.array([[0, 0, 10, 10]], dtype=np.float32),
    class_id=np.array([0], dtype=int),
    mask=np.array(
        [
            0 if i >= 10 or j >= 10 or (i < 5 and j >= 5) else 1
            for i in range(0, 20)
            for j in range(0, 20)
        ]
    ).reshape((1, 20, 20)),
),

Try using a smaller mask and define it in place:

array([[[1, 1, 1, 1, 1, 0, 0, 0, 0, 0],
        [1, 1, 1, 1, 1, 0, 0, 0, 0, 0],
        [1, 1, 1, 1, 1, 0, 0, 0, 0, 0],
        [1, 1, 1, 1, 1, 0, 0, 0, 0, 0],
        [1, 1, 1, 1, 1, 0, 0, 0, 0, 0],
        [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
        [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
        [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
        [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
        [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]]])

It is one of the last things to do. I have proposed 2 possible solutions here: https://github.com/roboflow/supervision/issues/1114#issuecomment-2094889036 Can you say if any of them sounds ok, and if not what kind of solution do you have in mind?

I'd like to propose a smarter solution. Instead of asking users to provide additional arguments that trigger RLE, we should figure it out ourselves. When do we need RLE?

When the mask contains holes.
When the mask is detached.

I think we should write functions that validate those cases and trigger mask_to_rle if one of those two is True.

May 13 '24 10:05 SkalskiP

Hi @emSko 👋🏻 I like the updates you made in test_coco_annotations_to_detections. Now is time to detections_to_coco_annotations updates. As I said, it would be awesome if we could deliver a smart solution that uses RLE if needed. Here's a result of quick ChatGPT conversation:

import numpy as np
import cv2

def has_holes(mask):
    """
    Check if a 2D boolean mask has any holes (groups of False values completely surrounded by True values).
    
    Args:
        mask (np.ndarray): 2D boolean array.
    
    Returns:
        bool: True if the mask has holes, False otherwise.
    """
    # Convert boolean mask to uint8
    mask_uint8 = mask.astype(np.uint8)
    
    # Invert the mask
    inverted_mask = cv2.bitwise_not(mask_uint8)
    
    # Find contours
    contours, _ = cv2.findContours(inverted_mask, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_SIMPLE)
    
    # Check for any internal contours, which indicate holes
    for i in range(len(contours)):
        if _[0][i][3] != -1:
            return True
    
    return False

# Example usage
mask = np.array([[True, True, True, True],
                 [True, False, False, True],
                 [True, True, True, True]], dtype=bool)
print(has_holes(mask))  # Output: True

def has_multiple_unconnected_sections(mask):
    """
    Check if a 2D boolean mask has multiple unconnected sections of True values.
    
    Args:
        mask (np.ndarray): 2D boolean array.
    
    Returns:
        bool: True if the mask has multiple unconnected sections, False otherwise.
    """
    # Convert boolean mask to uint8
    mask_uint8 = mask.astype(np.uint8)
    
    # Find connected components
    num_labels, _ = cv2.connectedComponents(mask_uint8)
    
    # More than one label means multiple unconnected sections
    return num_labels > 2  # 0 is the background label, so we check for > 2

# Example usage
mask = np.array([[True, False, False, True],
                 [False, False, False, False],
                 [True, True, False, False]], dtype=bool)
print(has_multiple_unconnected_sections(mask))  # Output: True

If has_holes or has_multiple_unconnected_sections is True we should auto RLE.

May 15 '24 14:05 SkalskiP

@SkalskiP Hi, I need help with tests. I cannot reproduce the segmentation fault from here on my local machine. It seems that it is caused by cv2.connectedComponents(mask_uint8, connectivity=4)

May 16 '24 07:05 emsko

Hi @emSko :wave:

Do a git pull - I've added a line that fixes the segfault.

I couldn't reproduce it as well, but found this issue

We'll eventually update the opencv to >=4.6.0.0 which allegedly resolves it, but for now this should be sufficient.

May 17 '24 08:05 LinasKo

Hi @LinasKo, thank you for your help.

@SkalskiP I think the PR is ready for review. I don't think that from_coco requires any documentation updates.

May 17 '24 09:05 emsko

Hi @emSko 👋🏻 At the outset, I apologize for the late reply - I was traveling at the end of the week and was not on GitHub. @LinasKo, thanks for your support! 🙏🏻

@emSko you did an excellent job here! We are already on the last straight. I left some final comments.

May 20 '24 10:05 SkalskiP

Hi @emSko 👋🏻

I just tested your code in Google Colab. Everything seems to be working fine. I also made a few minor changes to the docs, including adding some visualizations. Everything looks great. Many thanks for this work. This is a massive upgrade for COCO dataset support. Working with you was a pleasure. Thank you! 🙏🏻

May 21 '24 11:05 SkalskiP

supervision
supervision copied to clipboard

1114 rle support for coco

Description

Type of change

How has this change been tested, please provide a testcase or example of how you tested the change?

Any specific deployment considerations

Docs

supervision supervision copied to clipboard

1114 rle support for coco

Description

Type of change

How has this change been tested, please provide a testcase or example of how you tested the change?

Any specific deployment considerations

Docs

supervision
supervision copied to clipboard