supervision icon indicating copy to clipboard operation
supervision copied to clipboard

1114 rle support for coco

Open emsko opened this issue 1 year ago • 12 comments

Description

PR adds support for the RLE format in the COCO dataset. Based on #1114 .

It required implementing RLE encode/decode functions and extending the coco_annotations_to_detections/detections_to_coco_annotations functions.

Type of change

Please delete options that are not relevant.

  • [x] New feature (non-breaking change which adds functionality)
  • [x] This change requires a documentation update

How has this change been tested, please provide a testcase or example of how you tested the change?

All functionalities are covered with unit tests + notebook checks the functionalities on larger images.

Any specific deployment considerations

None

Docs

  • [x] Docs updated? What were the changes: docs/datasets/utils.md

emsko avatar May 05 '24 11:05 emsko

CLA assistant check
All committers have signed the CLA.

CLAassistant avatar May 05 '24 11:05 CLAassistant

Hi @emSko 👋🏻 Could you accept CLA? Without this, we won't be able to merge your PR.

SkalskiP avatar May 06 '24 09:05 SkalskiP

Let's split the work on this PR into three parts:

  • [x] Implementation of rle_to_mask and mask_to_rle. Please write valid docstrings in Google format for both methods. For example, take a look at other documented functions in our repository here. The documentation should describe the input and output arguments, as well as provide a simple usage example. Regarding the tests, create several cases: when the image is entirely zeros, when the image is entirely ones, when the image contains a mask with a hole, when the image contains a disjoined mask consisting of multiple elements. Use pytest.mark.parametrize as demonstrated in this test.
    • [x] code
    • [x] documentation
    • [x] unit tests
  • [ ] Integration of rle_to_mask method into sv.DetectionDataset.from_coco logic.
    • [x] code
    • [x] documentation updates
  • [ ] Integration of mask_to_rle method into sv.DetectionDataset.as_coco logic.
    • [x] code
    • [x] documentation updates

SkalskiP avatar May 06 '24 10:05 SkalskiP

@SkalskiP what do you mean by the documentation? Only docstrings or are there additional files? Where to find them?

emsko avatar May 06 '24 22:05 emsko

Hi @emSko 👋🏻 Thank you very much for dedicating time to this task. In the meantime, I did some light research and found these two videos on YouTube:

  • https://www.youtube.com/watch?v=tScp754b8iU
  • https://www.youtube.com/watch?v=2zCZXZOoQJQ

I compared the results of our implementation of encoding and decoding as well as the execution time. The outcomes are identical; however, the execution time, especially for encoding, is significantly slower. Could we utilize similar tricks to accelerate the encoding process? Here's the Colab I used.

https://colab.research.google.com/drive/1Yl3w0B12htaso_VxZ-S2LvvxftiRk-ta?usp=sharing

SkalskiP avatar May 08 '24 11:05 SkalskiP

@SkalskiP what do you mean by the documentation? Only docstrings or are there additional files? Where to find them?

I would like the mask_to_rle and rle_to_mask functions to be included here. To make this happen, you need to add the appropriate changes to this file.

SkalskiP avatar May 08 '24 11:05 SkalskiP

Hi @SkalskiP
Performance: I have changed both rle_to_mask and mask_to_rle functions. I was able to develop even faster solution to encode the mask to RLE. For decoding RLE to the mask, I applied tricks from your code.

You can find the comparison at the bottom of this notebook.

Documentation: In your comment, you have pointed in your comment to detection/utils.md. I placed my functions in the dataset package (utils.py). I figured since Detections are operating on masks (images) and the RLE functionality is used only when reading/writing to a dataset then it would be better to place those functions in the dataset package. Please let me know if you want me to: a) move functions to the detection package and modify detection/utils.md. b) modify detection/utils.md without changing existing code. c) change a different documentation file.

Integration to as_coco: It is one of the last things to do. I have proposed 2 possible solutions here: comment Can you say if any of them sounds ok, and if not what kind of solution do you have in mind?

emsko avatar May 12 '24 22:05 emsko

Hi @emSko 👋🏻, I like your work with rle_to_mask and mask_to_rle. The code, tests, and docks look great. Thank you! 🙏🏻 I just made a few small changes (I didn't want to bother you with those small comments).

I also like the updates you made in coco_annotations_to_detections; now it's time to invest a bit of time into the test_coco_annotations_to_detections function.

  • please update mock_cock_coco_annotation so that it will be able to take Optional[rle] argument.

  • use mock_cock_coco_annotation in your test cases

  • please make your tests easy to follow; for example instead of this:

    Detections(
        xyxy=np.array([[0, 0, 10, 10]], dtype=np.float32),
        class_id=np.array([0], dtype=int),
        mask=np.array(
            [
                0 if i >= 10 or j >= 10 or (i < 5 and j >= 5) else 1
                for i in range(0, 20)
                for j in range(0, 20)
            ]
        ).reshape((1, 20, 20)),
    ),
    

    Try using a smaller mask and define it in place:

    array([[[1, 1, 1, 1, 1, 0, 0, 0, 0, 0],
            [1, 1, 1, 1, 1, 0, 0, 0, 0, 0],
            [1, 1, 1, 1, 1, 0, 0, 0, 0, 0],
            [1, 1, 1, 1, 1, 0, 0, 0, 0, 0],
            [1, 1, 1, 1, 1, 0, 0, 0, 0, 0],
            [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
            [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
            [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
            [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
            [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]]])
    

It is one of the last things to do. I have proposed 2 possible solutions here: https://github.com/roboflow/supervision/issues/1114#issuecomment-2094889036 Can you say if any of them sounds ok, and if not what kind of solution do you have in mind?

I'd like to propose a smarter solution. Instead of asking users to provide additional arguments that trigger RLE, we should figure it out ourselves. When do we need RLE?

  • When the mask contains holes.
  • When the mask is detached.

I think we should write functions that validate those cases and trigger mask_to_rle if one of those two is True.

SkalskiP avatar May 13 '24 10:05 SkalskiP

Hi @emSko 👋🏻 I like the updates you made in test_coco_annotations_to_detections. Now is time to detections_to_coco_annotations updates. As I said, it would be awesome if we could deliver a smart solution that uses RLE if needed. Here's a result of quick ChatGPT conversation:

import numpy as np
import cv2

def has_holes(mask):
    """
    Check if a 2D boolean mask has any holes (groups of False values completely surrounded by True values).
    
    Args:
        mask (np.ndarray): 2D boolean array.
    
    Returns:
        bool: True if the mask has holes, False otherwise.
    """
    # Convert boolean mask to uint8
    mask_uint8 = mask.astype(np.uint8)
    
    # Invert the mask
    inverted_mask = cv2.bitwise_not(mask_uint8)
    
    # Find contours
    contours, _ = cv2.findContours(inverted_mask, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_SIMPLE)
    
    # Check for any internal contours, which indicate holes
    for i in range(len(contours)):
        if _[0][i][3] != -1:
            return True
    
    return False

# Example usage
mask = np.array([[True, True, True, True],
                 [True, False, False, True],
                 [True, True, True, True]], dtype=bool)
print(has_holes(mask))  # Output: True
def has_multiple_unconnected_sections(mask):
    """
    Check if a 2D boolean mask has multiple unconnected sections of True values.
    
    Args:
        mask (np.ndarray): 2D boolean array.
    
    Returns:
        bool: True if the mask has multiple unconnected sections, False otherwise.
    """
    # Convert boolean mask to uint8
    mask_uint8 = mask.astype(np.uint8)
    
    # Find connected components
    num_labels, _ = cv2.connectedComponents(mask_uint8)
    
    # More than one label means multiple unconnected sections
    return num_labels > 2  # 0 is the background label, so we check for > 2

# Example usage
mask = np.array([[True, False, False, True],
                 [False, False, False, False],
                 [True, True, False, False]], dtype=bool)
print(has_multiple_unconnected_sections(mask))  # Output: True

If has_holes or has_multiple_unconnected_sections is True we should auto RLE.

SkalskiP avatar May 15 '24 14:05 SkalskiP

@SkalskiP Hi, I need help with tests. I cannot reproduce the segmentation fault from here on my local machine. It seems that it is caused by cv2.connectedComponents(mask_uint8, connectivity=4)

emsko avatar May 16 '24 07:05 emsko

Hi @emSko :wave:

Do a git pull - I've added a line that fixes the segfault.

I couldn't reproduce it as well, but found this issue

We'll eventually update the opencv to >=4.6.0.0 which allegedly resolves it, but for now this should be sufficient.

LinasKo avatar May 17 '24 08:05 LinasKo

Hi @LinasKo, thank you for your help.

@SkalskiP I think the PR is ready for review. I don't think that from_coco requires any documentation updates.

emsko avatar May 17 '24 09:05 emsko

Hi @emSko 👋🏻 At the outset, I apologize for the late reply - I was traveling at the end of the week and was not on GitHub. @LinasKo, thanks for your support! 🙏🏻

@emSko you did an excellent job here! We are already on the last straight. I left some final comments.

SkalskiP avatar May 20 '24 10:05 SkalskiP

Hi @emSko 👋🏻

I just tested your code in Google Colab. Everything seems to be working fine. I also made a few minor changes to the docs, including adding some visualizations. Everything looks great. Many thanks for this work. This is a massive upgrade for COCO dataset support. Working with you was a pleasure. Thank you! 🙏🏻

SkalskiP avatar May 21 '24 11:05 SkalskiP