fiftyone [CVAT integration] Use pixelwise masks, not polygons, for instance segmentation

Proposal Summary

According to the FiftyOne documentation, instance segmentation in the CVAT integration works with polygons rather than pixelwise masks. For integrating CVAT and FiftyOne, instance segmentation would work much better if annotations were exchanged as pixelwise masks rather than polygons.

Motivation

What is the use case for this feature?

I am trying to integrate CVAT with FiftyOne, whilst using SAM for segmentation in CVAT.

Why is this use case valuable to support for FiftyOne users in general?

This feature has already been requested in https://github.com/voxel51/fiftyone/issues/3750. It seems that many users want pixelwise masks rather than polygons for segmentation annotations.

Why is this use case valuable to support for your project(s) or organization?

When annotating segmentation with SAM in CVAT, the results are pixelwise masks, not polygons. My workflow is to load a dataset with mask annotations in FiftyOne, send it to CVAT to add, edit, or delete annotations, and then load the modified annotations back into FiftyOne. I expect the annotations to remain pixelwise masks throughout, but instance segmentation in the CVAT integration seems to work only with polygons, which do not behave like pixelwise segments.

Why is it currently difficult to achieve this use case?

When I try to load FiftyOne detections into CVAT, all annotations are gone and only the images are loaded in CVAT. Likewise, when I annotate segmentation using SAM in CVAT and load the annotated data into FiftyOne, the annotations disappear and only the images are shown in the FiftyOne App. When I set label_type='instance' as in the code below:

dataset.annotate(
    anno_key,
    label_field='ground_truth',
    label_type='instance',
    url='http://localhost:8080/'
)

then the annotations are loaded, but as polygons rather than pixelwise masks.

What areas of FiftyOne does this feature affect?

[x] App: FiftyOne application
[x] Core: Core fiftyone Python library
[ ] Server: FiftyOne server

Details

Expected usage for instance segmentation with pixelwise masks:

dataset.annotate(
    anno_key,
    label_field='detections',
    label_type='instances',
    url='http://localhost:8080/'
)

Expected usage for instance segmentation with polygons:

dataset.annotate(
    anno_key,
    label_field='detections',
    label_type='polygons',
    url='http://localhost:8080/'
)

Willingness to contribute

The FiftyOne Community welcomes contributions! Would you or another member of your organization be willing to contribute an implementation of this feature?

[ ] Yes. I can contribute this feature independently
[x] Yes. I would be willing to contribute this feature with guidance from the FiftyOne community
[ ] No. I cannot contribute this feature at this time

Jun 11 '24 21:06 auee028

I'm having the exact same issue. I guess the current workaround is to export the masks from CVAT and attach them to each image with a script.
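
A minimal sketch of the conversion step such a script would need, assuming each exported CVAT mask has already been loaded as a full-frame 2D binary array (how the masks are exported and read from disk is not covered here). The helper name `mask_to_detection_fields` is hypothetical. It computes the relative `[x, y, width, height]` bounding box and the cropped mask that a FiftyOne `Detection` stores.

```python
import numpy as np


def mask_to_detection_fields(full_mask):
    """Convert a full-frame binary mask into (bounding_box, cropped_mask).

    bounding_box is [x, y, width, height] in relative [0, 1] coordinates and
    cropped_mask is the boolean mask restricted to that box. Returns None
    when the mask has no foreground pixels.
    """
    full_mask = np.asarray(full_mask, dtype=bool)
    frame_height, frame_width = full_mask.shape

    ys, xs = np.nonzero(full_mask)
    if ys.size == 0:
        return None

    # Tight pixel box around the foreground (bottom/right are exclusive)
    top, bottom = int(ys.min()), int(ys.max()) + 1
    left, right = int(xs.min()), int(xs.max()) + 1

    bounding_box = [
        left / frame_width,
        top / frame_height,
        (right - left) / frame_width,
        (bottom - top) / frame_height,
    ]
    return bounding_box, full_mask[top:bottom, left:right]
```

The two returned values could then be passed as `fo.Detection(label=..., bounding_box=bounding_box, mask=cropped_mask)`. FiftyOne also appears to offer `fo.Detection.from_mask()` for the same conversion, so check the docs for your version before writing your own helper.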

Jun 19 '24 08:06 karl-joan

I am facing the same issue, and it becomes even more of a concern when the mask contains islands or holes, as CVAT does not support polygons with holes. This has become a bottleneck for us. I would also be willing to contribute to a solution with guidance from the FiftyOne team.

Sep 28 '24 21:09 NicDionne

Hi @NicDionne and @auee028 👋

Apologies for the delayed response and thanks for the feature request and your willingness to contribute a solution! I'd be happy to point you in the right direction 😄

I agree with the proposal above where this syntax:

dataset.annotate(anno_key, label_field='detections', label_type='instances')

should upload instance masks as pixelwise masks rather than as polygons.

For context, I believe the only reason that the current implementation uses polygons is that pixelwise masks weren't supported by CVAT when this integration was originally built!

Here's the code that needs to be updated:

When uploading labels to CVAT: https://github.com/voxel51/fiftyone/blob/develop/fiftyone/utils/cvat.py#L6402-L6430
When converting downloaded CVAT shapes back into FiftyOne format: https://github.com/voxel51/fiftyone/blob/77f664c6da47ce05abb36326237fc10aef5fe999/fiftyone/utils/cvat.py#L5908
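
For anyone picking this up, here is a rough sketch of the two conversions involved. It assumes that CVAT represents a mask shape as a flat `points` list holding run-length counts (row-major within the shape's bounding box, starting with a run of background pixels) followed by `left, top, right, bottom` in inclusive pixel coordinates. That layout is an assumption and should be verified against the CVAT version in use. The function names are hypothetical and are not part of `fiftyone/utils/cvat.py`.

```python
import numpy as np


def mask_to_cvat_rle(mask):
    """Encode a full-frame binary mask as [counts..., left, top, right, bottom].

    Assumed layout: counts alternate background/foreground runs, starting
    with background, over the tight bounding box in row-major order.
    """
    mask = np.asarray(mask, dtype=bool)
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        raise ValueError("mask has no foreground pixels")

    top, bottom = int(ys.min()), int(ys.max())
    left, right = int(xs.min()), int(xs.max())
    flat = mask[top:bottom + 1, left:right + 1].ravel()

    # Indices where the pixel value changes mark the run boundaries
    change = np.flatnonzero(flat[1:] != flat[:-1]) + 1
    bounds = np.concatenate(([0], change, [flat.size]))
    counts = np.diff(bounds).tolist()
    if flat[0]:
        counts = [0] + counts  # first run must describe background

    return counts + [left, top, right, bottom]


def cvat_rle_to_mask(points, frame_height, frame_width):
    """Decode the assumed CVAT mask layout back into a full-frame mask."""
    *counts, left, top, right, bottom = (int(p) for p in points)
    height, width = bottom - top + 1, right - left + 1

    # Runs alternate background (False), foreground (True), ...
    values = np.arange(len(counts)) % 2 == 1
    flat = np.repeat(values, counts)
    if flat.size != height * width:
        raise ValueError("run lengths do not match the bounding box size")

    mask = np.zeros((frame_height, frame_width), dtype=bool)
    mask[top:bottom + 1, left:right + 1] = flat.reshape(height, width)
    return mask
```

Unlike a polygon outline, this round trip preserves holes and disconnected regions inside the bounding box.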

Oct 02 '24 14:10 brimoor

Thanks, @brimoor for the guidance. I'll look into it.

Oct 06 '24 21:10 NicDionne

Not to overload this issue, but maybe to seed more development on a similar task: what would be the best way to extend this to semantic labels that are uploaded with label_type="segmentation"?

I suppose an alternative would be to only allow instances for CVAT annotation, and then fuse instances -> semantic labels when you need them later, e.g. during a dataset export.
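
A rough sketch of what that fusing step could look like, assuming each instance is given as a `(label, bounding_box, mask)` tuple where `bounding_box` is `[x, y, width, height]` in relative coordinates and `mask` is a 2D boolean array already at the pixel resolution of its box (no resizing is done here). The function name and the `class_ids` mapping are hypothetical, and later instances simply overwrite earlier ones where they overlap.

```python
import numpy as np


def fuse_instances_to_semantic(instances, frame_height, frame_width, class_ids):
    """Paint instance masks into a single semantic label map.

    instances: iterable of (label, [x, y, w, h] relative box, 2D bool mask)
    class_ids: dict mapping label -> integer class ID (0 is background)
    """
    semantic = np.zeros((frame_height, frame_width), dtype=np.uint8)

    for label, (x, y, _w, _h), mask in instances:
        mask = np.asarray(mask, dtype=bool)
        mask_height, mask_width = mask.shape

        # Top-left corner of the box in pixels
        top = int(round(y * frame_height))
        left = int(round(x * frame_width))

        # Clip to the frame in case the mask overhangs the border
        bottom = min(top + mask_height, frame_height)
        right = min(left + mask_width, frame_width)

        # region is a view, so assignment writes into the label map
        region = semantic[top:bottom, left:right]
        region[mask[: bottom - top, : right - left]] = class_ids[label]

    return semantic
```

The overwrite-on-overlap rule is only one possible choice; a real implementation would need a defined ordering or priority between classes.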

Oct 25 '24 12:10 geke-mir