supervision [Detections] extend `from_transformers` with segmentation models support

Description

Currently, Supervision only supports Transformers object detection models. Let's expand from_transformers by adding support for segmentation models.

API

The code below should enable the annotation of an image with segmentation results.

import torch
import supervision as sv
from PIL import Image
from transformers import DetrImageProcessor, DetrForSegmentation

processor = DetrImageProcessor.from_pretrained("facebook/detr-resnet-50-panoptic")
model = DetrForSegmentation.from_pretrained("facebook/detr-resnet-50-panoptic")

image = Image.open(<PATH TO IMAGE>)
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

width, height = image.size
target_size = torch.tensor([[height, width]])
results = processor.post_process_segmentation(
    outputs=outputs, target_sizes=target_size)[0]
detections = sv.Detections.from_transformers(results)

mask_annotator = sv.MaskAnnotator()

annotated_image = mask_annotator.annotate(scene=image, detections=detections)

Additional

Transformers DETR Docs
Note: Please share a Google Colab with minimal code to test the new feature. We know it's additional work, but it will speed up the review process. The reviewer must test each change. Setting up a local environment to do this is time-consuming. Please ensure that Google Colab can be accessed without any issues (make it public). Thank you! 🙏🏻

Mar 25 '24 12:03 SkalskiP

I'd like to take this one.

Mar 25 '24 20:03 Griffin-Sullivan

@Griffin-Sullivan the task is yours, good luck :)

Mar 25 '24 21:03 onuralpszr

So the post_process_segmentation method only returns ['scores', 'labels', 'masks']. I found a utility function to get the xyxy from the mask https://github.com/roboflow/supervision/blob/42f9d4b69b9d2c97756d69184359bdc484e896a8/supervision/detection/utils.py#L307. Is this the right approach here? When I run everything it works and I get detections but the annotations don't come out right. You can see in my colab there's only one colored line at the top of the annotated picture.

Here's the colab: https://colab.research.google.com/drive/1j1O95pxlHDPZ1XtmLJ_VcmkYAk_KvrUg?usp=sharing

And my draft PR: https://github.com/roboflow/supervision/pull/1054

Mar 26 '24 18:03 Griffin-Sullivan
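The mask-to-box step asked about above can be sketched in plain NumPy. This is only an illustration of the idea, not the code behind supervision's utility: `masks_to_boxes` is a hypothetical name, and the inclusive min/max convention for the corners is an assumption.

```python
import numpy as np


def masks_to_boxes(masks: np.ndarray) -> np.ndarray:
    """Hypothetical helper: turn boolean masks of shape (N, H, W)
    into (N, 4) boxes in xyxy order (x_min, y_min, x_max, y_max)."""
    boxes = np.zeros((len(masks), 4), dtype=int)
    for i, mask in enumerate(masks):
        # Row and column indices of every pixel that belongs to the mask.
        rows, cols = np.nonzero(mask)
        if len(rows) == 0:
            # Empty mask: leave the box as all zeros.
            continue
        boxes[i] = [cols.min(), rows.min(), cols.max(), rows.max()]
    return boxes


# Two 5x6 masks: one with a small rectangle, one empty.
masks = np.zeros((2, 5, 6), dtype=bool)
masks[0, 1:3, 2:5] = True
boxes = masks_to_boxes(masks)
```

The first mask covers rows 1 to 2 and columns 2 to 4, so its box is `[2, 1, 4, 2]`; the empty mask keeps the all-zero box.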

Hi @Griffin-Sullivan~ 👋🏻 Yes, mask_to_xyxy is what you need. It looks like your problem was mostly related to the np.array dtype. I shared more details in a comment under your PR.

Mar 26 '24 20:03 SkalskiP
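One plausible way a mask dtype problem produces a single colored line at the top of an image is NumPy's indexing rules: a boolean mask selects pixels, while an integer 0/1 mask is treated as a list of row indices. The sketch below is a guess at the mechanism, not a reproduction of the code in the PR; `paint` is a hypothetical helper.

```python
import numpy as np


def paint(image: np.ndarray, mask: np.ndarray, color: np.ndarray) -> np.ndarray:
    """Hypothetical helper: write `color` wherever `mask` indexes `image`."""
    out = image.copy()
    out[mask] = color
    return out


image = np.zeros((4, 4, 3), dtype=np.uint8)
color = np.array([255, 0, 0], dtype=np.uint8)

# A 0/1 mask stored as integers, covering the bottom-right 2x2 corner.
int_mask = np.zeros((4, 4), dtype=np.int64)
int_mask[2:, 2:] = 1

# Integer mask: the values 0 and 1 are read as row indices,
# so rows 0 and 1 are painted instead of the masked pixels.
wrong = paint(image, int_mask, color)

# Boolean mask: only the pixels where the mask is True are painted.
right = paint(image, int_mask.astype(bool), color)
```

Casting the masks to `bool` before building the detections avoids the first behaviour.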

This issue was solved via https://github.com/roboflow/supervision/pull/1054. I'm closing this issue.

Mar 28 '24 13:03 SkalskiP

Hi @SkalskiP,

I've reviewed the HF transformers code - #1054 didn't add support for instance and panoptic segmentation. The result dicts from HF are much different in those cases.

I could open an issue. Is that something we want?

Apr 02 '24 11:04 LinasKo

@LinasKo are those potential changes reflected in comments under this PR: https://github.com/roboflow/supervision/pull/1046, or is it more than this?

Apr 02 '24 12:04 SkalskiP

@SkalskiP The missing part is where I suggested raising NotImplementedError, which passing instance or panoptic segmentation results would trigger.

Apr 02 '24 12:04 LinasKo
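The guard suggested above could look roughly like the sketch below. `classify_transformers_result` is a hypothetical helper, not part of supervision, and the key names `segmentation` and `segments_info` are an assumption about what the Transformers instance and panoptic post-processing methods return.

```python
def classify_transformers_result(result: dict) -> str:
    """Hypothetical helper: decide how a Transformers result dict
    should be handled, based only on the keys it contains."""
    keys = set(result)
    # Assumed shape of instance / panoptic segmentation results.
    if {"segmentation", "segments_info"} <= keys:
        raise NotImplementedError(
            "Instance and panoptic segmentation results are not supported yet."
        )
    # Segmentation results as described in this thread: scores, labels, masks.
    if "masks" in keys:
        return "masks"
    # Object detection results carry boxes instead of masks.
    if "boxes" in keys:
        return "boxes"
    raise NotImplementedError(f"Unrecognized result keys: {sorted(keys)}")


kind = classify_transformers_result({"scores": [], "labels": [], "masks": []})
```

Failing loudly here keeps unsupported result types from being silently misread as detections.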
