[Detections] extend `from_transformers` with segmentation models support
Description
Currently, Supervision only supports Transformers object detection models. Let's expand `from_transformers` by adding support for segmentation models.
API
The code below should enable the annotation of an image with segmentation results.
import torch
import supervision as sv
from PIL import Image
from transformers import DetrImageProcessor, DetrForSegmentation

# Assumption: the panoptic checkpoint is used because it ships a trained mask head
processor = DetrImageProcessor.from_pretrained("facebook/detr-resnet-50-panoptic")
model = DetrForSegmentation.from_pretrained("facebook/detr-resnet-50-panoptic")

image = Image.open(<PATH TO IMAGE>)
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# PIL reports (width, height); the processor expects (height, width)
width, height = image.size
target_size = torch.tensor([[height, width]])
results = processor.post_process_segmentation(
    outputs=outputs, target_sizes=target_size
)[0]

detections = sv.Detections.from_transformers(results)

mask_annotator = sv.MaskAnnotator()
annotated_image = mask_annotator.annotate(scene=image, detections=detections)
Additional
- Transformers DETR Docs
- Note: Please share a Google Colab with minimal code to test the new feature. We know it's additional work, but it will speed up the review process. The reviewer must test each change. Setting up a local environment to do this is time-consuming. Please ensure that Google Colab can be accessed without any issues (make it public). Thank you! 🙏🏻
I'd like to take this one.
@Griffin-Sullivan the task is yours. Good luck :)
So the `post_process_segmentation` method only returns ['scores', 'labels', 'masks']. I found a utility function to get the xyxy boxes from the masks: https://github.com/roboflow/supervision/blob/42f9d4b69b9d2c97756d69184359bdc484e896a8/supervision/detection/utils.py#L307. Is this the right approach here? When I run everything, it works and I get detections, but the annotations don't come out right. You can see in my Colab that there's only one colored line at the top of the annotated picture.
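For readers following along, the mask-to-box step asked about here can be sketched in plain NumPy. This is only an illustration: `masks_to_xyxy` is a hypothetical helper written for this example, not Supervision's utility, and the (N, H, W) boolean mask layout is an assumption.

```python
import numpy as np


def masks_to_xyxy(masks: np.ndarray) -> np.ndarray:
    """Hypothetical helper: tight boxes for a stack of boolean masks.

    masks has shape (N, H, W); the result has shape (N, 4) holding
    (x_min, y_min, x_max, y_max). Empty masks map to all zeros.
    """
    masks = np.asarray(masks, dtype=bool)
    boxes = np.zeros((masks.shape[0], 4), dtype=int)
    for i, mask in enumerate(masks):
        rows, cols = np.nonzero(mask)
        if rows.size == 0:
            continue  # nothing to bound, keep the zero box
        boxes[i] = [cols.min(), rows.min(), cols.max(), rows.max()]
    return boxes


masks = np.zeros((2, 5, 6), dtype=bool)
masks[0, 1:3, 2:5] = True  # rows 1-2, columns 2-4; second mask stays empty
boxes = masks_to_xyxy(masks)
```

Note that x comes from the column index and y from the row index, which is easy to swap by accident.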
Here's the colab: https://colab.research.google.com/drive/1j1O95pxlHDPZ1XtmLJ_VcmkYAk_KvrUg?usp=sharing
And my draft PR: https://github.com/roboflow/supervision/pull/1054
Hi @Griffin-Sullivan 👋🏻 Yes, `mask_to_xyxy` is what you need. It looks like your problem was mostly related to the `np.array` dtype. I shared more details in a comment under your PR.
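One plausible way an array dtype can produce a symptom like "one colored line at the top" is sketched below. This is an assumption for illustration, not the confirmed cause from the PR: an integer 0/1 mask used as an index selects rows 0 and 1 instead of the masked pixels, while a boolean mask selects the pixels themselves. The tiny 4x4 `scene` is made up for the example.

```python
import numpy as np

scene = np.zeros((4, 4), dtype=np.uint8)

# A 0/1 mask as it might arrive after converting a tensor: integer dtype.
int_mask = np.zeros((4, 4), dtype=np.int64)
int_mask[2:4, 1:3] = 1

# Integer arrays index by position: every 0 or 1 picks row 0 or row 1,
# so the assignment paints whole rows at the top of the image.
wrong = scene.copy()
wrong[int_mask] = 255

# Boolean arrays select exactly the masked pixels.
bool_mask = int_mask.astype(bool)
right = scene.copy()
right[bool_mask] = 255
```

Casting masks to `bool` before indexing avoids this class of bug.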
This issue was solved via https://github.com/roboflow/supervision/pull/1054. I'm closing this issue.
Hi @SkalskiP,
I've reviewed the HF Transformers code. #1054 didn't add support for instance and panoptic segmentation. The result dicts from HF are very different in those cases.
I could open an issue. Is that something we want?
@LinasKo are those potential changes reflected in comments under this PR: https://github.com/roboflow/supervision/pull/1046, or is it more than this?
@SkalskiP The missing part is where I suggested raising `NotImplementedError`, which passing instance or panoptic segmentation results would trigger.
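A guard along these lines could look roughly like the sketch below. `ensure_supported` is a hypothetical name, and the check assumes that instance and panoptic post-processing results carry "segmentation" and "segments_info" keys, unlike the ['scores', 'labels', 'masks'] dict discussed earlier in the thread.

```python
from typing import Any, Mapping

# Assumption: keys that mark instance/panoptic post-processing results.
UNSUPPORTED_KEYS = ("segmentation", "segments_info")


def ensure_supported(result: Mapping[str, Any]) -> None:
    """Hypothetical guard: reject result dicts that are not handled yet."""
    if any(key in result for key in UNSUPPORTED_KEYS):
        raise NotImplementedError(
            "Instance and panoptic segmentation results are not supported yet."
        )


# A dict shaped like the supported result passes silently.
ensure_supported({"scores": [], "labels": [], "masks": []})

# A dict shaped like an instance/panoptic result is rejected.
try:
    ensure_supported({"segmentation": None, "segments_info": []})
    raised = False
except NotImplementedError:
    raised = True
```

Failing loudly here is preferable to returning empty or misleading detections for result formats the converter does not understand.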