transformers

Add Mask R-CNN

Open NielsRogge opened this issue 1 year ago • 2 comments

What does this PR do?

This PR adds the classic Mask R-CNN framework for object detection and instance segmentation.

To do/to be discussed:

  • [ ] where to place utilities like NMS, loss computation, samplers
  • [ ] whether to create dummies for torchvision-backed models
  • [ ] how to add support for the object detection pipeline: either add **kwargs to each post_process_object_detection method, or add Mask R-CNN-specific logic inside object_detection_pipeline.py
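
For context on the first item, NMS (non-maximum suppression) is the kind of standalone utility whose placement is in question. The following is a minimal pure-Python sketch of greedy NMS, not the implementation in this PR; the names `iou` and `nms`, the `(x1, y1, x2, y2)` box layout, and the default threshold are assumptions made for illustration.

```python
from typing import List, Sequence

Box = Sequence[float]  # assumed layout: (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], scores: List[float], iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: return indices of kept boxes, highest score first."""
    # Visit boxes from highest to lowest score.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores, iou_threshold=0.5)
```

Here the second box overlaps the first with an IoU of about 0.68, so it is suppressed at a threshold of 0.5 and `kept` holds the indices of the first and third boxes. A torchvision-backed model could instead rely on `torchvision.ops.nms`, which is part of why the question about dummies for torchvision-backed models comes up.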

NielsRogge, Apr 24 '23 20:04

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

@NielsRogge As @sgugger mentions, the PR is still in WIP state. Happy to review once it's ready :)

amyeroberts, Apr 26 '23 10:04

I've updated all docstrings and variable names; the PR is ready for another review.

NielsRogge, May 08 '23 08:05

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot], Jun 13 '23 15:06

Hello @NielsRogge, what is the status of this feature? Thanks in advance.

aymanechilah, Jul 04 '23 12:07

Hi @NielsRogge, looking forward to it. Could you, for now, recommend a robust text detector available here to combine with TrOCR? I would like to see how well the two work together with the help of HF🤗.

bit-scientist, Aug 29 '23 08:08