Add ConvNeXt Mask R-CNN

Open NielsRogge opened this issue 3 years ago • 0 comments

What does this PR do?

This PR is an initial draft implementing the classic Mask R-CNN framework with ConvNeXt as the backbone.

The framework is implemented in a single script, with the exception of 3 files (for now):

  • assign_result.py
  • losses.py
  • mask_target.py

As we have a one model, one file policy, I'm reimplementing ConvNeXt using "# Copied from" statements, so ConvNextMaskRCNNModel is almost identical to ConvNextModel. This way, the backbone used for object detection stays independent of the original one. In this case, for instance, extra layernorms are added after each stage.

There's a dependency on torchvision, which is used for NMS (non-maximum suppression, a postprocessing algorithm used by both the RPN head and the RoI head).
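For readers unfamiliar with NMS, here is a minimal pure-Python sketch of the greedy algorithm. It is illustration only and not the code in this PR, which relies on torchvision for this step. The names `iou` and `nms` and the `(x1, y1, x2, y2)` box layout are assumptions made for the example.

```python
from typing import List, Sequence, Tuple

# Assumed box layout for this sketch: (x1, y1, x2, y2) with x2 >= x1, y2 >= y1.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: Sequence[Box], scores: Sequence[float], iou_threshold: float) -> List[int]:
    """Greedy NMS: return indices of kept boxes, highest score first.

    A box is dropped when its IoU with an already kept, higher-scoring
    box is strictly greater than iou_threshold.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    # The second box overlaps the first heavily and is suppressed.
    print(nms(boxes, scores, iou_threshold=0.5))
```

In a detector like this one, a step of this kind would run on the proposals from the RPN head and again on the final detections from the RoI head, typically per image and, for the RoI head, per class.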

NielsRogge, Aug 08 '22 17:08