
[Feature request] Segmentation head

Open johnlockejrr opened this issue 2 months ago • 13 comments

Hi there! Sorry to open an issue when there is no actual issue. I found this repo yesterday when an acquaintance forked it. I don't want to overpraise, but just: wow! YOLO detection models are far behind.

Is there any chance the model will support segmentation in the future (polygon masks, not just bounding boxes)? That would be amazing!

🔧 Changes Needed:

  1. Model Architecture:
  • Replace bbox head with polygon/segmentation head (or add it alongside rather than replacing!)
  • Variable output length - polygons have different numbers of points
  • New loss functions - polygon IoU, Chamfer distance, etc.
  • Output format - [x1, y1, x2, y2, ..., xn, yn] instead of [x1, y1, x2, y2]
  2. Data Pipeline:
  • YOLO segmentation format - class x1 y1 x2 y2 ... xn yn (normalized)
  • Variable-length handling - different polygons have different point counts
  • Augmentation updates - transform all polygon points correctly
  3. Training Changes:
  • New loss functions - polygon-specific losses
  • No pretrained weights - would need to train from scratch
  • Memory management - variable-length outputs are memory intensive
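
To make the list above concrete, here is a minimal sketch in plain Python of the data-side pieces it mentions: parsing a YOLO segmentation label line, transforming every polygon point under an augmentation (a horizontal flip is used as the example), padding variable-length polygons to a common length with a validity mask, and a symmetric Chamfer distance between two point sets. All function names here are hypothetical and chosen for illustration; none of this is taken from the custom_d_fine code base, and the Chamfer variant shown (mean of squared nearest-neighbour distances in both directions) is just one common formulation.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def parse_yolo_seg_line(line: str) -> Tuple[int, List[Point]]:
    """Parse 'class x1 y1 x2 y2 ... xn yn' with normalized coordinates."""
    parts = line.split()
    # One class id plus an even number of coordinates, at least 3 points.
    if len(parts) < 7 or len(parts) % 2 == 0:
        raise ValueError("expected a class id followed by >= 3 (x, y) pairs")
    class_id = int(parts[0])
    coords = [float(v) for v in parts[1:]]
    return class_id, list(zip(coords[0::2], coords[1::2]))


def hflip_polygon(points: List[Point]) -> List[Point]:
    """Horizontal flip in normalized space: every point is transformed."""
    return [(1.0 - x, y) for x, y in points]


def pad_polygons(polygons: List[List[Point]], pad_value: float = 0.0):
    """Pad polygons to the longest one; the mask marks the real points."""
    max_len = max(len(p) for p in polygons)
    padded, mask = [], []
    for p in polygons:
        n_pad = max_len - len(p)
        padded.append(list(p) + [(pad_value, pad_value)] * n_pad)
        mask.append([True] * len(p) + [False] * n_pad)
    return padded, mask


def chamfer_distance(a: List[Point], b: List[Point]) -> float:
    """Symmetric Chamfer distance using squared Euclidean distances."""

    def one_way(src: List[Point], dst: List[Point]) -> float:
        # For each source point, squared distance to its nearest target.
        nearest = (
            min((sx - dx) ** 2 + (sy - dy) ** 2 for dx, dy in dst)
            for sx, sy in src
        )
        return sum(nearest) / len(src)

    return one_way(a, b) + one_way(b, a)


if __name__ == "__main__":
    cls, poly = parse_yolo_seg_line("0 0.25 0.5 0.75 0.5 0.5 1.0")
    batch, valid = pad_polygons([poly, poly + [(0.5, 0.0)]])
    print(cls, hflip_polygon(poly), valid)
    print(chamfer_distance(poly, hflip_polygon(poly)))
```

Padding with a mask is only one way to handle variable-length polygons; resampling every polygon to a fixed number of points, or predicting dense masks instead of point sequences, avoids the variable-length problem altogether.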

Thank you for your work! 🥇

johnlockejrr avatar Oct 03 '25 13:10 johnlockejrr

Hey, I totally agree. That's my next big task for this repo, and I will do it as soon as I find some time. I'll keep you posted. Thanks for the kind words.

ArgoHA avatar Oct 03 '25 15:10 ArgoHA

Can't wait! But it's worth the wait. It's really amazing, and believe me, I speak from extensive experience: I have tested almost everything worth testing for segmentation (I work mainly on OCR/HTR), and for print this is perfect. For my work on manuscripts I need polygon masks, because lines in handwritten material are curved or not quite straight.

johnlockejrr avatar Oct 03 '25 15:10 johnlockejrr

I tried the dev_seg branch, but maybe it's too soon 🙂

johnlockejrr avatar Oct 23 '25 07:10 johnlockejrr

It is too soon. So far I have only added masks to the pipeline and a simple segmentation head, and it doesn't work well. I'll be researching ideas for the segmentation head implementation.

ArgoHA avatar Oct 23 '25 11:10 ArgoHA

Fingers crossed, dude! 👍

johnlockejrr avatar Oct 23 '25 12:10 johnlockejrr

> It is too soon, I only implemented masks to the pipeline and simple segmentation head. It doesn't work well. Will be researching the segmentation head implementation ideas

Like RF-DETR-Seg?

Nuzhny007 avatar Oct 23 '25 12:10 Nuzhny007

I haven't looked into RF-DETR-Seg; I was referring to Mask DINO. I believe RF-DETR just released their segmentation model.

ArgoHA avatar Oct 23 '25 12:10 ArgoHA

And RF-DETR segmentation is not that amazing! I tried it, and it's no better than YOLO seg; for me at least it's a no-go.

johnlockejrr avatar Oct 23 '25 12:10 johnlockejrr

> I haven't looked into RF-DETR-Seg; I was referring to Mask DINO. I believe RF-DETR just released their segmentation model.

Better to go with DINO; for detection at least, d_fine is amazing!

RF-DETR segmentation result example:

Image

And this is custom_d_fine, not fully trained:

Image

johnlockejrr avatar Oct 23 '25 12:10 johnlockejrr

RF-DETR skipped some lines. Interesting, thanks for sharing.

ArgoHA avatar Oct 23 '25 13:10 ArgoHA

No problem. Yes, so it's not reliable for my case. Your model doesn't skip lines, maybe one line in 100, but that comes down to data and training.

johnlockejrr avatar Oct 23 '25 15:10 johnlockejrr

Anyway, I had high hopes for RF-DETR, but at least in my case it's just a YOLO that is harder to train, or even worse.

johnlockejrr avatar Oct 23 '25 15:10 johnlockejrr

> Anyway, I had high hopes for RF-DETR, but at least in my case it's just a YOLO that is harder to train, or even worse.

I've just read through this, and @johnlockejrr, I'd love to get some recommendations from you for well-performing instance segmentation models that can be fine-tuned on custom data. I also tried rfdetrsegpreview, but it did not perform as well as YOLO.

The RF-DETR detection models are very good (I tried them a while ago), but the recently released segmentation model is just okay.

AgbajeAyomipo avatar Oct 24 '25 18:10 AgbajeAyomipo