Fine tuning on Custom Dataset while using the pre-trained weights (with different classes than the original model)

Open deshwalmahesh opened this issue 3 years ago • 4 comments

Before someone sends me to the model training repo, please let me just explain.

I want to fine-tune an existing model, say PubLayNet/faster_rcnn_R_50_FPN_3x, for my own task, BUT for a single class, e.g. text detection only, where the original mapping is {0: "Text", 1: "Title", 2: "List", 3: "Table", 4: "Figure"}.

Or maybe on HJDataset where classes are {1:"Page Frame", 2:"Row", 3:"Title Region", 4:"Text Region", 5:"Title", 6:"Subtitle", 7:"Other"}.

I found this Kaggle Notebook on fine-tuning with Detectron2, but the problem is the one I described above: I just want to train on 1 class.

What changes would I have to make? What would thing_classes look like? One of these?

thing_classes = ['text']  # cfg.MODEL.ROI_HEADS.NUM_CLASSES = 1

thing_classes = ['None', 'text']  # cfg.MODEL.ROI_HEADS.NUM_CLASSES = 2

thing_classes = ['text', 'None', 'None', 'None', 'None']  # cfg.MODEL.ROI_HEADS.NUM_CLASSES = 5

thing_classes = ['None', 'text', 'None', 'None', 'None', 'None']  # cfg.MODEL.ROI_HEADS.NUM_CLASSES = 6

thing_classes = ['text', 'None', 'None', 'None', 'None', 'None']  # cfg.MODEL.ROI_HEADS.NUM_CLASSES = 6

  1. Would it be any different if I used the Layout Parser model config for faster_rcnn_R_50_FPN_3x instead of the default one from Detectron2/configs/COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml?

Thanks in advance.

Aug 25 '22 04:08 deshwalmahesh

Use: thing_classes = ["None", "text"]  # cfg.MODEL.ROI_HEADS.NUM_CLASSES = 2

For more details, see https://layout-parser.readthedocs.io/en/latest/example/deep_layout_parsing/index.html
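Whichever thing_classes layout is chosen, the category ids in the annotation file have to line up with it. One way to take that ambiguity out of the picture is to reduce the COCO-style annotation file to the single category of interest before registering the dataset. The snippet below is a minimal sketch of that idea using only the Python standard library. The helper name keep_single_category, its parameters, and the toy data are hypothetical and not part of Layout Parser or Detectron2; it assumes the annotations follow the usual COCO layout with "images", "categories", and "annotations" keys.

```python
import copy


def keep_single_category(coco, category_name, new_id=1, drop_empty=True):
    """Return a copy of a COCO-style dict reduced to one category.

    Hypothetical helper for illustration only. The kept category and its
    annotations are renumbered to `new_id`; the input dict is not modified.
    """
    matches = [
        c for c in coco["categories"]
        if c["name"].lower() == category_name.lower()
    ]
    if not matches:
        raise ValueError(f"category {category_name!r} not found")
    old_id = matches[0]["id"]

    out = copy.deepcopy(coco)
    # Keep only the requested category, renumbered to new_id.
    out["categories"] = [{**matches[0], "id": new_id}]
    # Keep only annotations of that category and point them at new_id.
    out["annotations"] = [
        {**ann, "category_id": new_id}
        for ann in out["annotations"]
        if ann["category_id"] == old_id
    ]
    if drop_empty:
        # Optionally drop images that no longer have any annotation.
        used = {ann["image_id"] for ann in out["annotations"]}
        out["images"] = [im for im in out["images"] if im["id"] in used]
    return out


# Toy annotation dict (made-up data) with two categories.
toy = {
    "images": [
        {"id": 10, "file_name": "a.png"},
        {"id": 11, "file_name": "b.png"},
    ],
    "categories": [{"id": 1, "name": "Text"}, {"id": 4, "name": "Table"}],
    "annotations": [
        {"id": 1, "image_id": 10, "category_id": 1, "bbox": [0, 0, 5, 5]},
        {"id": 2, "image_id": 10, "category_id": 4, "bbox": [1, 1, 5, 5]},
        {"id": 3, "image_id": 11, "category_id": 4, "bbox": [2, 2, 5, 5]},
    ],
}

text_only = keep_single_category(toy, "Text")
table_only = keep_single_category(toy, "Table")
```

The reduced dict can then be written out with json.dump and used as the annotation file for training. Counting the remaining annotations (len(text_only["annotations"])) before training is a quick check that the chosen class actually has instances.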

Aug 25 '22 07:08 AmmarNassan

Hello Sir, I have been trying to train this layout parser using your Kaggle notebook, and I want to fine-tune it only for tables. As per your answer, I tried using the format [None, Table], but it shows zero images for Table and 26 for None. Also, if I train only the TableBank model, which is for tables only, do I still have to use this format? Can you also tell me where I can download the weights for the TableBank Faster R-CNN model, and is there any link you can provide so that I can go deeper into this? Thank you very much in advance.

Jan 30 '24 15:01 Prakhar2295


Jan 30 '24 17:01 Prakhar2295

[Attached screenshots: Screenshot (2094) through Screenshot (2102)]

Jan 30 '24 18:01 Prakhar2295