layout-model-training
                                
Scripts for training Detectron2-based layout models on popular layout analysis datasets.
Scripts for training Layout Detection Models using Detectron2
Usage
Directory Structure
- tools/ provides a series of handy scripts for converting data formats and training the models.
- scripts/ lists the specific commands for running the code on a given dataset.
- configs/ contains the configurations for different deep learning models, organized by dataset.
How to train the models?
- Get the dataset and annotations -- if you are not sure, feel free to check this tutorial.
- Duplicate and modify the config files and training scripts.
  - For example, you might want to copy configs/prima/fast_rcnn_R_50_FPN_3x to configs/your-dataset-name/fast_rcnn_R_50_FPN_3x, and you can create your own scripts/train_<your-dataset-name>.sh based on scripts/train_prima.sh.
  - You'll need to modify the --dataset_name, --json_annotation_train, --image_path_train, --json_annotation_val, --image_path_val, and --config-file args appropriately.
- If you have a dataset with segmentation masks, you can try training the mask_rcnn model; otherwise you might want to start with the fast_rcnn model.
  - If you see the error "AttributeError: Cannot find field 'gt_masks' in the given Instances!" during training, this means you should not use the mask_rcnn model.
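The choice between the two model families can be checked before training. The following is a minimal sketch, not part of this repository, that assumes your annotations are in COCO format and that a usable mask shows up as a non-empty "segmentation" entry on every annotation. The function names are hypothetical, and the returned strings are the model-family names used above rather than full config file names.

```python
import json


def load_coco(path: str) -> dict:
    """Read a COCO-format annotation file into a dict."""
    with open(path) as f:
        return json.load(f)


def has_segmentation_masks(coco: dict) -> bool:
    """Return True if every annotation has a non-empty 'segmentation' entry."""
    annotations = coco.get("annotations", [])
    return bool(annotations) and all(ann.get("segmentation") for ann in annotations)


def suggest_model_family(coco: dict) -> str:
    """Suggest mask_rcnn only when masks are present; otherwise fast_rcnn."""
    return "mask_rcnn" if has_segmentation_masks(coco) else "fast_rcnn"


# In-memory example: a single box-only annotation, so no masks are available.
example = {
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 1, "bbox": [10, 20, 100, 50]}
    ]
}
print(suggest_model_family(example))  # prints fast_rcnn
```

For a real dataset, pass the result of load_coco on your training annotation file instead of the in-memory example.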
Supported Datasets
- Prima Layout Analysis Dataset (scripts/train_prima.sh)
  - You will need to download the dataset from the official website and put it in the data/prima folder.
  - As the original dataset is stored in the PAGE format, the script will use tools/convert_prima_to_coco.py to convert it to COCO format.
  - The final dataset folder structure should look like:

    data/
    └── prima/
        ├── Images/
        ├── XML/
        ├── License.txt
        └── annotations*.json
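Before training, it can help to confirm that the folder matches the structure described above. This is a small sketch under that assumption only. The helper name missing_prima_entries is hypothetical and is not one of the repository's tools.

```python
from pathlib import Path


def missing_prima_entries(root: Path) -> list:
    """List the expected entries that are absent under the dataset root."""
    missing = []
    # Image and PAGE XML folders from the original download.
    for name in ("Images", "XML"):
        if not (root / name).is_dir():
            missing.append(name + "/")
    if not (root / "License.txt").is_file():
        missing.append("License.txt")
    # COCO annotation files are expected only after the conversion step has run.
    if not any(root.glob("annotations*.json")):
        missing.append("annotations*.json")
    return missing


# An empty list means the layout matches the expected structure.
print(missing_prima_entries(Path("data/prima")))
```

If only annotations*.json is reported missing, the conversion from PAGE to COCO has probably not been run yet.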
Reference
- cocosplit: a script that splits COCO annotations into train and test sets.
- Detectron2: Facebook AI Research's next-generation software system that implements state-of-the-art object detection algorithms.
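To illustrate the kind of train/test split that cocosplit performs, here is a minimal sketch of the idea. It is not cocosplit's actual code: the function name, the default ratio, and the seed are assumptions. It splits by image and keeps each annotation with its image.

```python
import random


def split_coco(coco: dict, train_ratio: float = 0.8, seed: int = 0):
    """Split a COCO-style dict into (train, test) dicts by image."""
    images = list(coco["images"])
    # A seeded shuffle keeps the split reproducible across runs.
    random.Random(seed).shuffle(images)
    n_train = int(len(images) * train_ratio)

    def subset(selected):
        ids = {img["id"] for img in selected}
        # Copy shared fields (e.g. categories) unchanged into both splits.
        shared = {k: v for k, v in coco.items() if k not in ("images", "annotations")}
        return {
            **shared,
            "images": selected,
            "annotations": [a for a in coco["annotations"] if a["image_id"] in ids],
        }

    return subset(images[:n_train]), subset(images[n_train:])


# Tiny in-memory example: 10 images with one annotation each.
example = {
    "categories": [{"id": 1, "name": "text"}],
    "images": [{"id": i, "file_name": f"{i}.png"} for i in range(10)],
    "annotations": [{"id": i, "image_id": i, "category_id": 1} for i in range(10)],
}
train, test = split_coco(example)
print(len(train["images"]), len(test["images"]))  # prints 8 2
```

The two resulting dicts can be written out with json.dump and passed as the train and validation annotation files.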