layoutparser doesn't work well for a very well-structured CV

Open ttbuffey opened this issue 4 years ago • 2 comments

Describe the bug

layoutparser doesn't work well for a very well-structured CV. Am I using layoutparser in the wrong way? Could you please help me check? Thanks very much.

To Reproduce

import layoutparser as lp
import cv2
import ssl
import warnings

# Workaround: disable HTTPS certificate verification so the model files can be downloaded
ssl._create_default_https_context = ssl._create_unverified_context
warnings.filterwarnings('ignore')

image = cv2.imread("data/25.png")
# OpenCV loads images as BGR; reverse the channel axis to get RGB
image = image[..., ::-1]
model = lp.Detectron2LayoutModel('lp://PubLayNet/mask_rcnn_R_50_FPN_3x/config',
                                 extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.8],
                                 label_map={0: "Text", 1: "Title", 2: "List", 3: "Table", 4: "Figure"})
# Detect the layout of the input image
layout = model.detect(image)
print(layout)
# Draw the detected boxes on top of the image
lp.draw_box(image, layout, box_width=3).show()
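
For anyone comparing results, a per-label summary of the detections may be easier to read than the raw print(layout) output. This is only a sketch: summarize is a hypothetical helper, not part of layoutparser, and the SimpleNamespace objects are made-up stand-ins for detected blocks. It assumes each block exposes type and score attributes, as layoutparser's TextBlock does.

```python
from collections import Counter
from types import SimpleNamespace


def summarize(blocks):
    """Return {label: (count, best_score)} for a sequence of detected blocks.

    Hypothetical helper: it only assumes each block has `.type` and `.score`.
    """
    counts = Counter(block.type for block in blocks)
    best = {}
    for block in blocks:
        best[block.type] = max(best.get(block.type, 0.0), block.score)
    return {label: (counts[label], best[label]) for label in counts}


# Made-up stand-ins for detections; with real output, pass `layout` instead.
blocks = [
    SimpleNamespace(type="Text", score=0.97),
    SimpleNamespace(type="Text", score=0.85),
    SimpleNamespace(type="Title", score=0.81),
]
summary = summarize(blocks)
print(summary)
```

If a label such as "Table" or "List" never shows up in the summary for a CV, that suggests the model is missing those regions entirely, not just placing the boxes badly.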

Environment

  1. macOS
  2. Installed layoutparser with the command below
    • pip install layoutparser torchvision && pip install "detectron2@git+https://github.com/facebookresearch/[email protected]#egg=detectron2"
    • Python 3.9.1

Screenshots

[Three screenshots attached: Screen Shot 2021-12-02 at 3 51 32 PM, Screen Shot 2021-12-02 at 3 51 40 PM, Screen Shot 2021-12-02 at 3 42 58 PM]

ttbuffey, Dec 02 '21 08:12

I'm facing the same kind of difficulties. When applied to CVs, the results are very poor.

ruben-as-teixeira, Feb 07 '22 17:02

Have you tried working with different models? PrimaLayout, for example, gives me noticeably better results on a similar set of documents.

model = lp.Detectron2LayoutModel('lp://PrimaLayout/mask_rcnn_R_50_FPN_3x/config',
                                 extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.8],
                                 label_map={1:"TextRegion", 2:"ImageRegion", 3:"TableRegion", 4:"MathsRegion", 5:"SeparatorRegion", 6:"OtherRegion"})

But the results are still not perfect (that's why I came here ;) ). Are there any options to tweak the text detection?
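
One knob that is already visible in the snippets above is MODEL.ROI_HEADS.SCORE_THRESH_TEST. A possible approach, sketched here under assumptions, is to lower that value when building the model so more candidates come back, then filter them per label in plain Python. Block is a hypothetical stand-in for a detected block (layoutparser's TextBlock exposes type and score attributes with the same names), filter_blocks is not a layoutparser function, and the thresholds are illustrative, not tuned values.

```python
from dataclasses import dataclass


@dataclass
class Block:
    # Hypothetical stand-in for a detected block; only `type` and `score` are used.
    type: str
    score: float


def filter_blocks(blocks, min_score_by_type, default_min_score=0.8):
    """Keep blocks whose score meets the threshold configured for their label.

    Labels missing from `min_score_by_type` fall back to `default_min_score`.
    """
    kept = []
    for block in blocks:
        threshold = min_score_by_type.get(block.type, default_min_score)
        if block.score >= threshold:
            kept.append(block)
    return kept


# Made-up detections, as if the model had been run with a low global threshold.
detections = [
    Block("Text", 0.55),
    Block("Title", 0.62),
    Block("Table", 0.91),
    Block("Text", 0.35),
]

# Be lenient with "Text", stricter with "Title", default (0.8) for the rest.
kept = filter_blocks(detections, {"Text": 0.5, "Title": 0.7})
print([(b.type, b.score) for b in kept])
```

With real detections, the list returned by filter_blocks could be wrapped back into lp.Layout(...) before drawing. This only changes which boxes are kept; it does not make the model find regions it never proposed.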

Bergrebell, Aug 10 '22 08:08