layout-parser
layoutparser doesn't work well for a very well-structured CV
Describe the bug
layoutparser doesn't work well for a very well-structured CV. Am I using layoutparser in the wrong way? Could you please help check? Thanks very much.
To Reproduce
import ssl
import warnings

import cv2
import layoutparser as lp

# Workaround: skip SSL certificate verification so the model weights can be downloaded
ssl._create_default_https_context = ssl._create_unverified_context
warnings.filterwarnings('ignore')

# OpenCV loads images as BGR; reverse the channel axis to get RGB
image = cv2.imread("data/25.png")
image = image[..., ::-1]

model = lp.Detectron2LayoutModel(
    'lp://PubLayNet/mask_rcnn_R_50_FPN_3x/config',
    extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.8],
    label_map={0: "Text", 1: "Title", 2: "List", 3: "Table", 4: "Figure"},
)

# Detect the layout of the input image
layout = model.detect(image)
print(layout)

# Draw the detected boxes on the image
lp.draw_box(image, layout, box_width=3).show()
Environment
- macOS
- Installed layoutparser with the command below:
- pip install layoutparser torchvision && pip install "detectron2@git+https://github.com/facebookresearch/[email protected]#egg=detectron2"
- Python 3.9.1
I'm facing the same kind of difficulties. When applied to CVs, the results are very poor.
Have you tried working with different models? PrimaLayout, for example, gives me noticeably better results on a similar set of documents.
model = lp.Detectron2LayoutModel(
    'lp://PrimaLayout/mask_rcnn_R_50_FPN_3x/config',
    extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.8],
    label_map={1: "TextRegion", 2: "ImageRegion", 3: "TableRegion", 4: "MathsRegion", 5: "SeparatorRegion", 6: "OtherRegion"},
)
But the results are still not perfect (that's why I came here ;)). Are there any options to tweak the text detection?
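One knob already visible in both snippets is MODEL.ROI_HEADS.SCORE_THRESH_TEST, which sets the minimum confidence a detection needs to be kept at inference time. With 0.8, any region the model is less sure about is dropped, which may be part of why CV blocks go missing. Below is a minimal sketch of that effect in plain Python. It does not call layoutparser; the Detection class, the filter_by_score helper, and the scores and boxes are all made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Detection:
    """Hypothetical stand-in for one detected layout block."""
    label: str
    score: float
    box: Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixels


def filter_by_score(detections: List[Detection], threshold: float) -> List[Detection]:
    """Keep only detections whose score is strictly above the threshold."""
    return [d for d in detections if d.score > threshold]


# Made-up detections for a CV-like page (not real model output)
detections = [
    Detection("Title", 0.97, (40, 30, 400, 70)),
    Detection("Text", 0.91, (40, 90, 560, 200)),
    Detection("Text", 0.74, (40, 220, 560, 300)),
    Detection("List", 0.62, (40, 320, 560, 420)),
    Detection("Text", 0.55, (40, 440, 560, 500)),
]

# Sweep a few thresholds to see how many blocks survive each one
for threshold in (0.8, 0.6, 0.5):
    kept = filter_by_score(detections, threshold)
    labels = [d.label for d in kept]
    print(f"threshold={threshold}: kept {len(kept)} of {len(detections)} -> {labels}")
```

In the real pipeline the equivalent experiment would be to pass a lower value in extra_config and compare the drawn boxes. A lower threshold recovers more regions but also lets in more false positives, so it is a trade-off rather than a fix.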