layout-parser
Detecting graphs and figures in the PDF images
Thanks for building this library.
I used the following code to detect whether an image contains graphs and charts.
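import layoutparser as lp

# `image` is assumed to be a page image already loaded as a NumPy array
layout = 'lp://PubLayNet/faster_rcnn_R_50_FPN_3x/config'
model = lp.Detectron2LayoutModel(layout,
    extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.8],
    label_map={0: "Text", 1: "Title", 2: "List", 3: "Table", 4: "Figure"})
text_blocks = model.detect(image)  # run layout detection on the page image
image = lp.draw_box(image, text_blocks, box_width=3, show_element_id=True, show_element_type=True)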
In most cases, the charts/graphs are marked as Figures.

However, there are some anomalies.
- Multiple sections of the same chart are marked as separate Figures.
- Only a partial section of the chart is marked as a Figure.
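
For the first anomaly, one workaround that might help is to merge Figure boxes that overlap or sit close together. This is only a rough sketch and not part of layout-parser: `boxes_touch`, `merge_boxes`, and the `gap` parameter are hypothetical names, and boxes are assumed to be plain `(x1, y1, x2, y2)` tuples in pixels, for example taken from each detected Figure block's coordinates.

```python
def boxes_touch(a, b, gap=0):
    """Return True if boxes a and b overlap or lie within `gap` pixels of each other."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    return (ax1 - gap <= bx2 and bx1 - gap <= ax2 and
            ay1 - gap <= by2 and by1 - gap <= ay2)


def merge_boxes(boxes, gap=0):
    """Union touching boxes repeatedly until no further merge is possible."""
    merged = [tuple(b) for b in boxes]
    changed = True
    while changed:
        changed = False
        result = []
        for box in merged:
            for i, other in enumerate(result):
                if boxes_touch(box, other, gap):
                    # Replace the stored box with the bounding box of both.
                    result[i] = (min(box[0], other[0]), min(box[1], other[1]),
                                 max(box[2], other[2]), max(box[3], other[3]))
                    changed = True
                    break
            else:
                # No existing box touches this one, so keep it as its own group.
                result.append(box)
        merged = result
    return merged


# Hypothetical Figure fragments: two pieces of one chart and one separate chart.
fragments = [(0, 0, 10, 10), (8, 0, 20, 10), (100, 100, 120, 120)]
print(merge_boxes(fragments))
```

A larger `gap` merges fragments that do not quite overlap, at the risk of joining two distinct charts that sit next to each other. It would not fix the second anomaly, where part of the chart is not detected at all.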

Are there any other models that can detect charts/graphs more effectively? If not, any ideas on how to create and train a custom model for improved detection? The more detailed the steps, the better, as I am a Python beginner.
Thanks!