In what format are the coordinates of bounding boxes returned?

Open Sopralapanca opened this issue 8 months ago • 0 comments

I am using layout-parser to detect the bounding boxes of tables in a PDF file. The coordinates are accessed like this:

import layoutparser as lp

model = lp.Detectron2LayoutModel('lp://PubLayNet/faster_rcnn_R_50_FPN_3x/config',
                                 extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.8],
                                 label_map={0: "Text", 1: "Title", 2: "List", 3: "Table", 4: "Figure"})

# Detect tables
layout = model.detect(image_rgb)

for el in layout:
  coords = el.coordinates
  print(coords)

In what format are they returned? Could they be returned in the format x1, y1, x2, y2, where (x1, y1) is the upper-left corner and (x2, y2) is the lower-right corner, with the origin (0, 0) at the lower-left corner?
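
To make the requested conversion concrete, here is a minimal sketch. It assumes that `el.coordinates` yields pixel values `(x1, y1, x2, y2)` measured from the top-left corner of the image with y increasing downward, which is the usual image convention and appears to be what layout-parser's rectangular blocks return; that assumption should be checked against the printed output. The function name `to_bottom_left_origin` and the `image_height` parameter are hypothetical, introduced only for illustration.

```python
def to_bottom_left_origin(coords, image_height):
    """Flip the y axis so that (0, 0) sits at the lower-left corner.

    Assumes `coords` is (x1, y1, x2, y2) with a top-left origin, where
    (x1, y1) is the upper-left corner and (x2, y2) the lower-right corner.
    """
    x1, y1, x2, y2 = coords
    # x values are unchanged; each y becomes its distance from the bottom edge.
    return (x1, image_height - y1, x2, image_height - y2)


# Example: a box in a 200-pixel-tall image.
flipped = to_bottom_left_origin((10, 20, 110, 70), image_height=200)
print(flipped)  # (10, 180, 110, 130)
```

After the flip, (x1, y1) is still the upper-left corner, so y1 is now larger than y2. If the target is PDF points rather than image pixels, the values would additionally need scaling by the ratio between the page size and the rendered image size.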

Sopralapanca, Feb 25 '25 16:02