latr
An error in dataset.py: create_features function
I think it should be:

    if use_ocr == True:
        entries = apply_ocr(img_path)
        bounding_box = entries["bbox"]
        words = entries["words"]
        bounding_box = list(map(lambda x: resize_align_bbox(x, width_old, height_old, width, height), bounding_box))

That is, the line

    bounding_box = list(map(lambda x: resize_align_bbox(x, width_old, height_old, width, height), bounding_box))

must be inside the if use_ocr == True: block.
This is because the OCRDATASET already resizes each box before it calls create_features:
    for i in sample_entry[1]['Blocks']:
        if i['BlockType'] == 'WORD' and i['Page'] == 1:
            words.append(i['Text'].lower())
            curr_box = i['Geometry']['BoundingBox']
            xmin, ymin, xmax, ymax = curr_box['Left'], curr_box['Top'], curr_box['Width'] + curr_box['Left'], curr_box['Height'] + curr_box['Top']
            curr_bbox = resize_align_bbox([xmin, ymin, xmax, ymax], 1, 1, width, height)
            coordinates.append(curr_bbox)

    ## Similar to the docformer's create_features function, but with some changes
    img, boxes, tokenized_words = create_features(image_path = tif_path,
                                                  tokenizer = self.tokenizer,
                                                  target_size = (1000, 1000),
                                                  use_ocr = False,
                                                  bounding_box = coordinates,
                                                  words = words
                                                  )
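So curr_bbox is already scaled to the (1000, 1000) target size. If create_features applies resize_align_bbox again when use_ocr is False, the boxes are rescaled a second time and no longer match the image.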
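
To make the problem concrete, here is a minimal sketch of the double scaling and of the proposed fix. The resize_align_bbox below is an assumed, simplified stand-in (plain proportional scaling with rounding), not the repository's actual implementation, and prepare_boxes is a hypothetical helper that only mirrors the branch in question. The 500x250 original image size is made up for illustration.

```python
def resize_align_bbox(bbox, orig_w, orig_h, target_w, target_h):
    # Assumed stand-in: scale [xmin, ymin, xmax, ymax] from the original
    # size to the target size and round to integer coordinates.
    x_scale = target_w / orig_w
    y_scale = target_h / orig_h
    left, top, right, bottom = bbox
    return [round(left * x_scale), round(top * y_scale),
            round(right * x_scale), round(bottom * y_scale)]


def prepare_boxes(bounding_box, use_ocr, width_old, height_old, width, height):
    # Hypothetical helper mirroring the proposed fix: resize only when the
    # boxes come from OCR in original-image coordinates. Boxes passed in by
    # the caller (use_ocr=False) are assumed to be in target coordinates.
    if use_ocr:
        bounding_box = [
            resize_align_bbox(b, width_old, height_old, width, height)
            for b in bounding_box
        ]
    return bounding_box


# A box in relative coordinates, as in the OCR JSON (values in [0, 1]).
normalized = [0.1, 0.2, 0.5, 0.6]

# What the dataset does: scale from (1, 1) to the (1000, 1000) target.
pre_scaled = resize_align_bbox(normalized, 1, 1, 1000, 1000)

# What happens if create_features resizes again using the image size.
double_scaled = resize_align_bbox(pre_scaled, 500, 250, 1000, 1000)

# With the resize moved inside the use_ocr branch, boxes pass through as-is.
fixed = prepare_boxes([pre_scaled], False, 500, 250, 1000, 1000)

print(pre_scaled)     # [100, 200, 500, 600]
print(double_scaled)  # [200, 800, 1000, 2400]
print(fixed)          # [[100, 200, 500, 600]]
```

In this sketch the double-scaled box ends up with ymax = 2400, well outside the 1000x1000 target, which is the kind of out-of-range coordinate the current code path would produce.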