In VGT, how do I convert a PNG image into a pkl-format grid?

Open yexujian opened this issue 1 year ago • 5 comments

How can I run VGT inference on my own document images? Specifically, how do I generate the pkl-format grid file from a PNG image?

yexujian avatar Jan 12 '24 06:01 yexujian

Refer to https://github.com/AlibabaResearch/AdvancedLiterateMachinery/issues/75 for the answer.

yashsandansing avatar Feb 27 '24 06:02 yashsandansing

I also need to know how to convert a PNG image into a pkl-format grid.

yekesy avatar Apr 09 '24 12:04 yekesy

I read the referenced issue #75. There is a template there for the grid that is used to generate the pkl file. Is that correct? Does it mean I can run OCR on the image and populate this template with the OCR results to make the pkl file?

dict_for_grid = { 'input_ids': [], 'bbox_subword_list': [], 'texts': [], 'bbox_texts_list': [], }

yekesy avatar Apr 09 '24 12:04 yekesy

The PR currently only handles creating the grid from a machine-readable PDF. For a PNG, you would most probably need to run OCR and use its output to populate the fields in dict_for_grid.

yashsandansing avatar Apr 09 '24 12:04 yashsandansing
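
For PNG inputs, here is a minimal sketch of the OCR route suggested above. It is not the repository's own code. It assumes the OCR step has already produced word-level results as (text, (x, y, width, height)) pairs, and that a tokenizer returns (subword, token_id) pairs for each word. The names build_grid, save_grid and make_toy_tokenizer are hypothetical, the (x, y, width, height) box layout is an assumption that should be checked against the code in issue #75, and splitting a word box evenly across its subwords is only one plausible choice. The toy tokenizer is a stand-in so the example runs on its own; in practice you would use the tokenizer that matches the VGT checkpoint.

```python
import pickle
from typing import Callable, Dict, List, Tuple

# Assumed box layout: (x, y, width, height) in pixels. Verify against VGT's loader.
Box = Tuple[float, float, float, float]
Piece = Tuple[str, int]  # (subword string, token id)


def build_grid(
    ocr_words: List[Tuple[str, Box]],
    tokenize: Callable[[str], List[Piece]],
) -> Dict[str, list]:
    """Fill the dict_for_grid template from word-level OCR results."""
    grid = {
        "input_ids": [],
        "bbox_subword_list": [],
        "texts": [],
        "bbox_texts_list": [],
    }
    for text, (x, y, w, h) in ocr_words:
        pieces = tokenize(text)
        if not pieces:
            continue  # skip words the tokenizer drops (e.g. empty strings)
        grid["texts"].append(text)
        grid["bbox_texts_list"].append((x, y, w, h))
        # Assumption: divide the word box evenly among its subwords.
        step = w / len(pieces)
        for i, (_, token_id) in enumerate(pieces):
            grid["input_ids"].append(token_id)
            grid["bbox_subword_list"].append((x + i * step, y, step, h))
    return grid


def save_grid(grid: Dict[str, list], path: str) -> None:
    """Serialize the grid dict to a pkl file."""
    with open(path, "wb") as f:
        pickle.dump(grid, f)


def make_toy_tokenizer(chunk: int = 4) -> Callable[[str], List[Piece]]:
    """Hypothetical stand-in tokenizer: fixed-size chunks, ids assigned on first use."""
    vocab: Dict[str, int] = {}

    def tokenize(word: str) -> List[Piece]:
        pieces = [word[i:i + chunk] for i in range(0, len(word), chunk)]
        return [(p, vocab.setdefault(p, len(vocab))) for p in pieces]

    return tokenize
```

To use it, pass the OCR output for one page to build_grid together with the tokenizer, then call save_grid with the output path that your VGT data loader expects for that image.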