AdvancedLiterateMachinery
In VGT, how do I convert a PNG image into a pkl-format grid?
In VGT, how can I run inference on my own document images? Specifically, how do I generate the pkl-format grid file from a PNG image?
Refer to https://github.com/AlibabaResearch/AdvancedLiterateMachinery/issues/75 for the answer.
I also need to know how to convert a PNG image into a pkl-format grid.
I read the referenced issue #75. It includes a template for the grid that is used to generate the pkl file. Is that correct? Does it mean I can run OCR on the image and populate this template with the OCR results to produce the pkl file?
dict_for_grid = { 'input_ids': [], 'bbox_subword_list': [], 'texts': [], 'bbox_texts_list': [], }
The PR currently only handles creating the grid from a machine-readable PDF. For a PNG you would most likely need to run OCR and use its output to populate the fields in dict_for_grid.
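As a rough illustration of that OCR route, here is a minimal sketch of filling dict_for_grid from OCR words and pickling the result. It is not the PR's code. The `(x, y, width, height)` box layout, the proportional split of a word box across its subwords, and the names `build_grid_dict`, `save_grid`, and `tokenize` are all assumptions made for illustration; check them against what VGT's grid loader actually expects.

```python
import pickle
from typing import Callable, Iterable, Sequence, Tuple

# Assumed box layout: (x, y, width, height) in image pixels.
Box = Tuple[float, float, float, float]


def build_grid_dict(
    words: Iterable[Tuple[str, Box]],
    tokenize: Callable[[str], Sequence[Tuple[str, int]]],
) -> dict:
    """Fill the dict_for_grid template from OCR output.

    `words` holds (text, box) pairs from any OCR engine.
    `tokenize` is a hypothetical callable that maps one word to a
    sequence of (subword, token_id) pairs, without continuation markers.
    """
    grid = {"input_ids": [], "bbox_subword_list": [], "texts": [], "bbox_texts_list": []}
    for text, (x, y, w, h) in words:
        pieces = list(tokenize(text))
        if not pieces:
            continue  # skip words the tokenizer drops entirely
        grid["texts"].append(text)
        grid["bbox_texts_list"].append((x, y, w, h))
        # Assumption: split the word box horizontally in proportion
        # to the character length of each subword.
        total = sum(max(len(piece), 1) for piece, _ in pieces)
        left = x
        for piece, token_id in pieces:
            width = w * max(len(piece), 1) / total
            grid["input_ids"].append(token_id)
            grid["bbox_subword_list"].append((left, y, width, h))
            left += width
    return grid


def save_grid(grid: dict, path: str) -> None:
    """Write the grid dict to a pkl file."""
    with open(path, "wb") as f:
        pickle.dump(grid, f)
```

In practice the `words` list could come from an OCR engine such as Tesseract, and `tokenize` could wrap the same tokenizer that the VGT config uses, so that the token ids match the model's vocabulary. Calling `save_grid(build_grid_dict(words, tokenize), "page.pkl")` would then write the file for one page.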