PPStructure: unable to recognize a fairly easy structure

Open vlavorini opened this issue 1 year ago • 0 comments

I am trying to parse this PDF using PaddleOCR 2.7.3.

I converted the pages to images and then ran PPStructure on them. I tried the following options:

engine = PPStructure(show_log=True, image_orientation=True)

engine = PPStructure(show_log=True, image_orientation=True, lan='en')

engine = PPStructure(show_log=True, image_orientation=True, lan='en', layout_model_dir='./picodet_lcnet_x1_0_fgd_layout_infer', layout_dict_path='./layout_publaynet_dict.txt')

but the results on the second page of the document are not satisfactory: page_1
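For reference, the workflow described above (render each PDF page as an image, then run PPStructure on it) could be sketched roughly as follows. This is a minimal sketch under assumptions, not the exact script used: PyMuPDF (`fitz`) is assumed for rendering, since the conversion tool is not named here; the helper names `render_zoom`, `pdf_to_bgr_pages`, `summarize_regions` and `analyze_pdf` are hypothetical; and the 200 DPI default is an arbitrary choice. The engine options are the first set listed above.

```python
from collections import Counter


def render_zoom(dpi):
    """PDF user space is 72 units per inch, so the zoom factor is dpi / 72."""
    return dpi / 72.0


def pdf_to_bgr_pages(pdf_path, dpi=200):
    """Render every page of a PDF to a BGR uint8 array (assumes PyMuPDF)."""
    # Third-party imports are kept local so the pure helpers in this
    # sketch can be used without the heavy dependencies installed.
    import fitz  # PyMuPDF
    import numpy as np

    zoom = render_zoom(dpi)
    pages = []
    with fitz.open(pdf_path) as doc:
        for page in doc:
            pix = page.get_pixmap(matrix=fitz.Matrix(zoom, zoom), alpha=False)
            rgb = np.frombuffer(pix.samples, dtype=np.uint8)
            rgb = rgb.reshape(pix.height, pix.width, 3)
            # Reverse the channel order: RGB from the renderer, BGR for OCR.
            pages.append(np.ascontiguousarray(rgb[:, :, ::-1]))
    return pages


def summarize_regions(result):
    """Count the detected layout regions per type (text, table, figure, ...)."""
    return Counter(region["type"] for region in result)


def analyze_pdf(pdf_path, dpi=200):
    """Run PPStructure on each rendered page and print a per-page summary."""
    from paddleocr import PPStructure

    engine = PPStructure(show_log=True, image_orientation=True)
    for page_no, image in enumerate(pdf_to_bgr_pages(pdf_path, dpi), start=1):
        result = engine(image)
        print(page_no, dict(summarize_regions(result)))
```

Calling `analyze_pdf("document.pdf")` (a placeholder path) prints how many regions of each type were detected per page, which makes it easier to see where the layout analysis of the second page goes wrong.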

I also tried the model ppyolov2_r50vd_dcn_365e_publaynet:

engine = PPStructure(show_log=True, image_orientation=True, lan='en',
                     layout_model_dir='./ppyolov2_r50vd_dcn_365e_publaynet',
                     layout_dict_path='./layout_publaynet_dict.txt')

but the program stops with an error: InvalidArgumentError: The size of Op(Conv) inputs should not be 0.

Any suggestions on how to parse this PDF correctly?

Thank you!

vlavorini, Apr 30 '24 12:04