PPStructure misses text that PaddleOCR does not miss

Open omeruth opened this issue 7 months ago • 7 comments

Problem Description

PPStructure misses text that PaddleOCR does not miss.

Runtime Environment

  • OS:
  • Paddle:
  • PaddleOCR:

Reproduction Code

from paddleocr import PaddleOCR, PPStructure

ocr = PaddleOCR(lang='en', use_angle_cls=True, use_gpu=True)

engine = PPStructure(show_log=True, image_orientation=True, structure_version='PP-StructureV2', recovery=True)

Complete Error Message

PaddleOCR output is good on plain text, but it messes up tables. So I tried PPStructure, which gives very good results for tables as well. However, I noticed that it tends to miss certain parts of a document completely, in places where PaddleOCR works fine. Is it possible to use PaddleOCR to extract text and tables separately without using PPStructure? Or is there a way to make PPStructure not miss any text, just like PaddleOCR? Thanks
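One possible workaround, offered as a sketch rather than a confirmed fix: run both engines on the same image, keep the PPStructure regions (tables included), and add back any PaddleOCR line that no layout region covers. The helper names below (`quad_to_box`, `covered_fraction`, `lines_missed_by_layout`) and the 0.5 coverage threshold are hypothetical. The code assumes the PaddleOCR 2.x output shapes: each OCR line is a four-point box plus text, and each PPStructure region carries a `bbox` of `[x1, y1, x2, y2]`.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # x1, y1, x2, y2


def quad_to_box(quad: Sequence[Sequence[float]]) -> Box:
    """Axis-aligned bounding box of a four-point text polygon."""
    xs = [p[0] for p in quad]
    ys = [p[1] for p in quad]
    return (min(xs), min(ys), max(xs), max(ys))


def covered_fraction(line: Box, region: Box) -> float:
    """Fraction of the line's area that lies inside the region."""
    w = min(line[2], region[2]) - max(line[0], region[0])
    h = min(line[3], region[3]) - max(line[1], region[1])
    if w <= 0 or h <= 0:
        return 0.0
    area = (line[2] - line[0]) * (line[3] - line[1])
    return (w * h) / area if area > 0 else 0.0


def lines_missed_by_layout(
    ocr_lines: Sequence[Tuple[Sequence[Sequence[float]], str]],
    regions: Sequence[Sequence[float]],
    min_cover: float = 0.5,  # hypothetical threshold, tune per document
) -> List[Tuple[Box, str]]:
    """Return OCR lines that no layout region covers by at least min_cover."""
    missed = []
    for quad, text in ocr_lines:
        box = quad_to_box(quad)
        if all(covered_fraction(box, tuple(r)) < min_cover for r in regions):
            missed.append((box, text))
    return missed


# Assumed wiring to the two engines (output shapes as in PaddleOCR 2.x;
# check them against the installed version before relying on this):
#   page = ocr.ocr(img, cls=True)[0] or []
#   ocr_lines = [(quad, text) for quad, (text, score) in page]
#   regions = [r["bbox"] for r in engine(img)]
#   extra_text = lines_missed_by_layout(ocr_lines, regions)
```

The lines returned here are the ones PPStructure dropped, so they can be appended to its output while the table regions stay untouched. This does not explain why the layout model skips those areas; it only patches the gap.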

Possible Solutions

Appendix

omeruth, Jul 15 '24 14:07