PaddleOCR issues

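When text from a PDF is pasted into a txt file, some abnormal characters appear slightly shorter than normal characters. What are these characters, and how can they be converted to normal ones?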

2

Please provide the following information to quickly locate the problem:
- System Environment:
- Version: Paddle: PaddleOCR: Related components:
- Command Code:
- Complete Error Message:

![image](https://github.com/PaddlePaddle/PaddleOCR/assets/41010669/b342bdc2-829d-4126-9adb-027f67ba437f)

nissansz

ModuleNotFoundError: No module named 'paddle'

2

```
(paddle_env) ☁ PaddleOCR [main] ⚡ python tools/infer_kie_token_ser_re.py \
  -o Architecture.Backbone.checkpoints=/Users/qyongkang/working/paddle/re_LayoutXLM_xfun_zh/ \
  Global.infer_img=/Users/qyongkang/working/paddle/form/ \
  -o_ser Architecture.Backbone.checkpoints=/Users/qyongkang/working/paddle/ser_LayoutXLM_xfun_zh/
Traceback (most recent call last):
  File "/Users/qyongkang/working/PaddleOCR/tools/infer_kie_token_ser_re.py", line 31, in <module>
    import paddle
ModuleNotFoundError: No...
```

seeeyou
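A frequent cause of this kind of error is that the script runs under a different Python interpreter than the one where the package was installed. The following is a minimal standard-library sketch for checking that; the helper names `is_importable` and `describe_environment` are illustrative assumptions and are not part of PaddleOCR.

```python
import importlib.util
import sys


def is_importable(module_name: str) -> bool:
    """Return True if the current interpreter can locate the top-level module."""
    return importlib.util.find_spec(module_name) is not None


def describe_environment(module_name: str = "paddle") -> str:
    """Report which interpreter is running and whether it can see the module."""
    status = "found" if is_importable(module_name) else "NOT found"
    return f"{sys.executable}: module '{module_name}' {status}"


if __name__ == "__main__":
    # Run this with the same `python` that launches the failing script.
    print(describe_environment("paddle"))
```

If the module is reported as not found, the interpreter path in the output shows which environment would need the package installed.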

Code PR is needed

Fine-tuning det_v4: hmean is very low

16

When fine-tuning det_v4, the loss dropped from 4.9 to 1.2, but precision, recall, and hmean are all very low, roughly 0.04. What could cause this?

ymy1005
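For context on the metrics in the question above, hmean is the harmonic mean of precision and recall, both of which are computed from predicted boxes matched against ground-truth boxes rather than from the training loss. A minimal sketch of the arithmetic follows; the function names and the example counts are assumptions for illustration, not values from the issue.

```python
from typing import Tuple


def hmean(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def detection_metrics(num_matched: int, num_predicted: int,
                      num_ground_truth: int) -> Tuple[float, float, float]:
    """Compute (precision, recall, hmean) from box counts."""
    precision = num_matched / num_predicted if num_predicted else 0.0
    recall = num_matched / num_ground_truth if num_ground_truth else 0.0
    return precision, recall, hmean(precision, recall)


if __name__ == "__main__":
    # Hypothetical counts: 4 of 100 predicted boxes match 100 labeled boxes.
    print(detection_metrics(4, 100, 100))
```

With counts like these, all three values come out near 0.04, which shows that very few predictions are being matched to labels regardless of how low the loss is.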

No text is detected when using the GPU, but the CPU works

I am using the official default models. With the GPU no text is detected, but with the CPU it works (this happens with both the official sample images and my own images).
- System Environment: Windows 10
- Version: PaddlePaddle-GPU: 2.5.2 PaddleOCR: 2.7.0.2

Output on CPU (`paddleocr --image_dir .\pages\01.jpg --use_gpu false`):
```
[2024/05/14 08:17:25] ppocr DEBUG: dt_boxes num : 80, elapse : 1.6079998016357422
[2024/05/14 08:18:30] ppocr DEBUG: rec_res...
```

sdflkjssl
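When comparing CPU and GPU runs like the one above, it can help to pull the detected-box counts out of the debug logs. The sketch below assumes the `dt_boxes num : N` log format shown in the CPU output; the GPU log line is a hypothetical placeholder, not taken from the issue, and the function name is illustrative.

```python
import re
from typing import List

# Matches the box count in lines such as "ppocr DEBUG: dt_boxes num : 80, ..."
DT_BOXES_PATTERN = re.compile(r"dt_boxes num : (\d+)")


def count_detected_boxes(log_text: str) -> List[int]:
    """Return every dt_boxes count found in the log text, in order."""
    return [int(n) for n in DT_BOXES_PATTERN.findall(log_text)]


# CPU line copied from the issue preview.
CPU_LOG = ("[2024/05/14 08:17:25] ppocr DEBUG: dt_boxes num : 80, "
           "elapse : 1.6079998016357422")
# Hypothetical GPU line, for illustration only.
GPU_LOG = "ppocr DEBUG: dt_boxes num : 0, elapse : 0.5"

if __name__ == "__main__":
    print("CPU:", count_detected_boxes(CPU_LOG))
    print("GPU:", count_detected_boxes(GPU_LOG))
```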

PP-OCRv4 online demo and offline inference give inconsistent results

8

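I cannot find out which model version the online demo uses. I tried the ch_PP-OCRv4 infer and server_infer versions of the det model, as well as the infer and server_infer versions of the rec model, and all of them performed worse than the online demo. For example, for this image: ![香菇粉](https://github.com/PaddlePaddle/PaddleOCR/assets/153589998/5051ea6e-440a-4586-8bb6-06e4ff5322ce)
- Online result: Black Sesame Mushroom Powder 黑芝麻香菇粉 ESTLADY
- Offline result: ('Black', 0.9862731099128723) ('Desame', 0.8805373311042786) ('Mushroom Powder', 0.9594960808753967) ('黑芝麻香菇粉', 0.9991157054901123)

Comparing the two, the offline version misrecognizes Sesame as Desame and misses NESTLADY.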

linssonSUSUSU

PaddleOCR model training

2

I used a pretrained model for PaddleOCR training, but acc is zero in the early stage of training. Is this normal?

Darren0465

structure_table deployed with hubserving performs much worse than the general table recognition demo

4

Please provide the following information to quickly locate the problem:
- System Environment: Linux
- Version: Paddle: PaddleOCR: Related components:
- Command Code:
- Complete Error Message: I deployed the structure_table interface with hubserving, using the model `ch_ppstructure_mobile_v2.0_SLANet`. For the same table image the recognition result is mediocre, yet the general [table recognition demo](https://aistudio.baidu.com/community/app/91661/webUI) recognizes it correctly. Why is that?

Tongjilibo

Training PaddleOCR on my own data: text detection produces no results!

4

Please provide the following information to quickly locate the problem:
- System Environment:
- Version: Paddle: PaddleOCR: Related components:
- Command Code:
- Complete Error Message:

1. Environment: win10, GPU, cuda11.2, cudnn8.2, PaddleOCR 2.7
2. Data: annotated with PPOCRLabel, and it displays without problems;
3. Training: ...

gengyanlei

Wide character spacing looks like spaces, so blank regions are misrecognized as wrong characters. How can this be solved?

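Please provide the following information to quickly locate the problem:
- System Environment:
- Version: Paddle: PaddleOCR: 2.7.5 Related components:
- Command Code:
- Complete Error Message:
- Wide character spacing looks like spaces, so blank regions are misrecognized as wrong characters. How can this be solved? How can I determine whether the character spacing is large, so that I can squeeze the image horizontally to reduce the spacing and improve the result?

Original image ![image](https://github.com/PaddlePaddle/PaddleOCR/assets/41010669/9e6e8751-009c-4e2b-b8fb-597b54edf48e) The recognition result sometimes contains extra wrong characters...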

nissansz
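One possible way to judge whether the spacing in a text-line image is large, as asked above, is to measure the runs of empty columns between inked columns. The sketch below is an assumption-based heuristic, not a PaddleOCR feature: it expects an already binarized image as a 2D list (1 for ink, 0 for background), and the names `column_gaps` and `spacing_ratio` are hypothetical.

```python
import statistics
from typing import List


def column_gaps(binary_rows: List[List[int]]) -> List[int]:
    """Widths of empty-column runs between inked columns.

    Leading and trailing empty margins are ignored.
    """
    if not binary_rows:
        return []
    width = len(binary_rows[0])
    has_ink = [any(row[x] for row in binary_rows) for x in range(width)]
    inked = [x for x, flag in enumerate(has_ink) if flag]
    if not inked:
        return []
    gaps, run = [], 0
    for x in range(inked[0], inked[-1] + 1):
        if has_ink[x]:
            if run:
                gaps.append(run)
            run = 0
        else:
            run += 1
    return gaps


def spacing_ratio(binary_rows: List[List[int]]) -> float:
    """Median gap width divided by image height (0.0 if there are no gaps)."""
    gaps = column_gaps(binary_rows)
    if not gaps:
        return 0.0
    return statistics.median(gaps) / len(binary_rows)


# Tiny synthetic two-row "image" used only to exercise the functions.
SAMPLE = [[0, 1, 0, 0, 0, 1, 1, 0, 1, 0]] * 2

if __name__ == "__main__":
    print(column_gaps(SAMPLE), spacing_ratio(SAMPLE))
```

A ratio well above what is seen on normally spaced lines could be used as a trigger for shrinking the image width before recognition; the threshold would have to be tuned on real data.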

Error with pyclipper inhomogeneous expanded array

1

With `det_box_type='poly'`, for some images `offset.Execute(distance)` can return an inhomogeneous (ragged) list of detection boxes, which `np.array(...)` cannot cast directly into a NumPy array. Due to this...

zovelsanj

contributor
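One possible workaround for a ragged offset result, sketched here in plain Python, is to select a single polygon (for example the one with the largest area) before any array conversion. This is an illustration under assumptions and not necessarily the fix proposed in the issue: `expanded` stands in for the list of paths returned by `offset.Execute(distance)`, and the helper names are hypothetical.

```python
from typing import List, Sequence

Point = Sequence[float]
Polygon = Sequence[Point]


def polygon_area(polygon: Polygon) -> float:
    """Absolute area of a simple polygon via the shoelace formula."""
    total = 0.0
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def largest_polygon(expanded: Sequence[Polygon]) -> List[List[float]]:
    """Pick the largest-area polygon from a possibly ragged list of paths."""
    if not expanded:
        return []
    best = max(expanded, key=polygon_area)
    return [[float(x), float(y)] for x, y in best]


# Hypothetical ragged result: a 4-point square and a 3-point triangle.
EXPANDED = [
    [[0, 0], [4, 0], [4, 4], [0, 4]],
    [[0, 0], [1, 0], [0, 1]],
]

if __name__ == "__main__":
    print(largest_polygon(EXPANDED))
```

Because the selected polygon is a single list of `[x, y]` pairs, it has a regular shape and can be converted to an array on its own.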
