PaddleOCR

Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and de...

Results 1088 PaddleOCR issues

I am using the tablebank layout detector and the ocr model of paddleocr to detect tables in an image and extract the text in the detected table to a csv...

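__Description:__ After running the PaddleOCR package for 2 hours on a CPU machine, I hit a memory leak. Images are processed in a loop, yet OCR memory usage keeps growing and nothing is released, until memory is exhausted. __Environment:__ PaddleOCR version: 2.73 Python version: 3.11.8 OS: Windows 10 + the following packages installed: paddlepaddle==2.6.1 -i https://pypi.tuna.tsinghua.edu.cn/simpleol paddleocr==2.73 hanzidentifier==1.1.0 pillow==10.3.0 fastapi[all]==0.110.3 __Expected behavior:__ Memory usage during OCR processing should stay stable or grow only gradually, and should be released after each image is processed, so that memory is not exhausted.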

bug
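One way to narrow down a report like the one above is to record heap size after every image and check whether it keeps climbing. The sketch below is only an illustration under stated assumptions: `process` is a hypothetical stand-in for whatever per-image OCR call is being used, and `tracemalloc` only sees allocations made through Python's allocators, so growth inside a native inference runtime would not show up here and would need a process-level measurement instead.

```python
import gc
import tracemalloc
from typing import Callable, Iterable, List


def track_python_memory(process: Callable[[str], object],
                        image_paths: Iterable[str]) -> List[int]:
    """Return the traced Python heap size (bytes) after each processed image."""
    tracemalloc.start()
    sizes: List[int] = []
    try:
        for path in image_paths:
            process(path)   # hypothetical per-image call; result is discarded
            gc.collect()    # drop unreachable objects before measuring
            current, _peak = tracemalloc.get_traced_memory()
            sizes.append(current)
    finally:
        tracemalloc.stop()
    return sizes


def grows_steadily(sizes: List[int], tolerance: int = 64 * 1024) -> bool:
    """True if the last measurement exceeds the first by more than `tolerance` bytes."""
    return len(sizes) >= 2 and sizes[-1] - sizes[0] > tolerance
```

If `grows_steadily` stays false while the process's resident memory still climbs, the growth is probably outside the Python heap, which is useful information to attach to the issue.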

PaddleOCR seems to be a very nice way to OCR documents. There is a project called ocrmypdf https://github.com/ocrmypdf/OCRmyPDF which has a plugin system, where hOCR-compliant OCR engines can be integrated (it is...

Code PR is needed
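To make the hOCR requirement in the request above concrete, here is a minimal sketch that turns recognized text lines into an hOCR page. It assumes each line arrives as `(text, quad)` where `quad` is four `[x, y]` corner points; that input shape and the function names are assumptions for illustration, and nothing here uses the OCRmyPDF plugin interface itself.

```python
from html import escape
from typing import List, Sequence, Tuple

# Assumed input shape: (recognized text, four [x, y] corner points).
Line = Tuple[str, Sequence[Sequence[float]]]


def quad_to_bbox(quad: Sequence[Sequence[float]]) -> Tuple[int, int, int, int]:
    """Collapse a four-point polygon into an axis-aligned x0, y0, x1, y1 box."""
    xs = [int(round(p[0])) for p in quad]
    ys = [int(round(p[1])) for p in quad]
    return min(xs), min(ys), max(xs), max(ys)


def lines_to_hocr(lines: List[Line], page_width: int, page_height: int) -> str:
    """Emit one ocr_page containing one ocr_line span per recognized line."""
    parts = [
        "<!DOCTYPE html>",
        "<html><head><meta charset='utf-8'/>",
        "<meta name='ocr-capabilities' content='ocr_page ocr_line'/>",
        "</head><body>",
        f"<div class='ocr_page' title='bbox 0 0 {page_width} {page_height}'>",
    ]
    for text, quad in lines:
        x0, y0, x1, y1 = quad_to_bbox(quad)
        parts.append(
            f"<span class='ocr_line' title='bbox {x0} {y0} {x1} {y1}'>"
            f"{escape(text)}</span>"
        )
    parts += ["</div>", "</body></html>"]
    return "\n".join(parts)
```

The sketch stops at line level; word-level spans and confidence values would be natural additions if the recognizer exposes them.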

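## Background Following the requirements collection https://github.com/PaddlePaddle/PaddleOCR/issues/10334 and the weekly technical seminar https://github.com/PaddlePaddle/PaddleOCR/issues/10223, we settled on the task of adding a rare-character model. ## Steps 1. Replace the existing dictionary txt with a dictionary expanded to cover the Table of General Standard Chinese Characters (《通用规范汉字表》). 2. Balance the corpus on the existing datasets through data synthesis, copy-paste, and similar methods, then retrain the PPOCRV3 detection and recognition models. 3. Compare the retrained models' detection and recognition accuracy on common and rare characters against the best PPOCRV3 model; the target is unchanged or higher accuracy on common characters and a further gain on rare characters. 5. Submit a PR to ppocr that replaces the best model.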
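Steps 1 and 2 above can be sketched roughly as follows, assuming the dictionary is a plain text file with one character per line. The function names are hypothetical. New characters are appended after the existing ones so that existing entries keep their positions, and a simple count shows which added characters are still rare in the corpus and may need synthesis.

```python
from collections import Counter
from typing import Dict, Iterable, List


def merge_char_dicts(existing: Iterable[str], extra: Iterable[str]) -> List[str]:
    """Append unseen characters from `extra`, keeping the existing order intact."""
    merged: List[str] = []
    seen = set()
    for raw in list(existing) + list(extra):
        ch = raw.rstrip("\r\n")  # strip only line endings; a space may be a valid entry
        if ch and ch not in seen:
            seen.add(ch)
            merged.append(ch)
    return merged


def char_frequencies(corpus_lines: Iterable[str], chars: Iterable[str]) -> Dict[str, int]:
    """Count how often each character of interest occurs in the training text."""
    counts = Counter()
    for line in corpus_lines:
        counts.update(line)
    return {ch: counts.get(ch, 0) for ch in chars}
```

Characters that come back with a count of zero or near zero are the ones a synthesis or copy-paste step would need to cover before retraining.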

- System Environment: - Version: Paddle: - PaddleOCR: Related components: - Command Code: ``` FROM registry.baidubce.com/paddlepaddle/fastdeploy:1.0.7-gpu-cuda11.4-trt8.5-21.10 COPY ./models-gpu.tar /ocr_serving/ WORKDIR /ocr_serving RUN tar -xf models-gpu.tar RUN rm models-gpu.tar EXPOSE 8000 CMD...

Hi, I am trying to run the following code: ``` python3 table/predict_table.py --image_dir=/scratch/rrs99/PaddleOCR/ppstructure/page_4.jpg \ --det_limit_side_len=736 \ --rec_model_dir=/scratch/rrs99/PaddleOCR/ppstructure/inference/en_ppocr_mobile_v2.0_table_rec_infer \ --table_model_dir=/scratch/rrs99/PaddleOCR/ppstructure/inference/en_ppocr_mobile_v2.0_table_structure_infer \ --det_model_dir=/scratch/rrs99/PaddleOCR/ppstructure/inference/en_ppocr_mobile_v2.0_table_det_infer \ --rec_char_dict_path=/scratch/rrs99/PaddleOCR/ppocr/utils/dict/table_dict.txt \ --table_char_dict_path=/scratch/rrs99/PaddleOCR/ppocr/utils/dict/table_structure_dict.txt \ --det_limit_type=min \ --output=/scratch/rrs99/PaddleOCR/ppstructure/output/table ```...

Please provide the following information to quickly locate the problem - System Environment: Python 3.12 - Version - Paddle: 2.6.1 - PaddleOCR: 2.7.3 - PaddleNLP: 2.6.1 (also tried 2.8.0, 2.5.x, 2.7.x) - Related...

bug

The model list at https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/doc/doc_ch/models_list.md mentions configuration files. For example, the configuration file for the ch_PP-OCRv4_server_det model is: https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/configs/det/ch_PP-OCRv4/ch_PP-OCRv4_det_teacher.yml However, the official inference script does not mention that a configuration file needs to be specified: ```shell python3 predict_system.py \ --image_dir=./docs/table/1.png \ --det_model_dir=inference/en_PP-OCRv3_det_infer \ --rec_model_dir=inference/en_PP-OCRv3_rec_infer \ --rec_char_dict_path=../ppocr/utils/en_dict.txt \ --table_model_dir=inference/en_ppstructure_mobile_v2.0_SLANet_infer \ --table_char_dict_path=../ppocr/utils/dict/table_structure_dict.txt \ --layout_model_dir=inference/picodet_lcnet_x1_0_fgd_layout_infer \ --layout_dict_path=../ppocr/utils/dict/layout_dict/layout_publaynet_dict.txt \ --vis_font_path=../doc/fonts/simfang.ttf \ --recovery=True \ --output=../output/...

Please provide the following information to quickly locate the problem - System Environment: Windows 10 - Version: Paddle: 2.4.2.post116 PaddleOCR: 2.7.3 - Related components: - Command Code: - Complete Error Message: [2024/05/22 14:06:59] ppocr...

bug