
Problem with crop by polys when using OCR

Open TimothyZero opened this issue 6 months ago • 4 comments

🔎 Search before asking

  • [x] I have searched the PaddleOCR Docs and found no similar bug report.
  • [x] I have searched the PaddleOCR Issues and found no similar bug report.
  • [x] I have searched the PaddleOCR Discussions and found no similar bug report.

🐛 Bug (Problem description)

PaddleX's crop by polys can produce a point set with fewer than three points during sample points on bbox, which ultimately causes the OCR crop to fail.

🏃‍♂️ Environment

All at version 3.0.0.

🌰 Minimal Reproducible Example

sample_points_on_bbox(np.array([[436, 188], [432, 188], [432, 191], [427, 194], [424, 194], [416, 188], [403, 188], [399, 185], [399, 183], [398, 181], [396, 178], [399, 177], [436, 177], [439, 180], [439, 185]]))

TimothyZero, Jun 07 '25 08:06

Hello, could you describe the problem in a bit more detail?

cuicheng01, Jun 07 '25 09:06

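When I run OCR with text_type set to seal, CropByPolys is used to crop the text lines, and the cropped lines are then sent to rec to get the final recognition result. The problem is that CropByPolys.sample_points_on_bbox downsamples the polygon points, which reduces some perfectly normal polygons to 3 points. A 3-point polygon cannot be cropped, so this assertion fails: assert len(points) == 4, "shape of points must be 4*2"

TimothyZero, Jun 08 '25 01:06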

Hello, could you share the code and the image? We will try to reproduce and investigate.

Sunting78, Jun 10 '25 07:06

I can't easily share the image and code, but you can run this example directly: sample_points_on_bbox(np.array([[436, 188], [432, 188], [432, 191], [427, 194], [424, 194], [416, 188], [403, 188], [399, 185], [399, 183], [398, 181], [396, 178], [399, 177], [436, 177], [439, 180], [439, 185]])). This code should be guaranteed to output at least 4 points.

TimothyZero, Jun 12 '25 07:06
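
The requirement stated in the last comment (at least 4 points out of the sampling step) could be enforced on the caller's side with a guard like the one below. This is only a sketch of a possible workaround, not PaddleX's code and not a confirmed fix: `ensure_min_points` and `MIN_POINTS` are hypothetical names, and falling back to the axis-aligned bounding box of the original polygon is an assumption about what an acceptable crop region would be. The 3-point `degenerate` list merely stands in for a too-short sampling result; it is not the actual output of sample_points_on_bbox.

```python
MIN_POINTS = 4  # the crop step in the report asserts on 4 points


def ensure_min_points(sampled, original):
    """Return `sampled` unchanged if it has at least MIN_POINTS points.

    Otherwise fall back to the axis-aligned bounding box of `original`
    (hypothetical fallback, chosen only for illustration).
    """
    sampled = [tuple(p) for p in sampled]
    if len(sampled) >= MIN_POINTS:
        return sampled

    xs = [p[0] for p in original]
    ys = [p[1] for p in original]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    # Corners in the order top-left, top-right, bottom-right, bottom-left
    # (image coordinates, y grows downward).
    return [(x_min, y_min), (x_max, y_min), (x_max, y_max), (x_min, y_max)]


# Polygon taken from the minimal reproducible example in this thread.
polygon = [[436, 188], [432, 188], [432, 191], [427, 194], [424, 194],
           [416, 188], [403, 188], [399, 185], [399, 183], [398, 181],
           [396, 178], [399, 177], [436, 177], [439, 180], [439, 185]]

# Stand-in for a sampling result that is too short to crop.
degenerate = polygon[:3]

box = ensure_min_points(degenerate, polygon)
print(box)  # [(396, 177), (439, 177), (439, 194), (396, 194)]
```

A bounding box is a coarser region than the original seal polygon, so a fix inside the sampling routine itself would likely preserve more of the shape; the guard only avoids the assertion failure.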