
added sliding window for large image inference

Open aspaul20 opened this issue 9 months ago • 11 comments

PaddleOCR does not work on large documents/images. This feature adds a sliding window inference method: it slices the input image into windows and runs detection+recognition on each slice. It takes longer (expectedly), but unlike the default code, it gives correct results. The vertical and horizontal strides are adjustable by the user.
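For readers who want a picture of the slicing step, here is a minimal sketch of how window coordinates could be generated. It is not the code from this PR: the names `sliding_windows`, `window_starts`, and `to_global`, as well as the explicit window-size parameters, are assumptions made for illustration, and no PaddleOCR call is shown. Each window would be cropped, passed to detection+recognition, and the resulting box points shifted back into full-image coordinates.

```python
from typing import Iterator, List, Tuple

Window = Tuple[int, int, int, int]  # (x1, y1, x2, y2) in full-image pixels


def window_starts(length: int, window: int, stride: int) -> List[int]:
    """Start offsets along one axis so that the windows cover the whole axis."""
    if length <= window:
        return [0]
    starts = list(range(0, length - window + 1, stride))
    # Add a final window flush with the edge if the stride left a gap.
    if starts[-1] != length - window:
        starts.append(length - window)
    return starts


def sliding_windows(width: int, height: int, win_w: int, win_h: int,
                    stride_x: int, stride_y: int) -> Iterator[Window]:
    """Yield crop rectangles row by row, left to right."""
    for y in window_starts(height, win_h, stride_y):
        for x in window_starts(width, win_w, stride_x):
            yield (x, y, min(x + win_w, width), min(y + win_h, height))


def to_global(points, offset_x: int, offset_y: int):
    """Shift box points detected inside a crop back to full-image coordinates."""
    return [(px + offset_x, py + offset_y) for px, py in points]


if __name__ == "__main__":
    for win in sliding_windows(10, 4, 4, 4, 3, 3):
        print(win)
```

A stride smaller than the window size makes neighbouring windows overlap, which lowers the chance of a text line being cut at a window border at the cost of more inference calls and duplicate detections.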

Output on an image of dimensions (5088x3600):

Without sliding window: Screenshot from 2024-05-21 17-06-11

With sliding window: Screenshot from 2024-05-21 17-07-11

Note: If needed, a postprocessing step could be added that merges adjacent detections into one.
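One possible shape for such a merge step is sketched below. This is an assumption, not part of the PR: it treats detections as axis-aligned boxes `(x1, y1, x2, y2)` in full-image coordinates and repeatedly unions boxes that overlap or lie within a hypothetical `gap` of each other. It handles only the boxes; the recognized text of merged detections would still need to be combined separately.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def boxes_touch(a: Box, b: Box, gap: float = 0) -> bool:
    """True if the two boxes overlap or are within `gap` pixels on both axes."""
    return (a[0] - gap <= b[2] and b[0] - gap <= a[2]
            and a[1] - gap <= b[3] and b[1] - gap <= a[3])


def merge_boxes(boxes: Sequence[Box], gap: float = 0) -> List[Box]:
    """Union touching boxes until no pair can be merged any further."""
    merged = [tuple(b) for b in boxes]
    changed = True
    while changed:
        changed = False
        result: List[Box] = []
        for box in merged:
            for i, other in enumerate(result):
                if boxes_touch(box, other, gap):
                    # Replace the kept box with the bounding box of the pair.
                    result[i] = (min(box[0], other[0]), min(box[1], other[1]),
                                 max(box[2], other[2]), max(box[3], other[3]))
                    changed = True
                    break
            else:
                result.append(box)
        merged = result
    return merged


if __name__ == "__main__":
    print(merge_boxes([(0, 0, 10, 10), (8, 0, 20, 10), (100, 100, 110, 110)]))
```

A plain bounding-box union like this can join separate text lines when `gap` is large, so a real implementation would likely also compare the vertical overlap of the boxes before merging.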

aspaul20 • May 21 '24 12:05