Characters such as '-', '$', '€' not being detected by paddleocr

Open · saanvib13 opened this issue 9 months ago · 1 comment

I am using the TableBank layout detector together with the PaddleOCR OCR model to detect tables in an image and write the text of each detected table to a CSV file. My code looks like this:

    import layoutparser as lp

    # Detect table regions with the TableBank layout model
    model1 = lp.PaddleDetectionLayoutModel(
        config_path="lp://TableBank/ppyolov2_r50vd_dcn_365e_tableBank_word/config",
        threshold=0.8,
        label_map={0: "Table"},
        enforce_cpu=False,
        enable_mkldnn=True,
    )
    layout1 = model1.detect(img)

    # Run PaddleOCR whenever a table block is found
    for l in layout1:
        if l.type == "Table":
            output = self.ocr.ocr(loaded_image)[0]

When I run this on an image, it detects the table and its text, as shown here: [image: WhatsApp Image 2024-05-23 at 5 15 24 PM]

However, it sometimes omits special characters such as '-', '$', and '€'. When I save the output to a CSV file, the result looks like this: [image: WhatsApp Image 2024-05-23 at 5 16 51 PM]
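
For context, the CSV step can be sketched roughly as follows. This is a minimal sketch, not the exact code from my project: the helper names `ocr_lines_to_rows` and `rows_to_csv` and the `row_tolerance` value are hypothetical, and it assumes the OCR output for one image is a list of `[box, (text, score)]` entries, where `box` holds four `[x, y]` corner points. It only regroups whatever text the OCR returned, so a character that is already missing in the OCR output will also be missing in the CSV.

```python
import csv
import io


def ocr_lines_to_rows(ocr_lines, row_tolerance=10):
    """Group OCR lines into table rows by vertical position.

    Assumption: ocr_lines looks like [[box, (text, score)], ...],
    with box being four [x, y] corner points.
    """
    items = []
    for box, (text, _score) in ocr_lines:
        x_left = min(point[0] for point in box)
        y_center = sum(point[1] for point in box) / len(box)
        items.append((y_center, x_left, text))
    items.sort()  # top to bottom, then left to right

    rows = []  # each entry: [reference_y, [(x_left, text), ...]]
    for y_center, x_left, text in items:
        # Same row if the vertical centre is close to the row's first cell
        if rows and abs(y_center - rows[-1][0]) <= row_tolerance:
            rows[-1][1].append((x_left, text))
        else:
            rows.append([y_center, [(x_left, text)]])

    # Order the cells of each row from left to right
    return [[text for _, text in sorted(cells)] for _, cells in rows]


def rows_to_csv(rows):
    """Serialize rows to CSV text using the standard csv module."""
    buffer = io.StringIO()
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue()
```

With this, `rows_to_csv(ocr_lines_to_rows(output))` would give the CSV text for the `output` list from the snippet above, under the stated assumption about its structure.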

The negative sign is detected in some cases and omitted in others. How can I resolve this inconsistency?

saanvib13 · May 23 '24 11:05