
ppocr training KIE model: invoice recognition training, data errors

Open AlenChuan opened this issue 2 years ago • 1 comments

Please help: how can I resolve the following 3 errors?

1.ppocr ERROR: When parsing line invoice0.jpg [{"transcription": "142011471003", "label":"invoice_code","points": [[115, 93], [197, 93], [197, 105], [115, 105]],"id": 1, "linking": [[1, 2]]}, {"transcription": "13435004", "label":"invoice_no","points": [[117, 110], [184, 110], [184, 123], [117, 123]],"id": 2, "linking": [[2, 3]]}, {"transcription": "A-XX612", "label":"car_no","points": [[149, 177], [201, 177], [201, 191], [149, 191]],"id": 3, "linking": [[3, 4]]}, {"transcription": "035409", "label":"certificate_no","points": [[155, 194], [200, 194], [200, 206], [155, 206]],"id": 4, "linking": [[4, 5]]}, {"transcription": "2040年00月00日", "label":"date","points": [[122, 206], [198, 206], [198, 222], [122, 222]],"id": 5, "linking": [[5, 6]]}, {"transcription": "00:00", "label":"pickup_time","points": [[161, 224], [200, 224], [200, 236], [161, 236]],"id": 6, "linking": [[6, 7]]}, {"transcription": "03:14", "label":"getoff_time","points": [[162, 238], [199, 238], [199, 250], [162, 250]],"id": 7, "linking": [[7, 8]]}, {"transcription": "3.00元", "label":"unit_price","points": [[162, 252], [199, 252], [199, 265], [162, 265]],"id": 8, "linking": [[8, 9]]}, {"transcription": "361.63km", "label":"mileage","points": [[143, 266], [197, 266], [197, 281], [143, 281]],"id": 9, "linking": [[9, 10]]}, {"transcription": "01:13.28", "label":"waiting_time","points": [[142, 282], [198, 282], [198, 293], [142, 293]],"id": 10, "linking": [[10, 11]]}, {"transcription": "1111.00元", "label":"price","points": [[143, 296], [199, 296], [199, 311], [143, 311]],"id": 11, "linking": [[11, 12]]}, {"transcription": "00000000", "label":"card_no","points": [[143, 313], [199, 313], [199, 324], [143, 324]],"id": 12, "linking": [[12, 13]]}, {"transcription": "0.0元", "label":"original_amount","points": [[166, 327], [198, 327], [198, 340], [166, 340]],"id": 13, "linking": [[13, 14]]}, {"transcription": "0.0元", "label":"remaining_amount","points": [[169, 357], [197, 357], [197, 345], [169, 345]],"id": 14, 
"linking": [[14, 15]]}], error happened with msg: Traceback (most recent call last):
  File "D:\BOCFT\2022-07-07~ocr\ocr-invoice\PaddleOCR\ppocr\data\simple_dataset.py", line 137, in __getitem__
    outs = transform(data, self.ops)
  File "D:\BOCFT\2022-07-07~ocr\ocr-invoice\PaddleOCR\ppocr\data\imaug\__init__.py", line 56, in transform
    data = op(data)
  File "D:\BOCFT\2022-07-07~ocr\ocr-invoice\PaddleOCR\ppocr\data\imaug\label_ops.py", line 1093, in __call__
    gt_label = self._parse_label(label, encode_res)
  File "D:\BOCFT\2022-07-07~ocr\ocr-invoice\PaddleOCR\ppocr\data\imaug\label_ops.py", line 1177, in _parse_label
    gt_label.append(self.label2id_map[("b-" + label).upper()])
KeyError: 'B-INVOICE_CODE'

2.Fatal Python error: Cannot recover from stack overflow.

3.Current thread 0x00001924 (most recent call first):
  File "E:\anaconda\envs\ocrpy37\lib\abc.py", line 139 in __instancecheck__
  File "E:\anaconda\envs\ocrpy37\lib\_collections_abc.py", line 839 in update
  File "E:\anaconda\envs\ocrpy37\lib\collections\__init__.py", line 1018 in __init__
  File "E:\anaconda\envs\ocrpy37\lib\site-packages\paddlenlp\transformers\tokenizer_utils_base.py", line 206 in __init__
  File "E:\anaconda\envs\ocrpy37\lib\site-packages\paddlenlp\transformers\tokenizer_utils_base.py", line 2614 in pad
  File "E:\anaconda\envs\ocrpy37\lib\site-packages\paddlenlp\transformers\tokenizer_utils_base.py", line 2843 in prepare_for_model
  File "E:\anaconda\envs\ocrpy37\lib\site-packages\paddlenlp\transformers\tokenizer_utils.py", line 1037 in encode_plus
  File "E:\anaconda\envs\ocrpy37\lib\site-packages\paddlenlp\transformers\tokenizer_utils_base.py", line 2349 in encode
  File "D:\BOCFT\2022-07-07~ocr\ocr-invoice\PaddleOCR\ppocr\data\imaug\label_ops.py", line 1068 in __call__
  File "D:\BOCFT\2022-07-07~ocr\ocr-invoice\PaddleOCR\ppocr\data\imaug\__init__.py", line 56 in transform
  File "D:\BOCFT\2022-07-07~ocr\ocr-invoice\PaddleOCR\ppocr\data\simple_dataset.py", line 137 in __getitem__
  File "D:\BOCFT\2022-07-07~ocr\ocr-invoice\PaddleOCR\ppocr\data\simple_dataset.py", line 147 in __getitem__
  ……

Thread 0x00002e18 (most recent call first):
  File "E:\anaconda\envs\ocrpy37\lib\site-packages\paddle\fluid\dataloader\dataloader_iter.py", line 272 in __next__
  File "tools/train.py", line 192 in test_reader
  File "tools/train.py", line 209 in

AlenChuan, Oct 27 '22 09:10

The errors above appear when I run this command: python tools/train.py -c configs/kie/vi_layoutxlm/ser_vi_layoutxlm_xfund_zh.yml -o Global.save_model_dir=./output/kie/

AlenChuan, Oct 27 '22 09:10

Your annotation data probably contains the label INVOICE_CODE, but that label has not been added to class_list.txt.

rexzhengzhihong, Feb 01 '23 09:02
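
If the cause is indeed a label that is missing from class_list.txt, one way to confirm it before training is to compare every "label" value in the annotation file against the class list. The sketch below is only an illustration, not part of PaddleOCR: the helper names are hypothetical, it assumes each annotation line is the image path, a tab, then the JSON list shown in the error message, and it assumes that the labels "other", "others" and "ignore" need no entry in the class list. The comparison is case-insensitive because the traceback shows the key being built as ("b-" + label).upper().

```python
import json

# Assumption: these labels are treated as background and need no class entry.
BACKGROUND_LABELS = {"OTHER", "OTHERS", "IGNORE"}


def labels_in_annotations(annotation_lines):
    """Collect every distinct "label" value used in the annotation lines."""
    found = set()
    for line in annotation_lines:
        line = line.strip()
        if not line:
            continue
        # Assumption: "<image path>\t<JSON list of boxes>" on each line.
        _, payload = line.split("\t", 1)
        for box in json.loads(payload):
            found.add(box["label"])
    return found


def missing_classes(annotation_lines, class_lines):
    """Return labels used in the annotations but absent from the class list."""
    known = {c.strip().upper() for c in class_lines if c.strip()}
    used = labels_in_annotations(annotation_lines)
    return sorted(
        label
        for label in used
        if label.upper() not in known and label.upper() not in BACKGROUND_LABELS
    )


# Two boxes taken from the annotation line quoted in the error message.
sample_annotations = [
    'invoice0.jpg\t[{"transcription": "142011471003", "label": "invoice_code", '
    '"points": [[115, 93], [197, 93], [197, 105], [115, 105]], "id": 1, '
    '"linking": [[1, 2]]}, {"transcription": "13435004", "label": "invoice_no", '
    '"points": [[117, 110], [184, 110], [184, 123], [117, 123]], "id": 2, '
    '"linking": [[2, 3]]}]'
]

# A class list that lacks invoice_code, as the reply above suspects.
sample_classes = ["invoice_no"]

print(missing_classes(sample_annotations, sample_classes))
```

On real data, the two lists would come from reading the training label file and class_list.txt line by line (for example with open(path, encoding="utf-8")). Any label the check reports would then be added to class_list.txt, one per line.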

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

github-actions[bot], Jul 08 '23 02:07