PaddleOCR
KIE VAT invoice information extraction: when training the RE task model, precision and recall stay at 0 until training ends
- System Environment: Ubuntu 18.04, Python 3.7
- Version: Paddle: PaddleOCR: Related components: paddlepaddle=2.4.0rc0, paddlenlp=2.4.1
- Command Code: python3 -m paddle.distributed.launch --gpus '0,1,2,3' tools/train.py -c configs/kie/vi_layoutxlm/re_vi_layoutxlm_xfund_zh.yml
- Training data: [{"transcription": "广东增值税专用发票", "label": "question", "points": [[1648, 479], [2747, 469], [2749, 638], [1649, 648]], "id": 0, "linking": [[0, 24]]}, {"transcription": "2016年06月12日", "label": "answer", "points": [[3485, 789], [3954, 789], [3954, 851], [3485, 851]], "id": 1, "linking": [[1, 35]]}, {"transcription": "深圳市购机汇网络有限公司", "label": "answer", "points": [[1026, 962], [1766, 962], [1766, 1024], [1026, 1024]], "id": 2, "linking": [[2, 25]]}, {"transcription": "纳税人识别号:", "label": "question", "points": [[526, 1037], [947, 1037], [947, 1112], [526, 1112]], "id": 3, "linking": [[3, 4]]},...... train的副本.txt
- Complete Error Message: [2022/10/28 02:22:03] ppocr INFO: best metric, hmean: 0, precision: 0, recall: 0, fps: 11.429711280367338, best_epoch: 2000
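For reference, the three numbers in that log line relate to each other as sketched below. This assumes relations are scored as exact pairs of entity ids, which is an assumption made for illustration and not necessarily how PaddleOCR's RE metric is implemented; `relation_scores` is a hypothetical helper.

```python
def relation_scores(predicted, expected):
    """Return (precision, recall, hmean) for two collections of relation pairs."""
    pred = set(map(tuple, predicted))
    gold = set(map(tuple, expected))
    hits = len(pred & gold)  # pairs that appear in both sets
    precision = hits / len(pred) if pred else 0.0
    recall = hits / len(gold) if gold else 0.0
    total = precision + recall
    # hmean is the harmonic mean of precision and recall (guarded against 0/0)
    hmean = 2 * precision * recall / total if total else 0.0
    return precision, recall, hmean
```

Under this definition, all three values are 0 whenever no predicted pair matches a labeled pair, so a constant 0 suggests that either no relations are being predicted or none of them line up with the labels.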
Which version of paddlenlp did you update to? It needs to be 2.4.1 or later.
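A quick way to compare an installed version string against that minimum is sketched below. `version_tuple` and `meets_minimum` are hypothetical helpers; the comparison looks only at the leading numeric components, so pre-release suffixes such as `rc0` are ignored.

```python
import re


def version_tuple(version):
    """Leading numeric components of a version string: '2.4.0rc0' -> (2, 4, 0)."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if not match:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum="2.4.1"):
    """True if `installed` is at least `minimum`, ignoring pre-release suffixes."""
    return version_tuple(installed) >= version_tuple(minimum)
```

In an environment where paddlenlp is installed, the check would be `meets_minimum(paddlenlp.__version__)`.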
After switching to the latest versions of the libraries and the code, the accuracy now improves as expected.
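Hello, I ran into the same problem. My earlier training runs were fine, but my environment has since gotten mixed up and I can't tell which package is at fault. Could you describe your environment setup in detail?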
- Config file: re_vi_layoutxlm_xfund_zh的副本.txt
A question: how was "linking": [[1, 35]] in the training data annotated? Could you share how you did it?
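Not an answer from the maintainers, but going only by the sample in this issue, each entity carries an integer `id`, and `linking` holds pairs of ids such as `[3, 4]` on the entity with id 3. A small sketch for sanity-checking hand-made labels follows. It assumes that every pair should join exactly one `question` and one `answer` entity and that both ids exist in the same image; `check_links` is a hypothetical helper, not part of PaddleOCR.

```python
def check_links(entities):
    """Return a list of human-readable problems found in the `linking` fields."""
    labels = {e["id"]: e["label"] for e in entities}
    problems = []
    for entity in entities:
        for link in entity.get("linking", []):
            if len(link) != 2:
                problems.append(f"id {entity['id']}: malformed link {link}")
                continue
            a, b = link
            missing = [i for i in (a, b) if i not in labels]
            if missing:
                problems.append(
                    f"id {entity['id']}: link {link} references unknown ids {missing}")
            elif {labels[a], labels[b]} != {"question", "answer"}:
                problems.append(
                    f"id {entity['id']}: link {link} does not join a question and an answer")
    return problems
```

Run on the truncated sample as posted, this would report ids 24, 35 and 25 as unknown, simply because those entities are elided from the excerpt.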
I'm hitting the same error: ValueError: (InvalidArgument) The shape of tensor assigned value must match the shape of target shape: [512, 3], but now shape is [513, 3]. (at /paddle/paddle/phi/kernels/impl/set_value_kernel_impl.h:68) How can this be solved?
What was the cause? Has it been solved? Is it a problem with how paddlenlp converts the data?
I just updated to the latest PaddleOCR-release-2.6 code and the training accuracy now rises normally, but prediction with both the dynamic-graph model and the inference model fails with:
Traceback (most recent call last):
  File "./tools/infer_kie_token_ser_re.py", line 217, in <module>
    result = ser_re_engine(data)
  File "./tools/infer_kie_token_ser_re.py", line 151, in __call__
    preds = self.model(re_input)
  File "/usr/local/python3.7.0/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py", line 948, in __call__
    return self.forward(*inputs, **kwargs)
  File "/home/sxct-ocr/api/PaddleOCR-release-2.6-new/ppocr/modeling/architectures/base_model.py", line 86, in forward
    x = self.backbone(x)
  File "/usr/local/python3.7.0/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py", line 948, in __call__
    return self.forward(*inputs, **kwargs)
  File "/home/sxct-ocr/api/PaddleOCR-release-2.6-new/ppocr/modeling/backbones/vqa_layoutlm.py", line 237, in forward
    relations=relations)
  File "/usr/local/python3.7.0/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py", line 948, in __call__
    return self.forward(*inputs, **kwargs)
  File "/usr/local/python3.7.0/lib/python3.7/site-packages/paddlenlp/transformers/layoutxlm/modeling.py", line 1559, in forward
    relations)
  File "/usr/local/python3.7.0/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py", line 948, in __call__
    return self.forward(*inputs, **kwargs)
  File "/usr/local/python3.7.0/lib/python3.7/site-packages/paddlenlp/transformers/layoutxlm/modeling.py", line 1425, in forward
    relations, entities = self.build_relation(relations, entities)
  File "/usr/local/python3.7.0/lib/python3.7/site-packages/paddlenlp/transformers/layoutxlm/modeling.py", line 1346, in build_relation
    entities[b] = entitie_new
  File "/usr/local/python3.7.0/lib/python3.7/site-packages/paddle/fluid/dygraph/varbase_patch_methods.py", line 797, in __setitem__
    return self.__setitem_eager_tensor__(item, value)
ValueError: (InvalidArgument) The shape of tensor assigned value must match the shape of target shape: [512, 3], but now shape is [513, 3]. (at /paddle/paddle/phi/kernels/impl/set_value_kernel_impl.h:68) [operator < set_value > error]
What is causing this?
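Judging only from the message, a value with 513 rows is being written into a slot that holds 512 rows of 3 columns. The sketch below is not the paddlenlp fix and does not identify the root cause; it only illustrates that class of mismatch with plain Python lists, plus one defensive option of clipping or padding to a fixed row count. `fit_rows` and the `(0, 0, 0)` pad row are hypothetical.

```python
def fit_rows(rows, max_rows=512, pad_row=(0, 0, 0)):
    """Return exactly `max_rows` rows: extra rows are dropped, missing rows padded."""
    fitted = [list(r) for r in rows[:max_rows]]
    # range() with a non-positive argument is empty, so no padding is added
    # when the input already has max_rows rows or more.
    fitted.extend(list(pad_row) for _ in range(max_rows - len(fitted)))
    return fitted
```

Clipping silently discards the extra rows, so it hides the mismatch rather than explaining where the extra row comes from.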
Hi, have you run into KIE extracting only part of the information?
I have. RE and SER only predict on the first 512-token chunk; everything after it is discarded.
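If the input really is limited to 512 tokens, one generic way to avoid dropping the tail is to split the token sequence into windows and run each window separately. A minimal sketch follows, assuming a flat list of token ids and a hypothetical `chunk_tokens` helper; it does not reflect how PaddleOCR prepares SER/RE inputs, and relations that span two windows would still be missed.

```python
def chunk_tokens(token_ids, max_len=512, stride=None):
    """Split `token_ids` into windows of at most `max_len`, advancing by `stride`."""
    stride = stride or max_len  # default: non-overlapping windows
    if not 0 < stride <= max_len:
        raise ValueError("stride must be in (0, max_len]")
    chunks = []
    for start in range(0, len(token_ids), stride):
        chunks.append(token_ids[start:start + max_len])
        if start + max_len >= len(token_ids):
            break  # this window already reaches the end of the sequence
    return chunks
```

A `stride` smaller than `max_len` gives overlapping windows, which trades extra computation for a better chance that a question and its answer land in the same window.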