PaddleOCR
Error when training the SER model in KIE
Environment: Windows 11, paddlenlp 2.3.1, paddleocr 2.5.0.3, paddlepaddle-gpu 0.0.0.post110
ser_vi_layoutxlm_xfund_zh.yml:

    Architecture:
      model_type: kie
      algorithm: &algorithm "LayoutXLM"
      Transform:
      Backbone:
        name: LayoutXLMForSer
        pretrained: true
        checkpoints:
        # one of base or vi
        mode: vi
        num_classes: &num_classes 7
Because downloading the model over the network failed, I hard-coded a local model path in paddlenlp.transformers.model_utils (the pretrained model provided by PaddleOCR):
'./inference/ser_vi_layoutxlm_xfund_pretrained/best_accuracy/model_state.pdparams'
Traceback (most recent call last):
File "tools/train.py", line 201, in
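I am not sure whether the following modifications are correct, but training runs normally after applying them.

1. Change to the model network output layer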
In ppocr.modeling.backbones.vqa_layoutlm, class LayoutLMv2ForSer(NLPBaseModel):

    if self.training:
        res = {"backbone_out": x[0]}
        # res.update(x[1])  # commented out
        return res
    else:
        return x
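To make the effect of that change easier to see, here is a minimal sketch of the same branching logic in plain Python. ToySerHead and the sample tuple are hypothetical stand-ins, not PaddleOCR code; the sketch assumes that x[0] is the main backbone output and x[1] holds extra outputs that the original line merged into the result.

```python
class ToySerHead:
    """Hypothetical stand-in for the forward logic (not PaddleOCR code)."""

    def __init__(self, training=True):
        self.training = training

    def forward(self, x):
        # x mimics the backbone result: x[0] is the main output and
        # x[1] is assumed to hold extra outputs (an assumption for this sketch).
        if self.training:
            res = {"backbone_out": x[0]}
            # res.update(x[1])  # disabled, mirroring the modification above
            return res
        # In evaluation mode the raw backbone result is passed through unchanged.
        return x


backbone_result = ("logits", {"hidden_states": "extra"})
train_out = ToySerHead(training=True).forward(backbone_result)
eval_out = ToySerHead(training=False).forward(backbone_result)
```

With the update line disabled, the training branch returns only the "backbone_out" entry, so whatever extra outputs x[1] carried no longer reach the rest of the training pipeline.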
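2. Model loading

In paddlenlp.transformers.model_utils, set the load path directly to path = './inference/model_state.pdparams', with the model downloaded in advance.

    pretrained_resource_files_map = {
        "model_state": {
            "layoutxlm-base-uncased": "https://bj.bcebos.com/paddlenlp/models/transformers/layoutxlm_base/model_state.pdparams",
        }
    }

When the model above is downloaded on Windows, the last '/' in the URL becomes '' and the download fails. Pointing to a local model file also solves the problem of having no network access.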
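One plausible cause of the broken URL, offered only as an assumption since the thread does not show the code that builds it, is joining URL parts with Windows path rules. The sketch below uses the standard-library ntpath and posixpath modules to show the difference on any platform. pick_weights is a hypothetical helper (not part of paddlenlp) that prefers a local file when it exists.

```python
import ntpath
import os
import posixpath

BASE_URL = "https://bj.bcebos.com/paddlenlp/models/transformers/layoutxlm_base"
FILE_NAME = "model_state.pdparams"

# ntpath implements Windows path rules: the joining separator is a backslash,
# which is not valid as the last separator of this URL.
windows_style = ntpath.join(BASE_URL, FILE_NAME)

# posixpath always joins with a forward slash, which keeps the URL intact.
posix_style = posixpath.join(BASE_URL, FILE_NAME)


def pick_weights(local_path, url):
    """Hypothetical helper: use the local weights file if present, else the URL."""
    return local_path if os.path.isfile(local_path) else url


source = pick_weights("./inference/model_state.pdparams", posix_style)
```

If this is indeed the mechanism, building the URL with posixpath.join (or plain string concatenation) avoids the problem, and a pre-downloaded local file bypasses the download entirely.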
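Question: loading the pretrained model provided in ser_vi_layoutxlm_xfund_pretrained raises an error. Resuming training through checkpoints does not work either, and **_udml is not supported yet. I don't know whether related improvements are planned.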
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.