PaddleOCR: key information extraction max_seq

Key information extraction: max_seq_len parameter setting

Open azhaoid opened this issue 1 year ago • 1 comments

Is the maximum max_seq_len for key information extraction really only 514? Can it not be exceeded?

Command run: python tools/train.py -c configs/kie/vi_layoutxlm/ser_vi_layoutxlm_xfund_zh.yml

Error message: ValueError: (InvalidArgument) The value (514) of the non-singleton dimension does not match the corresponding value (1024) in shape for expand_v2 op. [Hint: Expected vec_in_dims[i] == expand_shape[i], but received vec_in_dims[i]:514 != expand_shape[i]:1024.] (at /paddle/paddle/phi/kernels/impl/expand_kernel_impl.h:61)

Config file: configs/kie/vi_layoutxlm/ser_vi_layoutxlm_xfund_zh.yml

Train:
  dataset:
    name: SimpleDataSet
    data_dir:
    label_file_list:
      -
    ratio_list: [ 1.0 ]
    transforms:
      - DecodeImage: # load image
          img_mode: RGB
          channel_first: False
      - VQATokenLabelEncode: # Class handling label
          contains_re: False
          algorithm: *algorithm
          class_path: *class_path
          use_textline_bbox_info: &use_textline_bbox_info True
          # one of [None, "tb-yx"]
          order_method: &order_method "tb-yx"
      - VQATokenPad:
          max_seq_len: &max_seq_len 1024
          return_attention_mask: True
      - VQASerTokenChunk:
          max_seq_len: *max_seq_len
      - Resize:
          size: [224,224]
      - NormalizeImage:
          scale: 1
          mean: [ 123.675, 116.28, 103.53 ]
          std: [ 58.395, 57.12, 57.375 ]
          order: 'hwc'
      - ToCHWImage:
      - KeepKeys:
          keep_keys: [ 'input_ids', 'bbox', 'attention_mask', 'token_type_ids', 'image', 'labels'] # dataloader will return list in this order
  loader:
    shuffle: True
    drop_last: False
    batch_size_per_card: 1
    num_workers: 8

May 10 '23 01:05 azhaoid

You can take a look at the code in site-packages/paddlenlp/transformers/layoutxlm/modeling.py
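
For readers following the pointer above, here is a minimal pure-Python sketch of why the shapes in the error message cannot be reconciled. It is not the PaddleNLP code: it only assumes (this is not confirmed in the thread) that the 514 in the error is the length of the pretrained model's position-id table, and that the model slices that table to the sequence length before expanding it to the input shape. The names `POSITION_TABLE_LEN`, `sliced_position_ids_shape`, and `can_expand` are hypothetical helpers for illustration.

```python
# Assumption: 514 (taken from the error message) is the size of the
# pretrained position table; this is not confirmed in the thread.
POSITION_TABLE_LEN = 514


def can_expand(src_shape, target_shape):
    """Return True if src_shape can be expanded to target_shape.

    Mirrors the rule quoted in the error message: every source dimension
    must either be 1 (singleton) or equal the corresponding target dimension.
    """
    if len(src_shape) > len(target_shape):
        return False
    # Left-pad the source shape with 1s so both shapes have the same rank.
    padded = (1,) * (len(target_shape) - len(src_shape)) + tuple(src_shape)
    return all(s == 1 or s == t for s, t in zip(padded, target_shape))


def sliced_position_ids_shape(seq_len):
    """Shape of a [1, 514] table after slicing it with [:, :seq_len].

    Slicing can shorten the table but can never make it longer than 514.
    """
    return (1, min(seq_len, POSITION_TABLE_LEN))


if __name__ == "__main__":
    batch_size = 1  # matches batch_size_per_card in the config above
    for seq_len in (512, 514, 1024):
        src = sliced_position_ids_shape(seq_len)
        ok = can_expand(src, (batch_size, seq_len))
        print(f"max_seq_len={seq_len}: {src} -> {(batch_size, seq_len)} expandable: {ok}")
```

Under these assumptions, any max_seq_len above 514 leaves a non-singleton dimension of 514 that cannot be expanded to the requested length, which is consistent with the reported ValueError. Whether a longer sequence is achievable (for example by changing the model's position table) would need to be checked against the modeling.py file mentioned above.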

Apr 24 '24 02:04 tran601

This issue has not been updated for a long time, so it is being closed for now. It can be reopened if needed.

May 10 '24 02:05 UserWangZz
