
ValueError: all input arrays must have the same shape When Training SER model using XFUNDS dataset

Open andreaIskanderBelkhir opened this issue 2 years ago • 3 comments

Please provide the following information to quickly locate the problem.

I tried to train a SER model on multiple languages from XFUND and ran into this problem, so I tried a single language and got the same error.

  • System Environment: Windows

  • Version: Paddle 2.4.2, PaddleOCR 2.6

  • Related components: paddlenlp 2.5.1

  • Complete Error Message:

    Exception in thread Thread-4:
    Traceback (most recent call last):
      File "C:\Users\andre\anaconda3\envs\keyv\lib\threading.py", line 932, in _bootstrap_inner
        self.run()
      File "C:\Users\andre\anaconda3\envs\keyv\lib\threading.py", line 870, in run
        self._target(*self._args, **self._kwargs)
      File "C:\Users\andre\anaconda3\envs\keyv\lib\site-packages\paddle\fluid\dataloader\dataloader_iter.py", line 217, in _thread_loop
        batch = self._dataset_fetcher.fetch(indices,
      File "C:\Users\andre\anaconda3\envs\keyv\lib\site-packages\paddle\fluid\dataloader\fetcher.py", line 138, in fetch
        data = self.collate_fn(data)
      File "C:\Users\andre\anaconda3\envs\keyv\lib\site-packages\paddle\fluid\dataloader\collate.py", line 77, in default_collate_fn
        return [default_collate_fn(fields) for fields in zip(*batch)]
      File "C:\Users\andre\anaconda3\envs\keyv\lib\site-packages\paddle\fluid\dataloader\collate.py", line 77, in <listcomp>
        return [default_collate_fn(fields) for fields in zip(*batch)]
      File "C:\Users\andre\anaconda3\envs\keyv\lib\site-packages\paddle\fluid\dataloader\collate.py", line 58, in default_collate_fn
        batch = np.stack(batch, axis=0)
      File "<__array_function__ internals>", line 5, in stack
      File "C:\Users\andre\anaconda3\envs\keyv\lib\site-packages\numpy\core\shape_base.py", line 427, in stack
        raise ValueError('all input arrays must have the same shape')
    ValueError: all input arrays must have the same shape
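The traceback ends in `np.stack(batch, axis=0)` inside the default collate function, which means at least one field had a different shape in different samples of the same batch. The following is a minimal sketch of that mechanism, not a reproduction of the actual dataset: the sample contents, the lengths 512 and 498, and the helper `find_shape_mismatches` are all hypothetical and chosen only to trigger the same error and to show one way to find out which field is inconsistent.

```python
import numpy as np


def find_shape_mismatches(batch, keys):
    """Return {key: sorted distinct shapes} for fields whose shape differs across samples.

    `batch` is a list of samples; each sample is a list of arrays in the
    same order as `keys` (hypothetical helper, not part of PaddleOCR).
    """
    mismatches = {}
    for idx, key in enumerate(keys):
        shapes = {np.asarray(sample[idx]).shape for sample in batch}
        if len(shapes) > 1:
            mismatches[key] = sorted(shapes)
    return mismatches


# Hypothetical samples: the second one has a shorter "input_ids" field.
keys = ["input_ids", "image"]
sample_a = [np.zeros(512, dtype=np.int64), np.zeros((3, 224, 224), dtype=np.float32)]
sample_b = [np.zeros(498, dtype=np.int64), np.zeros((3, 224, 224), dtype=np.float32)]
batch = [sample_a, sample_b]

# Stacking the mismatched field fails the same way the traceback does.
try:
    np.stack([sample[0] for sample in batch], axis=0)
    stack_error = None
except ValueError as exc:
    stack_error = str(exc)

mismatches = find_shape_mismatches(batch, keys)
print(stack_error)
print(mismatches)
```

Running a check like this over a few samples drawn directly from the dataset, before they reach the dataloader, should show which of the `keep_keys` fields is responsible.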

and this is my config.yml:

Global:
  use_gpu: False
  epoch_num: &epoch_num 200
  log_smooth_window: 10
  print_batch_step: 10
  save_model_dir: .app/output/ser_model
  save_epoch_step: 2000
  #evaluation is run every 10 iterations after the 0th iteration
  eval_batch_step: [ 0, 19 ]
  cal_metric_during_train: False
  save_inference_dir:
  use_visualdl: False
  seed: 2022
  infer_img: app/kie/test.jpg
  #if you want to predict using the groundtruth ocr info,
  #you can use the following config
  #infer_img: train_data/XFUND/zh_val/val.json
  infer_mode: False

  save_res_path: .app/output/ser/res
  kie_rec_model_dir:
  kie_det_model_dir:

Architecture:
  model_type: kie
  algorithm: &algorithm "LayoutXLM"
  Transform:
  Backbone:
    name: LayoutXLMForSer
    pretrained: True
    checkpoints:
    #one of base or vi
    mode: vi
    num_classes: &num_classes 7

Loss:
  name: VQASerTokenLayoutLMLoss
  num_classes: *num_classes
  key: "backbone_out"

Optimizer:
  name: AdamW
  beta1: 0.9
  beta2: 0.999
  lr:
    name: Linear
    learning_rate: 0.00005
    epochs: *epoch_num
    warmup_epoch: 2
  regularizer:
    name: L2
    factor: 0.00000

PostProcess:
  name: VQASerTokenLayoutLMPostProcess
  class_path: &class_path app/kie/class_list_xfun.txt

Metric:
  name: VQASerTokenMetric
  main_indicator: hmean

Train:
  dataset:
    name: SimpleDataSet
    data_dir: app/training/train_data/train
    label_file_list:
      - app/training/train_data/train.json
    ratio_list: [ 1.0 ]
    transforms:
      - DecodeImage: # load image
          img_mode: RGB
          channel_first: False
      - VQATokenLabelEncode: # Class handling label
          contains_re: False
          algorithm: *algorithm
          class_path: *class_path
          use_textline_bbox_info: &use_textline_bbox_info True
          #one of [None, "tb-yx"]
          order_method: &order_method "tb-yx"
      - VQATokenPad:
          max_seq_len: &max_seq_len 512
          return_attention_mask: True
      - VQASerTokenChunk:
          max_seq_len: *max_seq_len
      - Resize:
          size: [224,224]
      - NormalizeImage:
          scale: 1
          mean: [ 123.675, 116.28, 103.53 ]
          std: [ 58.395, 57.12, 57.375 ]
          order: 'hwc'
      - ToCHWImage:
      - KeepKeys:
          keep_keys: [ 'input_ids', 'bbox', 'attention_mask', 'token_type_ids', 'image', 'labels'] # dataloader will return list in this order
  loader:
    shuffle: True
    drop_last: False
    batch_size_per_card: 3
    num_workers: 4

Eval:
  dataset:
    name: SimpleDataSet
    data_dir: app/training/train_data/val
    label_file_list:
      - app/training/train_data/val.json
    transforms:
      - DecodeImage: #load image
          img_mode: RGB
          channel_first: False
      - VQATokenLabelEncode: #Class handling label
          contains_re: False
          algorithm: *algorithm
          class_path: *class_path
          use_textline_bbox_info: *use_textline_bbox_info
          order_method: *order_method
      - VQATokenPad:
          max_seq_len: *max_seq_len
          return_attention_mask: True
      - VQASerTokenChunk:
          max_seq_len: *max_seq_len
      - Resize:
          size: [224,224]
      - NormalizeImage:
          scale: 1
          mean: [ 123.675, 116.28, 103.53 ]
          std: [ 58.395, 57.12, 57.375 ]
          order: 'hwc'
      - ToCHWImage:
      - KeepKeys:
          keep_keys: [ 'input_ids', 'bbox', 'attention_mask', 'token_type_ids', 'image', 'labels'] # dataloader will return list in this order
  loader:
    shuffle: False
    drop_last: False
    batch_size_per_card: 4
    num_workers: 4

profiler_options: "None"

EvalReader:
  collate_batch: false

TrainReader:
  collate_batch: false

andreaIskanderBelkhir • Nov 06 '23 11:11

Hello, has your problem been solved?

zhengmeng • May 16 '24 05:05

Same here

danielcmm • May 16 '24 11:05

Change batch_size_per_card for validation (Eval) to 1.

hortionelson-1805 • Jun 28 '24 02:06
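A batch size of 1 sidesteps the error because stacking a single array can never hit a shape mismatch; it does not make the samples themselves consistent. The sketch below illustrates this with plain NumPy. The array lengths and the helper `can_stack` are hypothetical and only stand in for one field of a sample.

```python
import numpy as np


def can_stack(arrays):
    """Return True if np.stack accepts the arrays, False on a shape mismatch."""
    try:
        np.stack(arrays, axis=0)
        return True
    except ValueError:
        return False


# Hypothetical field values with different lengths.
short_field = np.zeros(498, dtype=np.int64)
full_field = np.zeros(512, dtype=np.int64)

single_ok = can_stack([short_field])            # batch of one sample
pair_ok = can_stack([short_field, full_field])  # batch of two samples
print(single_ok, pair_ok)
```

The trade-off is throughput: every sample becomes its own batch, so evaluation takes more steps.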

"Change batch_size_per_card for validation (Eval) to 1."

Is there a way to do this without setting it to 1? Setting it to 1 is too slow.

jesszgc • Oct 31 '24 02:10
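One general way to keep a larger batch size when samples differ in shape is to pad each field to the largest shape in the batch before stacking. The following is a rough sketch of that idea in plain NumPy, not a confirmed fix for this issue. The functions `pad_and_stack` and `padding_collate` are hypothetical, the pad values used in the example are placeholders (the right value for each field depends on the tokenizer and the loss), and the thread does not say whether or how a custom collate function can be wired into the PaddleOCR loader.

```python
import numpy as np


def pad_and_stack(arrays, pad_value=0):
    """Pad arrays of equal rank to their common maximum shape, then stack them."""
    arrays = [np.asarray(a) for a in arrays]
    ndim = arrays[0].ndim
    if any(a.ndim != ndim for a in arrays):
        raise ValueError("arrays must have the same number of dimensions")
    # Largest extent along each axis across the batch.
    target = tuple(max(a.shape[d] for a in arrays) for d in range(ndim))
    out = np.full((len(arrays),) + target, pad_value, dtype=arrays[0].dtype)
    for i, a in enumerate(arrays):
        # Copy each array into the top-left corner of its padded slot.
        out[(i,) + tuple(slice(0, n) for n in a.shape)] = a
    return out


def padding_collate(batch, pad_values):
    """Collate a list of samples field by field, one pad value per field."""
    return [pad_and_stack(fields, pv) for fields, pv in zip(zip(*batch), pad_values)]


# Hypothetical two-field samples: token ids and labels of different lengths.
sample_a = [np.array([5, 6, 7]), np.array([1, 2, 3])]
sample_b = [np.array([8, 9]), np.array([4, 5])]
ids, labels = padding_collate([sample_a, sample_b], pad_values=[0, -1])
print(ids)
print(labels)
```

Padding only hides the mismatch from the stacking step. If the fields are expected to have a fixed length after `VQATokenPad`, it is still worth finding out which samples deviate and why.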