PaddleOCR
ValueError: all input arrays must have the same shape when training a SER model on the XFUND dataset
Please provide the following information to quickly locate the problem. I tried to train a SER model on multiple XFUND languages and ran into this problem, then tried a single language and got the same error.
- System Environment: Windows
- Version: Paddle 2.4.2, PaddleOCR 2.6; related components: paddlenlp 2.5.1
- Complete Error Message:
```
Exception in thread Thread-4:
Traceback (most recent call last):
  File "C:\Users\andre\anaconda3\envs\keyv\lib\threading.py", line 932, in _bootstrap_inner
    self.run()
  File "C:\Users\andre\anaconda3\envs\keyv\lib\threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\andre\anaconda3\envs\keyv\lib\site-packages\paddle\fluid\dataloader\dataloader_iter.py", line 217, in _thread_loop
    batch = self._dataset_fetcher.fetch(indices,
  File "C:\Users\andre\anaconda3\envs\keyv\lib\site-packages\paddle\fluid\dataloader\fetcher.py", line 138, in fetch
    data = self.collate_fn(data)
  File "C:\Users\andre\anaconda3\envs\keyv\lib\site-packages\paddle\fluid\dataloader\collate.py", line 77, in default_collate_fn
    return [default_collate_fn(fields) for fields in zip(*batch)]
  File "C:\Users\andre\anaconda3\envs\keyv\lib\site-packages\paddle\fluid\dataloader\collate.py", line 77, in <listcomp>
    return [default_collate_fn(fields) for fields in zip(*batch)]
  File "C:\Users\andre\anaconda3\envs\keyv\lib\site-packages\paddle\fluid\dataloader\collate.py", line 58, in default_collate_fn
    batch = np.stack(batch, axis=0)
  File "<__array_function__ internals>", line 5, in stack
  File "C:\Users\andre\anaconda3\envs\keyv\lib\site-packages\numpy\core\shape_base.py", line 427, in stack
    raise ValueError('all input arrays must have the same shape')
ValueError: all input arrays must have the same shape
```
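For context, the failure happens inside Paddle's `default_collate_fn`, which calls `np.stack` on the per-field arrays collected from each sample in the batch. `np.stack` requires every input array to have an identical shape, so any field whose length varies between samples reproduces the error (the lengths below are made up for illustration):

```python
import numpy as np

# Two samples whose token sequences have different lengths cannot be
# stacked into one batch array without padding first.
a = np.zeros((512,), dtype=np.int64)  # sample 1: 512 tokens
b = np.zeros((300,), dtype=np.int64)  # sample 2: 300 tokens

try:
    np.stack([a, b], axis=0)
except ValueError as e:
    print(e)  # all input arrays must have the same shape
```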
and this is my config.yml:
```yaml
Global:
  use_gpu: False
  epoch_num: &epoch_num 200
  log_smooth_window: 10
  print_batch_step: 10
  save_model_dir: .app/output/ser_model
  save_epoch_step: 2000
  # evaluation is run every 10 iterations after the 0th iteration
  eval_batch_step: [ 0, 19 ]
  cal_metric_during_train: False
  save_inference_dir:
  use_visualdl: False
  seed: 2022
  infer_img: app/kie/test.jpg
  # if you want to predict using the groundtruth ocr info,
  # you can use the following config
  # infer_img: train_data/XFUND/zh_val/val.json
  infer_mode: False
  save_res_path: .app/output/ser/res
  kie_rec_model_dir:
  kie_det_model_dir:

Architecture:
  model_type: kie
  algorithm: &algorithm "LayoutXLM"
  Transform:
  Backbone:
    name: LayoutXLMForSer
    pretrained: True
    checkpoints:
    # one of base or vi
    mode: vi
    num_classes: &num_classes 7

Loss:
  name: VQASerTokenLayoutLMLoss
  num_classes: *num_classes
  key: "backbone_out"

Optimizer:
  name: AdamW
  beta1: 0.9
  beta2: 0.999
  lr:
    name: Linear
    learning_rate: 0.00005
    epochs: *epoch_num
    warmup_epoch: 2
  regularizer:
    name: L2
    factor: 0.00000

PostProcess:
  name: VQASerTokenLayoutLMPostProcess
  class_path: &class_path app/kie/class_list_xfun.txt

Metric:
  name: VQASerTokenMetric
  main_indicator: hmean

Train:
  dataset:
    name: SimpleDataSet
    data_dir: app/training/train_data/train
    label_file_list:
      - app/training/train_data/train.json
    ratio_list: [ 1.0 ]
    transforms:
      - DecodeImage: # load image
          img_mode: RGB
          channel_first: False
      - VQATokenLabelEncode: # Class handling label
          contains_re: False
          algorithm: *algorithm
          class_path: *class_path
          use_textline_bbox_info: &use_textline_bbox_info True
          # one of [None, "tb-yx"]
          order_method: &order_method "tb-yx"
      - VQATokenPad:
          max_seq_len: &max_seq_len 512
          return_attention_mask: True
      - VQASerTokenChunk:
          max_seq_len: *max_seq_len
      - Resize:
          size: [ 224, 224 ]
      - NormalizeImage:
          scale: 1
          mean: [ 123.675, 116.28, 103.53 ]
          std: [ 58.395, 57.12, 57.375 ]
          order: 'hwc'
      - ToCHWImage:
      - KeepKeys:
          keep_keys: [ 'input_ids', 'bbox', 'attention_mask', 'token_type_ids', 'image', 'labels' ] # dataloader will return list in this order
  loader:
    shuffle: True
    drop_last: False
    batch_size_per_card: 3
    num_workers: 4

Eval:
  dataset:
    name: SimpleDataSet
    data_dir: app/training/train_data/val
    label_file_list:
      - app/training/train_data/val.json
    transforms:
      - DecodeImage: # load image
          img_mode: RGB
          channel_first: False
      - VQATokenLabelEncode: # Class handling label
          contains_re: False
          algorithm: *algorithm
          class_path: *class_path
          use_textline_bbox_info: *use_textline_bbox_info
          order_method: *order_method
      - VQATokenPad:
          max_seq_len: *max_seq_len
          return_attention_mask: True
      - VQASerTokenChunk:
          max_seq_len: *max_seq_len
      - Resize:
          size: [ 224, 224 ]
      - NormalizeImage:
          scale: 1
          mean: [ 123.675, 116.28, 103.53 ]
          std: [ 58.395, 57.12, 57.375 ]
          order: 'hwc'
      - ToCHWImage:
      - KeepKeys:
          keep_keys: [ 'input_ids', 'bbox', 'attention_mask', 'token_type_ids', 'image', 'labels' ] # dataloader will return list in this order
  loader:
    shuffle: False
    drop_last: False
    batch_size_per_card: 4
    num_workers: 4

profiler_options: "None"

EvalReader:
  collate_batch: false

TrainReader:
  collate_batch: false
```
Hello, have you solved this problem?
Same here
Change the eval `batch_size_per_card` to 1.
Is there a way that doesn't require setting it to 1? Batch size 1 is too slow.
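One possible workaround, sketched here as an untested assumption rather than a confirmed fix: replace the default collation with a collate function that pads variable-length fields to the longest sequence in the batch before stacking, so a batch size above 1 can still be collated. `pad_collate_fn` below is a hypothetical helper, not part of PaddleOCR, and zero-padding may not be semantically correct for the `labels` field if the loss expects a dedicated ignore index:

```python
import numpy as np

def pad_collate_fn(batch):
    """Hypothetical collate function: `batch` is a list of samples, each a
    sequence of numpy arrays in KeepKeys order. Fields whose first
    dimension differs across samples are zero-padded along axis 0 to the
    batch maximum, then stacked into one array per field."""
    collated = []
    for fields in zip(*batch):
        arrs = [np.asarray(f) for f in fields]
        lengths = {a.shape[0] for a in arrs}
        if len(lengths) > 1:
            max_len = max(lengths)
            # Pad only along axis 0; trailing dims (e.g. the 4 bbox coords)
            # are assumed to already match across samples.
            arrs = [
                np.pad(a, [(0, max_len - a.shape[0])] + [(0, 0)] * (a.ndim - 1))
                for a in arrs
            ]
        collated.append(np.stack(arrs, axis=0))
    return collated
```

Paddle's `DataLoader` accepts a `collate_fn` argument, so something like this could be wired in there; whether PaddleOCR's training loop exposes that hook without code changes is something to verify against your version.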