PaddleOCR: Can't train KIE

Whenever I try to train the KIE model, I get the error below when running on GPU (CUDA 10.2). When running on CPU, it instead throws "target 41 is out of upper bound". Here is the command I ran and the resulting output:

(paddleocr) PS C:\Work\CENATAV\Libraries\PaddleOCR> python tools/train.py -c configs/kie/vi_layoutxlm/ser_vi_layoutxlm_xfund_zh_udml.yml
[2022/10/20 22:22:45] ppocr INFO: Architecture :
[2022/10/20 22:22:45] ppocr INFO: Models :
[2022/10/20 22:22:45] ppocr INFO: Student :
[2022/10/20 22:22:45] ppocr INFO: Backbone :
[2022/10/20 22:22:45] ppocr INFO: checkpoints : None
[2022/10/20 22:22:45] ppocr INFO: mode : vi
[2022/10/20 22:22:45] ppocr INFO: name : LayoutXLMForSer
[2022/10/20 22:22:45] ppocr INFO: num_classes : 23
[2022/10/20 22:22:45] ppocr INFO: pretrained : True
[2022/10/20 22:22:45] ppocr INFO: Transform : None
[2022/10/20 22:22:45] ppocr INFO: algorithm : LayoutXLM
[2022/10/20 22:22:45] ppocr INFO: freeze_params : False
[2022/10/20 22:22:45] ppocr INFO: model_type : kie
[2022/10/20 22:22:45] ppocr INFO: pretrained : None
[2022/10/20 22:22:45] ppocr INFO: return_all_feats : True
[2022/10/20 22:22:45] ppocr INFO: Teacher :
[2022/10/20 22:22:45] ppocr INFO: Backbone :
[2022/10/20 22:22:45] ppocr INFO: checkpoints : None
[2022/10/20 22:22:45] ppocr INFO: mode : vi
[2022/10/20 22:22:45] ppocr INFO: name : LayoutXLMForSer
[2022/10/20 22:22:45] ppocr INFO: num_classes : 23
[2022/10/20 22:22:45] ppocr INFO: pretrained : True
[2022/10/20 22:22:45] ppocr INFO: Transform : None
[2022/10/20 22:22:45] ppocr INFO: algorithm : LayoutXLM
[2022/10/20 22:22:45] ppocr INFO: freeze_params : False
[2022/10/20 22:22:45] ppocr INFO: model_type : kie
[2022/10/20 22:22:45] ppocr INFO: pretrained : None
[2022/10/20 22:22:45] ppocr INFO: return_all_feats : True
[2022/10/20 22:22:45] ppocr INFO: algorithm : Distillation
[2022/10/20 22:22:45] ppocr INFO: model_type : kie
[2022/10/20 22:22:45] ppocr INFO: name : DistillationModel
[2022/10/20 22:22:45] ppocr INFO: Eval :
[2022/10/20 22:22:45] ppocr INFO: dataset :
[2022/10/20 22:22:45] ppocr INFO: data_dir : train_data/det/val
[2022/10/20 22:22:45] ppocr INFO: label_file_list : ['train_data/det/val_kie.txt']
[2022/10/20 22:22:45] ppocr INFO: name : SimpleDataSet
[2022/10/20 22:22:45] ppocr INFO: transforms :
[2022/10/20 22:22:45] ppocr INFO: DecodeImage :
[2022/10/20 22:22:45] ppocr INFO: channel_first : False
[2022/10/20 22:22:45] ppocr INFO: img_mode : RGB
[2022/10/20 22:22:45] ppocr INFO: VQATokenLabelEncode :
[2022/10/20 22:22:45] ppocr INFO: algorithm : LayoutXLM
[2022/10/20 22:22:45] ppocr INFO: class_path : train_data/det/kie_dict.txt
[2022/10/20 22:22:45] ppocr INFO: contains_re : False
[2022/10/20 22:22:45] ppocr INFO: order_method : tb-yx
[2022/10/20 22:22:45] ppocr INFO: VQATokenPad :
[2022/10/20 22:22:45] ppocr INFO: max_seq_len : 512
[2022/10/20 22:22:45] ppocr INFO: return_attention_mask : True
[2022/10/20 22:22:45] ppocr INFO: VQASerTokenChunk :
[2022/10/20 22:22:45] ppocr INFO: max_seq_len : 512
[2022/10/20 22:22:45] ppocr INFO: Resize :
[2022/10/20 22:22:45] ppocr INFO: size : [224, 224]
[2022/10/20 22:22:45] ppocr INFO: NormalizeImage :
[2022/10/20 22:22:45] ppocr INFO: mean : [123.675, 116.28, 103.53]
[2022/10/20 22:22:45] ppocr INFO: order : hwc
[2022/10/20 22:22:45] ppocr INFO: scale : 1
[2022/10/20 22:22:45] ppocr INFO: std : [58.395, 57.12, 57.375]
[2022/10/20 22:22:45] ppocr INFO: ToCHWImage : None
[2022/10/20 22:22:45] ppocr INFO: KeepKeys :
[2022/10/20 22:22:45] ppocr INFO: keep_keys : ['input_ids', 'bbox', 'attention_mask', 'token_type_ids', 'image', 'labels']
[2022/10/20 22:22:45] ppocr INFO: loader :
[2022/10/20 22:22:45] ppocr INFO: batch_size_per_card : 1
[2022/10/20 22:22:45] ppocr INFO: drop_last : False
[2022/10/20 22:22:45] ppocr INFO: num_workers : 2
[2022/10/20 22:22:45] ppocr INFO: shuffle : False
[2022/10/20 22:22:45] ppocr INFO: Global :
[2022/10/20 22:22:45] ppocr INFO: cal_metric_during_train : False
[2022/10/20 22:22:45] ppocr INFO: distributed : False
[2022/10/20 22:22:45] ppocr INFO: epoch_num : 200
[2022/10/20 22:22:45] ppocr INFO: eval_batch_step : [0, 19]
[2022/10/20 22:22:45] ppocr INFO: infer_img : ppstructure/docs/kie/input/zh_val_42.jpg
[2022/10/20 22:22:45] ppocr INFO: log_smooth_window : 10
[2022/10/20 22:22:45] ppocr INFO: print_batch_step : 10
[2022/10/20 22:22:45] ppocr INFO: save_epoch_step : 2000
[2022/10/20 22:22:45] ppocr INFO: save_inference_dir : None
[2022/10/20 22:22:45] ppocr INFO: save_model_dir : ./output/ser_vi_layoutxlm_xfund_zh_udml
[2022/10/20 22:22:45] ppocr INFO: save_res_path : ./output/ser_layoutxlm_xfund_zh/res
[2022/10/20 22:22:45] ppocr INFO: seed : 2022
[2022/10/20 22:22:45] ppocr INFO: use_gpu : True
[2022/10/20 22:22:45] ppocr INFO: use_visualdl : True
[2022/10/20 22:22:45] ppocr INFO: Loss :
[2022/10/20 22:22:45] ppocr INFO: loss_config_list :
[2022/10/20 22:22:45] ppocr INFO: DistillationVQASerTokenLayoutLMLoss :
[2022/10/20 22:22:45] ppocr INFO: key : backbone_out
[2022/10/20 22:22:45] ppocr INFO: model_name_list : ['Student', 'Teacher']
[2022/10/20 22:22:45] ppocr INFO: num_classes : 23
[2022/10/20 22:22:45] ppocr INFO: weight : 1.0
[2022/10/20 22:22:45] ppocr INFO: DistillationSERDMLLoss :
[2022/10/20 22:22:45] ppocr INFO: act : softmax
[2022/10/20 22:22:45] ppocr INFO: key : backbone_out
[2022/10/20 22:22:45] ppocr INFO: model_name_pairs : [['Student', 'Teacher']]
[2022/10/20 22:22:45] ppocr INFO: use_log : True
[2022/10/20 22:22:45] ppocr INFO: weight : 1.0
[2022/10/20 22:22:45] ppocr INFO: DistillationVQADistanceLoss :
[2022/10/20 22:22:45] ppocr INFO: key : hidden_states_5
[2022/10/20 22:22:45] ppocr INFO: mode : l2
[2022/10/20 22:22:45] ppocr INFO: model_name_pairs : [['Student', 'Teacher']]
[2022/10/20 22:22:45] ppocr INFO: name : loss_5
[2022/10/20 22:22:45] ppocr INFO: weight : 0.5
[2022/10/20 22:22:45] ppocr INFO: DistillationVQADistanceLoss :
[2022/10/20 22:22:45] ppocr INFO: key : hidden_states_8
[2022/10/20 22:22:45] ppocr INFO: mode : l2
[2022/10/20 22:22:45] ppocr INFO: model_name_pairs : [['Student', 'Teacher']]
[2022/10/20 22:22:45] ppocr INFO: name : loss_8
[2022/10/20 22:22:45] ppocr INFO: weight : 0.5
[2022/10/20 22:22:45] ppocr INFO: name : CombinedLoss
[2022/10/20 22:22:45] ppocr INFO: Metric :
[2022/10/20 22:22:45] ppocr INFO: base_metric_name : VQASerTokenMetric
[2022/10/20 22:22:45] ppocr INFO: key : Student
[2022/10/20 22:22:45] ppocr INFO: main_indicator : hmean
[2022/10/20 22:22:45] ppocr INFO: name : DistillationMetric
[2022/10/20 22:22:45] ppocr INFO: Optimizer :
[2022/10/20 22:22:45] ppocr INFO: beta1 : 0.9
[2022/10/20 22:22:45] ppocr INFO: beta2 : 0.999
[2022/10/20 22:22:45] ppocr INFO: lr :
[2022/10/20 22:22:45] ppocr INFO: epochs : 200
[2022/10/20 22:22:45] ppocr INFO: learning_rate : 5e-05
[2022/10/20 22:22:45] ppocr INFO: name : Linear
[2022/10/20 22:22:45] ppocr INFO: warmup_epoch : 10
[2022/10/20 22:22:45] ppocr INFO: name : AdamW
[2022/10/20 22:22:45] ppocr INFO: regularizer :
[2022/10/20 22:22:45] ppocr INFO: factor : 0.0
[2022/10/20 22:22:45] ppocr INFO: name : L2
[2022/10/20 22:22:45] ppocr INFO: PostProcess :
[2022/10/20 22:22:45] ppocr INFO: class_path : train_data/det/kie_dict.txt
[2022/10/20 22:22:45] ppocr INFO: key : backbone_out
[2022/10/20 22:22:45] ppocr INFO: model_name : ['Student', 'Teacher']
[2022/10/20 22:22:45] ppocr INFO: name : DistillationSerPostProcess
[2022/10/20 22:22:45] ppocr INFO: Train :
[2022/10/20 22:22:45] ppocr INFO: dataset :
[2022/10/20 22:22:45] ppocr INFO: data_dir : train_data/det/train
[2022/10/20 22:22:45] ppocr INFO: label_file_list : ['train_data/det/train_kie.txt']
[2022/10/20 22:22:45] ppocr INFO: name : SimpleDataSet
[2022/10/20 22:22:45] ppocr INFO: ratio_list : [1.0]
[2022/10/20 22:22:45] ppocr INFO: transforms :
[2022/10/20 22:22:45] ppocr INFO: DecodeImage :
[2022/10/20 22:22:45] ppocr INFO: channel_first : False
[2022/10/20 22:22:45] ppocr INFO: img_mode : RGB
[2022/10/20 22:22:45] ppocr INFO: VQATokenLabelEncode :
[2022/10/20 22:22:45] ppocr INFO: algorithm : LayoutXLM
[2022/10/20 22:22:45] ppocr INFO: class_path : train_data/det/kie_dict.txt
[2022/10/20 22:22:45] ppocr INFO: contains_re : False
[2022/10/20 22:22:45] ppocr INFO: order_method : tb-yx
[2022/10/20 22:22:45] ppocr INFO: VQATokenPad :
[2022/10/20 22:22:45] ppocr INFO: max_seq_len : 512
[2022/10/20 22:22:45] ppocr INFO: return_attention_mask : True
[2022/10/20 22:22:45] ppocr INFO: VQASerTokenChunk :
[2022/10/20 22:22:45] ppocr INFO: max_seq_len : 512
[2022/10/20 22:22:45] ppocr INFO: Resize :
[2022/10/20 22:22:45] ppocr INFO: size : [224, 224]
[2022/10/20 22:22:45] ppocr INFO: NormalizeImage :
[2022/10/20 22:22:45] ppocr INFO: mean : [123.675, 116.28, 103.53]
[2022/10/20 22:22:45] ppocr INFO: order : hwc
[2022/10/20 22:22:45] ppocr INFO: scale : 1
[2022/10/20 22:22:45] ppocr INFO: std : [58.395, 57.12, 57.375]
[2022/10/20 22:22:45] ppocr INFO: ToCHWImage : None
[2022/10/20 22:22:45] ppocr INFO: KeepKeys :
[2022/10/20 22:22:45] ppocr INFO: keep_keys : ['input_ids', 'bbox', 'attention_mask', 'token_type_ids', 'image', 'labels']
[2022/10/20 22:22:45] ppocr INFO: loader :
[2022/10/20 22:22:45] ppocr INFO: batch_size_per_card : 1
[2022/10/20 22:22:45] ppocr INFO: drop_last : False
[2022/10/20 22:22:45] ppocr INFO: num_workers : 2
[2022/10/20 22:22:45] ppocr INFO: shuffle : True
[2022/10/20 22:22:45] ppocr INFO: profiler_options : None
[2022/10/20 22:22:45] ppocr INFO: train with paddle 2.3.1 and device Place(gpu:0)
[2022/10/20 22:22:45] ppocr INFO: Initialize indexs of datasets:['train_data/det/train_kie.txt']
[2022-10-20 22:22:46,078] [ INFO] - Already cached C:\Users\Ruben.paddlenlp\models\layoutxlm-base-uncased\sentencepiece.bpe.model
[2022-10-20 22:22:46,522] [ INFO] - tokenizer config file saved in C:\Users\Ruben.paddlenlp\models\layoutxlm-base-uncased\tokenizer_config.json
[2022-10-20 22:22:46,523] [ INFO] - Special tokens file saved in C:\Users\Ruben.paddlenlp\models\layoutxlm-base-uncased\special_tokens_map.json
[2022/10/20 22:22:46] ppocr INFO: Initialize indexs of datasets:['train_data/det/val_kie.txt']
[2022-10-20 22:22:46,525] [ INFO] - Already cached C:\Users\Ruben.paddlenlp\models\layoutxlm-base-uncased\sentencepiece.bpe.model
[2022-10-20 22:22:46,946] [ INFO] - tokenizer config file saved in C:\Users\Ruben.paddlenlp\models\layoutxlm-base-uncased\tokenizer_config.json
[2022-10-20 22:22:46,947] [ INFO] - Special tokens file saved in C:\Users\Ruben.paddlenlp\models\layoutxlm-base-uncased\special_tokens_map.json
[2022-10-20 22:22:46,950] [ INFO] - Already cached C:\Users\Ruben.paddlenlp\models\vi-layoutxlm-base-uncased\model_state.pdparams
W1020 22:22:46.951820 32172 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 8.6, Driver API Version: 11.7, Runtime API Version: 10.2
W1020 22:22:46.964820 32172 gpu_resources.cc:91] device: 0, cuDNN Version: 8.4.
[2022-10-20 22:22:49,838] [ INFO] - Already cached C:\Users\Ruben.paddlenlp\models\vi-layoutxlm-base-uncased\model_state.pdparams
[2022/10/20 22:22:51] ppocr INFO: train dataloader has 81 iters
[2022/10/20 22:22:51] ppocr INFO: valid dataloader has 9 iters
[2022/10/20 22:22:51] ppocr INFO: During the training process, after the 0th iteration, an evaluation is run every 19 iterations
Traceback (most recent call last):
  File "tools/train.py", line 202, in <module>
    main(config, device, logger, vdl_writer)
  File "tools/train.py", line 177, in main
    eval_class, pre_best_model_dict, logger, vdl_writer, scaler,amp_level, amp_custom_black_list)
  File "C:\Work\CENATAV\Libraries\PaddleOCR\tools\program.py", line 302, in train
    loss = loss_class(preds, batch)
  File "C:\Users\Ruben.conda\envs\paddleocr\lib\site-packages\paddle\fluid\dygraph\layers.py", line 930, in __call__
    return self._dygraph_call_func(*inputs, **kwargs)
  File "C:\Users\Ruben.conda\envs\paddleocr\lib\site-packages\paddle\fluid\dygraph\layers.py", line 915, in _dygraph_call_func
    outputs = self.forward(*inputs, **kwargs)
  File "C:\Work\CENATAV\Libraries\PaddleOCR\ppocr\losses\combined_loss.py", line 58, in forward
    loss = loss_func(input, batch, **kargs)
  File "C:\Users\Ruben.conda\envs\paddleocr\lib\site-packages\paddle\fluid\dygraph\layers.py", line 930, in __call__
    return self._dygraph_call_func(*inputs, **kwargs)
  File "C:\Users\Ruben.conda\envs\paddleocr\lib\site-packages\paddle\fluid\dygraph\layers.py", line 915, in _dygraph_call_func
    outputs = self.forward(*inputs, **kwargs)
  File "C:\Work\CENATAV\Libraries\PaddleOCR\ppocr\losses\distillation_loss.py", line 346, in forward
    loss = super().forward(out, batch)
  File "C:\Work\CENATAV\Libraries\PaddleOCR\ppocr\losses\vqa_token_layoutlm_loss.py", line 39, in forward
    [-1, self.num_classes])[active_loss]
  File "C:\Users\Ruben.conda\envs\paddleocr\lib\site-packages\paddle\fluid\dygraph\varbase_patch_methods.py", line 736, in __getitem__
    return _getitem_impl_(self, item)
  File "C:\Users\Ruben.conda\envs\paddleocr\lib\site-packages\paddle\fluid\variable_index.py", line 431, in _getitem_impl_
    return get_value_for_bool_tensor(var, slice_item)
  File "C:\Users\Ruben.conda\envs\paddleocr\lib\site-packages\paddle\fluid\variable_index.py", line 311, in get_value_for_bool_tensor
    lambda: idx_not_empty(var, item))
  File "C:\Users\Ruben.conda\envs\paddleocr\lib\site-packages\paddle\fluid\layers\control_flow.py", line 2466, in cond
    return false_fn()
  File "C:\Users\Ruben.conda\envs\paddleocr\lib\site-packages\paddle\fluid\variable_index.py", line 311, in <lambda>
    lambda: idx_not_empty(var, item))
  File "C:\Users\Ruben.conda\envs\paddleocr\lib\site-packages\paddle\fluid\variable_index.py", line 300, in idx_not_empty
    bool_2_idx = where(item == True)
  File "C:\Users\Ruben.conda\envs\paddleocr\lib\site-packages\paddle\fluid\layers\nn.py", line 14566, in where
    return _C_ops.where_index(condition)
RuntimeError: (PreconditionNotMet) The Tensor's element number must be equal or greater than zero. The Tensor's shape is [-1083617160, 1] now
  [Hint: Expected numel() >= 0, but received numel():-1083617160 < 0:0.] (at ..\paddle\phi\core\dense_tensor_impl.cc:108)
  [operator < where_index > error]

Oct 21 '22 03:10 rubensanchezrivero

Try updating paddlenlp: https://paddleocr.bj.bcebos.com/ppstructure/whl/paddlenlp-2.3.0.dev0-py3-none-any.whl

Oct 21 '22 12:10 littletomatodonkey

The issue persists. It seems to be caused by the number-of-classes parameter: when I set it to twice my number of classes, the error disappeared.
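
A minimal sketch of why doubling the class count could help, assuming the SER task uses BIO tagging (one "O" tag plus a B- and an I- tag per entity class) and that background names such as OTHER are folded into "O". The helper name `bio_label_count` and the background-name set are illustrative assumptions, not PaddleOCR's API:

```python
def bio_label_count(class_names):
    """Return how many BIO tags a list of entity class names expands to."""
    # Assumption: these names are treated as background and share the "O" tag.
    background = {"OTHER", "OTHERS", "IGNORE"}
    entities = [c for c in class_names if c.strip().upper() not in background]
    # One "O" tag, plus a B- tag and an I- tag for every entity class.
    return 2 * len(entities) + 1


# Example with the four XFUND-style classes.
xfund_classes = ["OTHER", "QUESTION", "ANSWER", "HEADER"]
print(bio_label_count(xfund_classes))
```

Under that assumption, num_classes has to cover every tag id rather than every line in the class dictionary. With num_classes set to 23, any tag id of 23 or higher is out of range, which is consistent with the CPU message "target 41 is out of upper bound".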

Oct 21 '22 14:10 rubensanchezrivero

The issue persists. It seems to be caused by the number-of-classes parameter: when I set it to twice my number of classes, the error disappeared.

Which parameter? Could you share its exact location, @rubensanchezrivero?
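
Judging only from the config dump in the original post, num_classes is logged three times: under the Student backbone, the Teacher backbone, and DistillationVQASerTokenLayoutLMLoss. The sketch below lists those locations from a hand-built dict; the nesting is inferred (the log lost its indentation) and `find_key_paths` is a hypothetical helper, not part of PaddleOCR:

```python
def find_key_paths(node, key, path=()):
    """Yield (dotted_path, value) for every occurrence of `key` in nested dicts/lists."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                yield ".".join(map(str, path + (k,))), v
            else:
                yield from find_key_paths(v, key, path + (k,))
    elif isinstance(node, list):
        for i, item in enumerate(node):
            yield from find_key_paths(item, key, path + (i,))


# Trimmed-down mirror of the logged config; only the relevant keys are kept.
config = {
    "Architecture": {
        "Models": {
            "Student": {"Backbone": {"name": "LayoutXLMForSer", "num_classes": 23}},
            "Teacher": {"Backbone": {"name": "LayoutXLMForSer", "num_classes": 23}},
        }
    },
    "Loss": {
        "loss_config_list": [
            {"DistillationVQASerTokenLayoutLMLoss": {"num_classes": 23}},
        ]
    },
}

hits = list(find_key_paths(config, "num_classes"))
for dotted_path, value in hits:
    print(dotted_path, "=", value)
```

If the value is changed, it presumably needs to be changed consistently in all three places, since the loss reshapes the predictions using its own num_classes.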

Jun 10 '23 02:06 ariefwijaya

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

Aug 09 '23 09:08 github-actions[bot]
