PaddleOCR
max_seq_len in the SER model fine-tuning task
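In the SER model fine-tuning task, max_seq_len defaults to 512: https://github.com/PaddlePaddle/PaddleOCR/blob/main/configs/kie/vi_layoutxlm/ser_vi_layoutxlm_xfund_zh.yml
The token length in our dataset exceeds 512, so we want to raise this value to 1024.
However, after making the change we ran into the following error.
How can this be resolved? Thanks.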
The places in the code that use the expand operator need to be changed accordingly.
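What exactly needs to be modified?
I eventually found this function, but I could not find where the value (514) shown in the error comes from.
Could you explain in more detail? Thanks!
My versions are paddlepaddle-gpu==2.3.1 and paddlenlp==2.5.2.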
I feed the input in batches.
I'd suggest upgrading your paddle version.
@bhhsieh Did you solve the problem? @tran601 What about training the model? Should we modify the training code to support batches as well?
@tran601 I've tried your code, but it produces an error:
ValueError: (InvalidArgument) The 1th element of 'shape' for expand_v2 op must be greater than 0, but the value given is 0.
[Hint: Expected expand_shape[i] > 0, but received expand_shape[i]:0 <= 0:0.] (at /Users/paddle/xly/workspace/3a60970a-8ff4-461a-878a-a5fbdbe8e3e9/Paddle/paddle/phi/infermeta/unary.cc:1215)
To concatenate the batches: preds = paddle.concat([p1[0], p2[0]], axis=1)
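For anyone following along, the batching idea mentioned in this thread (feed the long input in pieces, then join the predictions along the sequence axis) can be sketched roughly as below. This is only a minimal illustration in plain Python, not the code from the earlier replies: `split_into_windows`, `predict_long_sequence`, and the `predict_window` callback are hypothetical names, the callback stands in for a real model call, and the sketch ignores any special tokens the tokenizer may add to each window.

```python
from typing import Callable, List, Sequence


def split_into_windows(token_ids: Sequence[int], max_seq_len: int = 512) -> List[List[int]]:
    """Split a token sequence into consecutive windows of at most max_seq_len tokens.

    An empty input yields no windows, so a zero-length window is never produced.
    """
    if max_seq_len <= 0:
        raise ValueError("max_seq_len must be positive")
    return [
        list(token_ids[start:start + max_seq_len])
        for start in range(0, len(token_ids), max_seq_len)
    ]


def predict_long_sequence(
    token_ids: Sequence[int],
    predict_window: Callable[[List[int]], List[int]],
    max_seq_len: int = 512,
) -> List[int]:
    """Run a per-window predictor and join the results along the sequence axis.

    predict_window is a hypothetical stand-in for the model call; it is assumed
    to return exactly one label per input token.
    """
    preds: List[int] = []
    for window in split_into_windows(token_ids, max_seq_len):
        window_preds = predict_window(window)
        if len(window_preds) != len(window):
            raise ValueError("predictor must return one label per token")
        preds.extend(window_preds)
    return preds
```

With real tensors, the `preds.extend(...)` step would correspond to the `paddle.concat(..., axis=1)` call quoted above. Note that tokens near a window boundary lose context from the neighbouring window, so overlapping windows may be worth considering.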