How to fine-tune a model on a downstream task that is the same as the pre-training task?
I want to fine-tune a model based on "naver-clova-ix/donut-base" on a downstream task that is different from the three tasks covered in the paper (Document Classification, Document Information Extraction, and Document Visual Question Answering) but the same as the pre-training task. In other words, I want to teach the model to "read" better. For that task, I would feed an image to donut-base, expect the model to output all the text in the image, compute the loss against the pre-prepared correct text, and then use that loss for backpropagation. The question is: how should I modify the released code or configuration files? Thank you!
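
For concreteness, here is a rough sketch of the loss computation I have in mind, using the Hugging Face transformers port of Donut (`DonutProcessor` + `VisionEncoderDecoderModel`) rather than the released training code; the image path, the target text, and the omission of a task-start prompt token are placeholder assumptions:

```python
from PIL import Image
from transformers import DonutProcessor, VisionEncoderDecoderModel

processor = DonutProcessor.from_pretrained("naver-clova-ix/donut-base")
model = VisionEncoderDecoderModel.from_pretrained("naver-clova-ix/donut-base")
model.train()

image = Image.open("page.png").convert("RGB")  # hypothetical input image
target_text = "word1 word2 word3"              # pre-prepared correct text

# Encode the image and tokenize the reading target (EOS closes the sequence).
pixel_values = processor(image, return_tensors="pt").pixel_values
labels = processor.tokenizer(
    target_text + processor.tokenizer.eos_token,
    add_special_tokens=False,
    return_tensors="pt",
).input_ids
labels[labels == processor.tokenizer.pad_token_id] = -100  # exclude padding from the loss

# The model shifts the labels internally to build the decoder inputs and
# returns the cross-entropy loss over the text sequence.
outputs = model(pixel_values=pixel_values, labels=labels)
outputs.loss.backward()
```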
I have the same need as you. Have you solved it yet?
@moonbings @SamSamhuns @eltociear, do any of you folks have a solution for this?
I think it would be helpful to include a train_pretrain.yaml in the config folder!
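
As a starting point, here is what a hypothetical train_pretrain.yaml might look like. The field names are taken from the repo's existing train_cord.yaml; the dataset path and all hyperparameter values are purely illustrative assumptions, not values from the authors:

```yaml
# Hypothetical config/train_pretrain.yaml -- field names follow train_cord.yaml,
# values are illustrative only.
resume_from_checkpoint_path: null
result_path: "./result"
pretrained_model_name_or_path: "naver-clova-ix/donut-base"
dataset_name_or_paths: ["path/to/your/reading_dataset"]  # ground_truth must hold {"gt_parse": {"text_sequence": ...}}
sort_json_key: False
train_batch_sizes: [8]
val_batch_sizes: [1]
input_size: [1280, 960]
max_length: 1536          # reading targets are long; adjust to your documents
align_long_axis: False
num_nodes: 1
seed: 2022
lr: 3e-5
warmup_steps: 300
num_training_samples_per_epoch: 10000
max_epochs: 5
max_steps: -1
num_workers: 8
val_check_interval: 1.0
check_val_every_n_epoch: 1
gradient_clip_val: 1.0
verbose: True
```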
Isn't this the same as fine-tuning on information extraction, with the ground-truth label containing all the tokens of the document you want the model to learn to read?
It is just the pseudo text reading task, as described in the docs:

> For the (Pseudo) Text Reading Task, the gt_parse looks like `{"text_sequence": "word1 word2 word3 ..."}`
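
For example, a metadata.jsonl in the Hugging Face imagefolder format can carry exactly that label for each image, and a local path to such a folder can then be passed via dataset_name_or_paths. This is just a sketch, assuming one .txt file with the correct text sits next to each .png:

```python
import json
from pathlib import Path

split_dir = Path("dataset/train")  # hypothetical layout: page.png next to page.txt
with open(split_dir / "metadata.jsonl", "w", encoding="utf-8") as f:
    for img_path in sorted(split_dir.glob("*.png")):
        # Pre-prepared correct text for this page (assumed one .txt per image).
        text = img_path.with_suffix(".txt").read_text(encoding="utf-8")
        # ground_truth is a JSON string whose gt_parse holds the reading target.
        gt = {"gt_parse": {"text_sequence": " ".join(text.split())}}
        row = {"file_name": img_path.name, "ground_truth": json.dumps(gt, ensure_ascii=False)}
        f.write(json.dumps(row, ensure_ascii=False) + "\n")
```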