Model fine-tuned for classification fails to predict (empty output)
I fine-tuned the base model on the RVL-CDIP dataset as shown below.
!python train.py \
    --config config/train_rvlcdip.yaml \
    --pretrained_model_name_or_path "naver-clova-ix/donut-base" \
    --dataset_name_or_paths '["dataset/rvlcdip"]' \
    --exp_version "test_rvlcdip"
I then ran inference with the trained model using task_prompt = "<s_rvlcdip>". The result is as follows:
pretrained_model.inference(image=input_img, prompt=task_prompt)["predictions"][0]
outputs -> {'text_sequence': ''}
I was expecting output like this:
{'class': 'invoice'}
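One thing that may be worth ruling out is a mismatch between the training labels and the expected {'class': ...} output. The sketch below checks a single line of a metadata.jsonl file, assuming the dataset follows the layout described in the Donut README, where each line has a "file_name" and a "ground_truth" string that itself encodes JSON with a "gt_parse" object. The function name check_metadata_line and its expected_key parameter are hypothetical helpers, not part of Donut.

```python
import json


def check_metadata_line(line, expected_key="class"):
    """Return a list of problems found in one metadata.jsonl line (empty if none).

    Assumes the Donut README layout:
    {"file_name": ..., "ground_truth": "{\"gt_parse\": {\"class\": ...}}"}
    """
    try:
        record = json.loads(line)
    except json.JSONDecodeError as exc:
        return [f"line is not valid JSON: {exc}"]
    if not isinstance(record, dict):
        return ["line is not a JSON object"]

    problems = []
    if "file_name" not in record:
        problems.append("missing 'file_name'")

    # "ground_truth" is expected to be a string containing JSON, not a nested object.
    ground_truth = record.get("ground_truth")
    if not isinstance(ground_truth, str):
        problems.append("'ground_truth' should be a JSON-encoded string")
        return problems

    try:
        parsed = json.loads(ground_truth)
    except json.JSONDecodeError as exc:
        problems.append(f"'ground_truth' is not valid JSON: {exc}")
        return problems

    gt_parse = parsed.get("gt_parse") if isinstance(parsed, dict) else None
    if not isinstance(gt_parse, dict):
        problems.append("missing 'gt_parse' object")
    elif expected_key not in gt_parse:
        problems.append(f"'gt_parse' has no '{expected_key}' key")
    return problems
```

Running this over every line of the training split's metadata file (its exact path under dataset/rvlcdip depends on how the dataset was prepared) and printing any non-empty result would show whether the labels match the assumed format. A clean result does not explain the empty text_sequence by itself; it only removes one possible cause.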
Do you know what is wrong?