donut
Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022
Hi, for the text reading task it instructs: `You can use our SynthDoG 🐶 to generate synthetic images for the text reading task with proper gt_parse. See ./synthdog/README.md for details.`...
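For context, a minimal sketch of what a text-reading sample in metadata.jsonl is generally expected to look like, assuming the `gt_parse`/`text_sequence` convention used by the SynthDoG-generated datasets (the file name and text below are placeholders):

```python
import json

# Hypothetical sample: one line of metadata.jsonl for the text reading task.
# "ground_truth" is a JSON *string* whose "gt_parse" holds the target text_sequence.
sample = {
    "file_name": "image_000.jpg",  # placeholder image name
    "ground_truth": json.dumps(
        {"gt_parse": {"text_sequence": "Lorem ipsum dolor sit amet"}}
    ),
}

with open("metadata.jsonl", "a", encoding="utf-8") as f:
    f.write(json.dumps(sample) + "\n")
```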
Hi @gwkrsrch, it works well in the case of DONUT-base, but DONUT-proto does not. Could you please provide the fine-tuning YAML configuration file for DONUT-proto? Many thanks for your...
Hi, I am running Donut to pre-train on my custom data. However, when I scaled up the data size (~2M images), I got this error. (But I verified the...
Thanks for publishing this interesting work. Would I be able to extend the Document Understanding task to learn hierarchies over paragraphs of text within a page? Or is the 512...
It would be great if Donut had the ability to extract the bounding box of each extracted entity. The bounding box information is important and useful for visualization and downstream...
Hi, I tried fine-tuning the model with a custom receipt dataset for the IE task and noticed issues with the output text extracted for a given set of keys. It either misses...
Hi @gwkrsrch, DONUT is an excellent work for the VDU community! We can reproduce the tree-based edit-distance results on the CORD test set, but it is tricky to calculate the...
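For what it's worth, a minimal sketch of scoring a single prediction with the repository's JSONParseEvaluator, assuming the `cal_acc(pred, answer)` interface used by the evaluation script (the pred/answer parses below are placeholders in the CORD-style nested-dict format):

```python
from donut import JSONParseEvaluator

# Placeholder prediction and ground-truth parses for one CORD-style sample.
pred = {"menu": [{"nm": "ICE AMERICANO", "cnt": "1", "price": "4,500"}]}
answer = {"menu": [{"nm": "ICE AMERICANO", "cnt": "2", "price": "4,500"}]}

evaluator = JSONParseEvaluator()
# cal_acc returns a normalized tree-edit-distance accuracy in [0, 1].
score = evaluator.cal_acc(pred, answer)
print(f"nTED accuracy: {score:.4f}")
```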
@gwkrsrch I have tried to run the inference script on CPU; the CPU inference time is very high compared to GPU inference time. Can you fix this issue?
I'm trying to retrain the Donut model on my custom dataset, which I made using a script. I put a metadata.jsonl file for all the images, along with the images, in the train folder....
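For comparison, a minimal sketch of how such a split folder is usually assembled: the images sit next to a metadata.jsonl whose lines pair a `file_name` with a stringified `ground_truth` containing `gt_parse` (the annotation keys and file names below are placeholders for your own schema):

```python
import json
from pathlib import Path

# Hypothetical IE annotations: image file -> ground-truth key/value parse.
annotations = {
    "receipt_001.jpg": {"store_name": "ACME MART", "total": "12.50"},
    "receipt_002.jpg": {"store_name": "CORNER SHOP", "total": "3.20"},
}

split_dir = Path("dataset/train")  # repeat for validation/ and test/ splits
split_dir.mkdir(parents=True, exist_ok=True)

with open(split_dir / "metadata.jsonl", "w", encoding="utf-8") as f:
    for file_name, gt_parse in annotations.items():
        line = {
            "file_name": file_name,  # image placed in the same split folder
            "ground_truth": json.dumps({"gt_parse": gt_parse}),
        }
        f.write(json.dumps(line) + "\n")
```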
I'm training Document Information Extraction on a custom dataset of 100 train and 20 validation images. This is the config that I gave:
```
resume_from_checkpoint_path: null
result_path: "./result"
pretrained_model_name_or_path: "naver-clova-ix/donut-base"
dataset_name_or_paths: ["/content/drive/MyDrive/donut_1.1"]
```
...