What is the prompt for the pseudo reading task?
The README says: "For (Pseudo) Text Reading Task
The gt_parse looks like {"text_sequence" : "word1 word2 word3 ... "}
This task is also a pre-training task of Donut model.
You can use our SynthDoG 🐶 to generate synthetic images for the text reading task with proper gt_parse. See ./synthdog/README.md for details."
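For reference, this is how I understand the ground truth for one image would be written, as a minimal sketch. It assumes the dataset layout from the Donut README, where each line of metadata.jsonl has a "file_name" and a "ground_truth" field holding a JSON string. The helper name and the file name "homework_001.png" are made up for illustration.

```python
import json


def make_metadata_line(file_name: str, text: str) -> str:
    """Build one metadata.jsonl line for the (pseudo) text reading task.

    Assumption: "ground_truth" is itself a JSON-encoded string that wraps
    the gt_parse dictionary, as described in the Donut README.
    """
    ground_truth = {"gt_parse": {"text_sequence": text}}
    record = {"file_name": file_name, "ground_truth": json.dumps(ground_truth)}
    return json.dumps(record)


# Hypothetical example image and transcription.
line = make_metadata_line("homework_001.png", "word1 word2 word3")
print(line)
```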
I saw that Philipp's blog post just used "" as the prompt, whereas the author of this article: https://towardsdatascience.com/ocr-free-document-understanding-with-donut-1acfbdf099be
uses "<s_sroie_donut>", i.e. "s_" followed by the path of the dataset he or she used to fine-tune Donut.
Is it correct to just use ""?
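My understanding of where a prompt like "<s_sroie_donut>" comes from is sketched below. It assumes that Donut's training script derives the task start token from the last component of the dataset path, which is what the article's prompt suggests. The function name is my own, and the normpath call is my addition to handle a trailing slash.

```python
import os


def task_prompt_from_dataset_path(dataset_path: str) -> str:
    """Derive a Donut-style task prompt from a dataset path.

    Assumption: the prompt is "<s_" + dataset folder name + ">".
    """
    # normpath drops any trailing slash so basename returns the folder name.
    task_name = os.path.basename(os.path.normpath(dataset_path))
    return f"<s_{task_name}>"


# Hypothetical local path to the fine-tuning dataset.
print(task_prompt_from_dataset_path("dataset/sroie_donut"))
```

If that is right, the prompt has to match the token the checkpoint was actually trained with, so a different prompt or an empty one would not be expected to work well.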
I am trying to read my students' homework submissions.
As an example, I am trying to read this in Colab:
The result was actually very poor when I tried the prompt "" in this notebook: https://gist.github.com/nyck33/ba445efc3f480a3be5b8a1bc8ffb3418
I'm not sure why the prompt in the code cell looks like an empty string "", but you can see that I printed it out at the bottom to confirm it was "".