Felix
> > I'm aware I can get definitions in English of words in other languages. The problem is that the English version of Wiktionary has far fewer German words than...
I'm not one of the authors, but as far as I understand, Donut was only pre-trained on the generated OCR text, not on hOCR, which would include bounding boxes. Models like UDOP,...
You can run inference on the base model, which has not been fine-tuned to any JSON schema, to get an OCR prediction just like in the pre-training task. Here is...
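A minimal sketch of what that looks like with the Hugging Face API (the file name `page.png` and the `<s_synthdog>` prompt token are my assumptions, the latter based on the name of the synthetic pre-training task):

```python
import torch
from PIL import Image
from transformers import DonutProcessor, VisionEncoderDecoderModel

# Load the base checkpoint, which was pre-trained on text reading only
processor = DonutProcessor.from_pretrained("naver-clova-ix/donut-base")
model = VisionEncoderDecoderModel.from_pretrained("naver-clova-ix/donut-base")

device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

image = Image.open("page.png").convert("RGB")  # any document image
pixel_values = processor(image, return_tensors="pt").pixel_values.to(device)

# Prompt the decoder with the pre-training task token; it then continues
# by emitting the text it reads from the image
decoder_input_ids = processor.tokenizer(
    "<s_synthdog>", add_special_tokens=False, return_tensors="pt"
).input_ids.to(device)

outputs = model.generate(
    pixel_values,
    decoder_input_ids=decoder_input_ids,
    max_length=model.decoder.config.max_position_embeddings,
    pad_token_id=processor.tokenizer.pad_token_id,
    eos_token_id=processor.tokenizer.eos_token_id,
    bad_words_ids=[[processor.tokenizer.unk_token_id]],
)

text = processor.batch_decode(outputs, skip_special_tokens=True)[0]
print(text)
```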
In my opinion, the strength of Donut is not its OCR generation but the possibility to fine-tune it on specific tasks. At the moment I can't think of a straight...
It would be very interesting to see how a complicated JSON structure impacts model performance, but to make it short: sure, it is possible, you can pretty much...
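To give an idea of what an arbitrarily nested structure looks like as a training target, here is a simplified sketch of the flattening Donut applies to nested JSON (modeled on the `json2token` helper in the repo, which additionally registers the new `<s_key>` tokens with the tokenizer):

```python
def json2token(obj):
    """Flatten (possibly nested) JSON into Donut's target token sequence."""
    if isinstance(obj, dict):
        # each key becomes a pair of field delimiter tokens
        return "".join(f"<s_{k}>{json2token(v)}</s_{k}>" for k, v in obj.items())
    if isinstance(obj, list):
        # repeated items are joined with a separator token
        return "<sep/>".join(json2token(v) for v in obj)
    return str(obj)

print(json2token({"menu": [{"name": "coffee", "price": "3.50"}]}))
# <s_menu><s_name>coffee</s_name><s_price>3.50</s_price></s_menu>
```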
Donut is not made to compete with OCR engines; it is pre-trained on generating OCR text to give it a general understanding of characters and language that can be leveraged in...
Smells like an [xy-problem](https://xyproblem.info/), what exactly are you trying to do? Importing a Donut model with the Hugging Face VisionEncoderDecoder implementation should be straightforward. Just make sure you use the...
Yes, it is probably referring to the lightning_module in the root of this repository: [lightning_module.py](https://github.com/clovaai/donut/blob/master/lightning_module.py). Make sure you have that file in your working directory (the same directory you run...
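If you can't run your script from the repository root, a sketch of putting it on the import path instead (the path is a placeholder; `DonutModelPLModule` is the class that file defines, as far as I can see):

```python
import sys

sys.path.insert(0, "/path/to/donut")  # repository root containing lightning_module.py
from lightning_module import DonutModelPLModule
```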
You could think about increasing the input dimensions and forwarding multiple pages as one image, but it does not scale well, and no hardware can realistically handle that compute with...
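As a rough sketch of what that would mean with the Hugging Face implementation (the sizes are hypothetical, and the pretrained weights are not guaranteed to transfer cleanly to a larger input):

```python
from PIL import Image
from transformers import DonutProcessor, VisionEncoderDecoderModel

processor = DonutProcessor.from_pretrained("naver-clova-ix/donut-base")
model = VisionEncoderDecoderModel.from_pretrained("naver-clova-ix/donut-base")

# Hypothetical: double the input height to fit two pages stacked vertically.
# The number of image patches (and encoder memory) grows with the image area.
height, width = 2560 * 2, 1920  # donut-base defaults to 2560x1920
processor.image_processor.size = {"height": height, "width": width}
model.config.encoder.image_size = [height, width]

page1 = Image.open("page1.png").convert("RGB")
page2 = Image.open("page2.png").convert("RGB")
stacked = Image.new("RGB", (max(page1.width, page2.width), page1.height + page2.height))
stacked.paste(page1, (0, 0))
stacked.paste(page2, (0, page1.height))

pixel_values = processor(stacked, return_tensors="pt").pixel_values
```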
A workaround is to filter out high-memory model architectures from the default regressors / classifiers list and pass that custom list of models to the LazyRegressor / LazyClassifier...
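Something like this (a sketch; the excluded names are just examples of memory-hungry estimators, and `REGRESSORS` is the default list lazypredict exposes):

```python
from lazypredict.Supervised import LazyRegressor, REGRESSORS
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split

# Example: drop estimators known to use a lot of memory on large datasets
EXCLUDE = {"GaussianProcessRegressor", "KernelRidge", "SVR", "NuSVR"}

# REGRESSORS is a list of (name, estimator_class) tuples
slim_regressors = [cls for name, cls in REGRESSORS if name not in EXCLUDE]

X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

reg = LazyRegressor(verbose=0, ignore_warnings=True, regressors=slim_regressors)
models, predictions = reg.fit(X_train, X_test, y_train, y_test)
print(models)
```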