Manu
I think there is a trick to put YAML into UTF-8 mode. Not sure whether we already have that.
Make sure your template only uses UTF-8. Try opening it directly with YAML to debug. Your character may come from a character set other than UTF-8 (Latin-1?). I believe we...
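One way to narrow this down before involving YAML at all is to check the raw bytes of the template. The sketch below uses only the standard library; `find_non_utf8` is a hypothetical helper name, not part of invoice2data, and the template path you pass in is whatever file you are debugging.

```python
from pathlib import Path


def find_non_utf8(path):
    """Return (line_number, raw_bytes) for every line that is not valid UTF-8.

    Hypothetical debugging helper, not part of invoice2data.
    """
    bad_lines = []
    # Splitting the raw bytes is safe: UTF-8 multi-byte sequences never
    # contain the newline or carriage-return bytes.
    for lineno, raw in enumerate(Path(path).read_bytes().splitlines(), start=1):
        try:
            raw.decode("utf-8")
        except UnicodeDecodeError:
            bad_lines.append((lineno, raw))
    return bad_lines
```

An empty list means the file decodes cleanly as UTF-8. Any reported line can be inspected with `raw.decode("latin-1")` to see whether the character was saved as Latin-1.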
Good find, @RobertLemmens. I believe someone only changed this recently. I'll check why this wasn't covered by tests.
Merged your PR. The Tesseract module is not very frequently used and we don't test it yet. Hoping to see some improvements here over the summer.
When using the library in Python, you need to load the plugin folder. By default it will load the built-in plugins only. The relevant function is `read_templates`, which is passed...
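A minimal sketch of what that can look like, assuming the `read_templates` loader lives in `invoice2data.extract.loader` and that `extract_data` accepts a `templates` keyword, as in the project README; module paths may differ between versions. The wrapper name `extract_with_custom_templates` and both arguments are hypothetical.

```python
def extract_with_custom_templates(pdf_path, template_dir):
    """Extract invoice data using templates loaded from a custom folder.

    Hypothetical wrapper; the import paths below are assumed from the
    invoice2data README and may change between releases.
    """
    # Imported lazily so this sketch can be loaded without invoice2data installed.
    from invoice2data import extract_data
    from invoice2data.extract.loader import read_templates

    # Load templates from the given folder rather than relying on the defaults.
    templates = read_templates(template_dir)
    return extract_data(pdf_path, templates=templates)
```

Call it with the path to the invoice and the folder holding your own templates, for example `extract_with_custom_templates("invoice.pdf", "my_templates/")`.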
> yes, this pdf only has the PO number.
> if I enter any dummy data from the pdf in amount,date and invoice number to satisfy condition, then can I...
I see. That might be a limitation of the current workflow. Tesseract needs an image as input and we do some enhancements to the image before passing it on. To...
The previous commit by @duskybomb was incomplete and didn't fully resolve required fields when loading templates. That is the first issue you are facing. After fixing a few things in...
> Found the following link on https://github.com/tesseract-ocr/tesseract/wiki/ImproveQuality

@chriswakare Improving OCR input is not fully solved. Feel free to experiment and make a PR to improve the command used in `tesseract.py`....
PS: Directly inputting a PDF works now. Small change to the convert-command. Updated my earlier comment.