Manu comments

Results: 650 comments of Manu

ERROR:invoice2data.main:No template for FlipkartInvoice.pdf

I think there is a trick to put YAML into UTF-8 mode. Not sure whether we already have that.

ERROR:invoice2data.main:No template for FlipkartInvoice.pdf

Make sure your template only uses UTF-8. Try to open it directly with YAML to debug. Your character may come from a character set other than UTF-8 (Latin-1?). I believe we...
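As a quick way to check the "only uses UTF-8" part before involving YAML at all, a minimal standard-library sketch follows. The helper name `find_non_utf8` is hypothetical and not part of invoice2data; it reports where the first byte sequence that is not valid UTF-8 sits in a template file.

```python
from pathlib import Path


def find_non_utf8(path):
    """Return (byte_offset, offending_bytes) for the first invalid
    UTF-8 sequence in the file, or None if the file decodes cleanly.

    Hypothetical helper for debugging templates; not part of invoice2data.
    """
    data = Path(path).read_bytes()
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start / err.end delimit the bytes that could not be decoded.
        return err.start, data[err.start:err.end]
    return None
```

If this returns an offset, opening the file in an editor and re-saving it as UTF-8 is usually the simplest fix; a byte such as `0xE9` at that position would be consistent with a Latin-1 `é`.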

Add testing for Tesseract module. Including test PDFs or images

Good find, @RobertLemmens. I believe someone changed this only recently. I'll check why this wasn't covered by tests.

Add testing for Tesseract module. Including test PDFs or images

Merged your PR. The Tesseract module is not very frequently used and we don't test it yet. Hoping to see some improvements here over the summer.

Process OCR document with custom fields

When using the library in Python, you need to load the plugin folder. By default it will load the built-in plugins only. The relevant function is `read_templates`, which is passed...
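A minimal sketch of what that can look like, assuming the "plugin folder" above refers to a folder of custom templates and that `read_templates` (from `invoice2data.extract.loader`) takes the folder path while `extract_data` accepts a `templates` argument. The wrapper name `extract_with_custom_templates` is hypothetical, and the exact arguments may differ between versions of the library.

```python
import os


def extract_with_custom_templates(invoice_path, template_dir):
    """Load templates from template_dir and run extraction on invoice_path.

    Hypothetical wrapper; assumes invoice2data is installed.
    """
    if not os.path.isdir(template_dir):
        raise FileNotFoundError(f"Template folder not found: {template_dir}")

    # Imported lazily so the folder check above works without the library.
    from invoice2data import extract_data
    from invoice2data.extract.loader import read_templates

    # Assumption: read_templates returns the templates found in this folder,
    # and extract_data uses only the templates it is given.
    templates = read_templates(template_dir)
    return extract_data(invoice_path, templates=templates)
```

Something like `extract_with_custom_templates("invoice.pdf", "my_templates/")` would then use only the templates found in that folder (both paths are placeholders).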

Process OCR document with custom fields

> yes, this pdf only has the PO number.
> if I enter any dummy data from the pdf in amount, date and invoice number to satisfy condition, then can I...

Process OCR document with custom fields

I see. That might be a limitation of the current workflow. Tesseract needs an image as input and we do some enhancements to the image before passing it on. To...
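To illustrate the general shape of such a workflow (rasterize the document first, then feed the image to Tesseract), here is a rough sketch. It is not the actual code in `tesseract.py`: the function names, the density of 350 and the exact flags are assumptions, and it expects the ImageMagick `convert` and `tesseract` executables to be on the PATH.

```python
import subprocess


def build_ocr_commands(path, density=350):
    """Return the (convert, tesseract) command lists for one input file.

    Hypothetical helper; flag values are illustrative assumptions.
    """
    # ImageMagick: render the input at the given density and write PNG to stdout.
    convert_cmd = ["convert", "-density", str(density), path,
                   "-depth", "8", "png:-"]
    # Tesseract: read the image from stdin and write recognized text to stdout.
    tesseract_cmd = ["tesseract", "stdin", "stdout"]
    return convert_cmd, tesseract_cmd


def ocr_to_text(path):
    """Pipe the rasterized image from convert into tesseract and return text."""
    convert_cmd, tesseract_cmd = build_ocr_commands(path)
    convert = subprocess.Popen(convert_cmd, stdout=subprocess.PIPE)
    ocr = subprocess.Popen(tesseract_cmd, stdin=convert.stdout,
                           stdout=subprocess.PIPE)
    convert.stdout.close()  # let convert receive SIGPIPE if tesseract exits early
    out, _ = ocr.communicate()
    convert.wait()
    return out.decode("utf-8", errors="replace")
```

Keeping the command construction in its own function makes it easy to experiment with different enhancement flags without touching the piping logic.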

Process OCR document with custom fields

The previous commit by @duskybomb was incomplete and didn't fully resolve required fields when loading templates. That was the first issue you were facing. After fixing a few things in...

Process OCR document with custom fields

> Found the following link on https://github.com/tesseract-ocr/tesseract/wiki/ImproveQuality

@chriswakare improving OCR input is not fully solved. Feel free to experiment and make a PR to improve the command used in `tesseract.py`....

Process OCR document with custom fields

PS: Directly inputting a PDF works now, after a small change to the convert command. Updated my earlier comment.