
Do the invoice2data and pdftotext modules work for extracting text from, and parsing, bank statement PDF files?

Open pritisatish opened this issue 4 years ago • 3 comments

Hi, I want to work with bank statement PDF files, but I am getting this error: No template for C:/Users/Guest/PycharmProjects/MacineLearning/invoice2data/invoice/Howard_Bank.pdf

I tried the invoice2data and pdftotext Python modules. For invoice PDF files I am able to capture the required fields using YAML files with regular expressions, but the same template structure does not work for PDF bank statements. Maybe it only works for invoice PDF files? If so, how can I extract text from PDF bank statements and capture the required fields? I appreciate any help.

pritisatish avatar Jun 22 '21 07:06 pritisatish

You can probably use the line plugin or custom fields to make it work for bank statements. It's not limited to invoices only.

m3nu avatar Jun 22 '21 07:06 m3nu

> You can probably use the line plugin or custom fields to make it work for bank statements. It's not limited to invoices only.

Thanks a lot, I will check it out. As I am new to machine learning, it would be very helpful if you could provide any examples or links related to these plugins.

pritisatish avatar Jun 22 '21 08:06 pritisatish

> You can probably use the line plugin or custom fields to make it work for bank statements. It's not limited to invoices only.

Where do I use this line plugin? Does it go in the templates folder where the YAML files are placed?

pritisatish avatar Jun 25 '21 06:06 pritisatish

> No template for C:/Users/Guest/PycharmProjects/MacineLearning/invoice2data/invoice/Howard_Bank.pdf
>
> I tried the invoice2data and pdftotext Python modules. For invoice PDF files I am able to capture the required fields using YAML files with regular expressions, but the same template structure does not work for PDF bank statements.

It means there is some relevant difference between your invoices and your bank statements.

Please double check your YAML template. Every template must have keywords specified, and a template is only used for a document that contains those keywords. It seems that your bank statements don't include all of the keywords you specified in your YAML template, which would explain the "No template" error.
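To illustrate the keyword check, here is a minimal sketch in plain Python. It is not invoice2data's own code: it only assumes that each keyword is treated as a regular expression and that all of them must be found in the extracted text. The function name `missing_keywords`, the sample text, and the keyword lists are hypothetical.

```python
import re


def missing_keywords(text, keywords):
    """Return the keywords (treated as regexes) that are not found in text."""
    return [kw for kw in keywords if re.search(kw, text) is None]


# Hypothetical text, standing in for what pdftotext extracts from a statement.
statement_text = (
    "Howard Bank\n"
    "Statement of Account\n"
    "Account Number: 12345678\n"
)

# Hypothetical keyword lists: one copied from an invoice template,
# one written specifically for the bank statement.
invoice_keywords = ["Howard Bank", "Invoice"]
statement_keywords = ["Howard Bank", "Statement of Account"]

print(missing_keywords(statement_text, invoice_keywords))    # ['Invoice']
print(missing_keywords(statement_text, statement_keywords))  # []
```

Running a check like this against the real extracted text shows which keyword of an invoice template is absent from the statement, so the template's keywords can be adjusted.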

rmilecki avatar Jan 22 '23 20:01 rmilecki

> Where do I use this line plugin? Does it go in the templates folder where the YAML files are placed?

The lines plugin / parser can be used for any field in your YAML template.

For a minimal example, see https://github.com/invoice-x/invoice2data/blob/master/TUTORIAL.md#parser-lines

For more examples, check the project's internal templates: https://github.com/invoice-x/invoice2data/tree/master/src/invoice2data/extract/templates
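As a rough idea of what a lines-style parser does with a bank statement, the sketch below imitates it in plain Python: a `start` pattern opens the transaction block, an `end` pattern closes it, and a `line` pattern with named groups captures each row. This is an illustration only, not the library's implementation. The statement text, the three patterns, and the `parse_lines` helper are all hypothetical; the linked tutorial is the reference for the actual template options.

```python
import re

# Hypothetical statement text, standing in for pdftotext output.
text = """\
Howard Bank
Date        Description           Amount
01/06/2021  Opening deposit       500.00
03/06/2021  Card payment          -42.10
Closing Balance                   457.90
"""

# Hypothetical patterns for this made-up layout.
start = r"Date\s+Description\s+Amount"
end = r"Closing Balance"
line = r"(?P<date>\d{2}/\d{2}/\d{4})\s+(?P<description>.+?)\s+(?P<amount>-?\d+\.\d{2})$"


def parse_lines(text, start, end, line):
    """Collect one dict per row found between the start and end markers."""
    rows = []
    inside = False
    for raw in text.splitlines():
        if not inside:
            # Skip everything up to and including the header row.
            inside = re.search(start, raw) is not None
            continue
        if re.search(end, raw):
            break
        match = re.search(line, raw.strip())
        if match:
            rows.append(match.groupdict())
    return rows


for row in parse_lines(text, start, end, line):
    print(row)
```

With this sample text the loop prints two rows, one per transaction, each with `date`, `description`, and `amount` keys. Real statements usually need the patterns tuned to the column layout that pdftotext produces.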

rmilecki avatar Jan 22 '23 20:01 rmilecki