invoice2data

Problem working with PDF bank statements, still not able to resolve

Open pritisatish opened this issue 4 years ago • 0 comments

I am getting some errors. I tried modifying the templates to make bank statements work, but could not capture the required fields.

[InvoiceTemplate([('issuer', 'Amazon Web Services'), ('fields', OrderedDict([('amount', 'TOTAL AMOUNT DUE ON.\$(\d+\.\d+)'), ('amount_untaxed', 'TOTAL AMOUNT DUE ON.\$(\d+\.\d+)'), ('date', 'Invoice Date:\s+([a-zA-Z]+ \d+ , \d+)'), ('invoice_number', 'Invoice Number:\s+(\d+)'), ('partner_name', '(Amazon Web Services, Inc\.)'), ('static_partner_website', 'aws.amazon.com')])), ('keywords', ['Amazon Web Services', '$', 'Invoice']), ('lines', OrderedDict([('start', 'Detail'), ('end', '\* May include estimated US sales tax'), ('first_line', '^ (?P\w+.)\$(?P<price_unit>\d+\.\d+)'), ('line', '(.)\$(\d+\.\d+)'), ('last_line', 'VAT \\')])), ('options', OrderedDict([('currency', 'USD'), ('date_formats', ['%B %d, %Y']), ('languages', ['en']), ('decimal_separator', '.')])), ('template_name', 'AmazonWebServices.yml')]), InvoiceTemplate([('issuer', 'Your account statement'), ('fields', OrderedDict([('amount', 'TOTAL AMOUNT DUE ON.*\$(\d+\.\d+)'), ('date', 'Issue date:\s+([a-zA-Z]+ \d+ , \d+)'), ('Account number', 'Account number:\s+(\d+)'), ('bank_website', 'www.bankofscotland.co.uk')])), ('keywords', ['Bank of Scotland plc', '$', 'Bank Statement']), ('options', OrderedDict([('currency', 'USD'), ('date_formats', ['%B %d, %Y']), ('languages', ['en']), ('decimal_separator', '.')])), ('template_name', 'BankofScotlandBusiness.yml')]), InvoiceTemplate([('issuer', 'Howard Bank United States'), ('fields', OrderedDict([('amount', 'Ending Balance \$(\d+,\d+.\d{2})'), ('date', 'Statement Ending \s+([a-zA-Z]+ \d+ , \d+)'), ('invoice_number', 'HOWARD RELATIONSHIP CHECKING\s+[0-9a-zA-Z]{12}')])), ('keywords', ['HowardBank.com']), ('exclude_keywords', ['Invoice']), ('options', OrderedDict([('remove_whitespace', False), ('currency', 'USD'), ('date_formats', ['%m/%d/%Y']), ('languages', ['en'])])), ('template_name', 'Howard_Bank.yml')]), InvoiceTemplate([('issuer', 'QualityHosting AG'), ('fields', OrderedDict([('amount', 'Total EUR\s+(\d+,\d+)'), ('amount_untaxed', 'Total EUR\s+(\d+,\d+)'), 
('date', ['\s{2,}(\d+\. .+ \d{4})\s{2,}', 'Rechnungsdatum\s+(\w+ \d+, \d{4})']), ('invoice_number', 'Rechnungsnr\.\s+(\d{8})'), ('vat', 'DE 232 446 240')])), ('lines', OrderedDict([('start', 'Contract No. \w+'), ('end', 'Total EUR'), ('first_line', '\s+(?P\d+)\s+(?P\d+)\s+(?P.{,70})\s+(?P\d+,\d+)'), ('line', '^\s+(?P.+)$'), ('types', OrderedDict([('qty', 'float'), ('price', 'float')]))])), ('keywords', ['QualityHosting']), ('options', OrderedDict([('currency', 'EUR'), ('decimal_separator', ',')])), ('template_name', 'QualityHosting_test.yml')])] {'issuer': 'Amazon Web Services', 'amount': 4.11, 'amount_untaxed': 4.11, 'date': datetime.datetime(2014, 8, 3, 0, 0), 'invoice_number': '42183017', 'partner_name': 'Amazon Web Services, Inc.', 'partner_website': 'aws.amazon.com', 'currency': 'USD', 'lines': [{'description': 'AWS Data Transfer', 'price_unit': '0.01'}, {'description': 'Amazon Elastic Compute Cloud', 'price_unit': '1.87'}, {'description': 'Amazon Glacier', 'price_unit': '2.22'}, {'description': 'Amazon Simple Storage Service', 'price_unit': '0.01'}], 'desc': 'Invoice from Amazon Web Services'}

regexp for field amount didn't match
regexp for field date didn't match
Unable to match all required fields. The required fields are: ['date', 'amount', 'invoice_number', 'issuer']. Output contains the following fields: ['currency', 'invoice_number', 'issuer'].
None

Process finished with exit code 0
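The two failing fields can be checked outside invoice2data by running the template's patterns against the extracted text with plain `re`. The sketch below is only an illustration: `sample_text` is a made-up stand-in for the real statement text (which is not shown here), the `candidate` patterns are guesses to adjust against the actual extraction, and `first_match` is a hypothetical helper, not part of invoice2data. It assumes the failing template is Howard_Bank.yml, since that is the bank template in the dump that defines `invoice_number`.

```python
import re

# Hypothetical statement text; the real extracted text may differ.
sample_text = """\
HOWARD RELATIONSHIP CHECKING   XXXXXXXX1234
Statement Ending 05/31/2021
Ending Balance               $1,234.56
"""

# Patterns as they appear in the Howard_Bank.yml dump above.
original = {
    "amount": r"Ending Balance \$(\d+,\d+.\d{2})",
    "date": r"Statement Ending \s+([a-zA-Z]+ \d+ , \d+)",
}

# Looser candidates (assumptions, to be tuned against the real text).
candidate = {
    "amount": r"Ending Balance\s+\$([\d,]+\.\d{2})",
    "date": r"Statement Ending\s+(\d{1,2}/\d{1,2}/\d{4})",
}


def first_match(pattern, text):
    """Return the first captured group, or None if the pattern does not match."""
    m = re.search(pattern, text)
    return m.group(1) if m else None


# Compare original and candidate patterns field by field.
results = {
    name: (first_match(original[name], sample_text),
           first_match(candidate[name], sample_text))
    for name in original
}

for name, (old, new) in results.items():
    print(f"{name}: original={old!r} candidate={new!r}")
```

On this made-up text the original `amount` pattern fails because it allows exactly one space before `$`, and the original `date` pattern fails because it expects a month name while the template's `date_formats` entry (`%m/%d/%Y`) implies a numeric date. Whether the same causes apply to the real statement depends on its actual extracted text.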

pritisatish, Jun 28 '21 11:06