invoice2data

Regex matching second field instead of first

Open coko7 opened this issue 4 weeks ago • 0 comments

Hi. First off, I want to say I am loving this project and have had a lot of fun playing with it over the last few days. I have been writing my own templates, and this has gone well for the most part, but I have started running into issues that I can't fully wrap my head around. Here is one.

I have a PDF invoice that contains the following section (after running it through pdftotext):

TVA (20%%)
Total TTC

28,56 €
171,36 €

I have the following template field:

fields:
  amount: Total TTC\s+([\d,]+)\s€

When I run:

invoice2data "my_invoice.pdf" \
    --debug \
    --input-reader pdftotext \
    --template-folder "my_templates" --exclude-built-in-templates \
    --output-format json --output-name "output.json"

I get this:

DEBUG:invoice2data.extract.parsers.regex: field=amount | regex=Total TTC\s+([\d,]+)\s. | matches=['171,36']

Can you explain why the second value (171,36) is matched instead of the first (28,56)? Also, what regex should I write to store the two values in the amount_taxes and amount fields, respectively? Thanks!
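For reference, here is a minimal sketch that runs the same pattern through Python's `re` module directly, outside invoice2data. It assumes the text is exactly the section pasted above; the text invoice2data actually passes to the regex parser may differ (for example, if pdftotext is called with other options), and that is not verified here. The two-group pattern and the variable names are hypothetical, only meant to illustrate what I am trying to capture.

```python
import re

# Section as pasted above. Assumption: this is exactly what the regex sees.
text = "TVA (20%%)\nTotal TTC\n\n28,56 €\n171,36 €\n"

# The template regex, applied with plain re.findall.
single = re.findall(r"Total TTC\s+([\d,]+)\s€", text)
print(single)  # ['28,56']

# Hypothetical pattern with two groups: the first amount after the labels
# (taxes), then the second amount (total).
pair = re.search(r"Total TTC\s+([\d,]+)\s€\s+([\d,]+)\s€", text)
amounts = pair.groups() if pair else None
print(amounts)  # ('28,56', '171,36')
```

On the pasted text, the original pattern returns 28,56 rather than 171,36, so the string seen by invoice2data is probably laid out differently from what I pasted.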

coko7, Dec 16 '25 06:12