Is there support for multiple regexes in the lines plugin?
I have been using the library for some time to parse my company's invoices. My invoices have line items that can appear in either of two formats. I could create a separate template file for each format, but is there support for multiple regexes in the lines plugin, so that the parser just picks the one that matches?
For many fields, you can add a list of regexes in the template and it will try all of them. I'm not sure if the lines plugin is implemented in a similar way. If not, just add it and open a pull request.
Right now the lines parser supports only one set of rules, like:
fields:
  lines:
    parser: lines
    start: Item\s+Discount\s+Price$
    end: \s+Total
    line: (?P<description>.+)\s+(?P<discount>\d+.\d+)\s+(?P<price>\d+\d+)
Whenever I deal with a company that randomly adds and removes a column, I make that column optional with a ?, e.g.
line: (?P<description>.+)\s+(?P<discount>\d+.\d+)?\s+(?P<price>\d+\d+)
(In the above example, the discount is optional.)
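To check an optional column outside the library, something like the following sketch with Python's `re` module can help. It is not the template pattern verbatim: as assumptions of mine, it escapes the dot, makes the description non-greedy, anchors the end of the line, and moves the whitespace after the discount inside the optional group so that the description cannot absorb the discount. `parse_line` and the sample lines are hypothetical.

```python
import re

# Adapted variant of the template pattern (assumption, not the library's own regex):
# the discount and the whitespace after it form one optional group.
LINE_RE = re.compile(
    r"^(?P<description>\S.*?)\s+"
    r"(?:(?P<discount>\d+\.\d+)\s+)?"
    r"(?P<price>\d+\.\d+)$"
)


def parse_line(line):
    """Return the named groups for one line item, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


# Hypothetical sample lines: one with a discount column, one without.
with_discount = parse_line("Widget      5.00      20.00")
without_discount = parse_line("Gadget      15.50")
print(with_discount)
print(without_discount)
```

When the discount column is missing, the `discount` group is simply `None`, so downstream code has to handle that case.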
As I understand it, you're dealing with a company that uses two or more completely different layouts for its lines-covered section?
So are you looking for support for something like:
fields:
  lines:
    parser: lines
    rules:
      - start: Item\s+Discount\s+Price$
        end: \s+Total
        line: (?P<description>.+)\s+(?P<discount>\d+.\d+)\s+(?P<price>\d+\d+)
      - start: Item\s+Price$
        end: \s+Total
        line: (?P<description>.+)\s+(?P<price>\d+\d+)
Is that correct?
Implemented in #463
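For anyone reading later, the idea of trying several rule sets in order can be sketched in plain Python. This is only an illustration of the fallback logic, not the code from the pull request and not the library's API. `RULES`, `parse_with_rule`, `parse_lines` and the sample invoices are hypothetical, and the `line` patterns are tightened variants of the ones above (escaped dot, decimal price, non-greedy description).

```python
import re

# Hypothetical rule sets mirroring the two layouts discussed above.
RULES = [
    {
        "start": r"Item\s+Discount\s+Price$",
        "end": r"\s+Total",
        "line": r"(?P<description>.+?)\s+(?P<discount>\d+\.\d+)\s+(?P<price>\d+\.\d+)$",
    },
    {
        "start": r"Item\s+Price$",
        "end": r"\s+Total",
        "line": r"(?P<description>.+?)\s+(?P<price>\d+\.\d+)$",
    },
]


def parse_with_rule(text, rule):
    """Return line items found between the rule's start and end markers."""
    start = re.search(rule["start"], text, re.MULTILINE)
    if start is None:
        return []
    rest = text[start.end():]
    end = re.search(rule["end"], rest, re.MULTILINE)
    if end is None:
        return []
    items = []
    for raw in rest[:end.start()].splitlines():
        match = re.match(rule["line"], raw.strip())
        if match:
            items.append(match.groupdict())
    return items


def parse_lines(text, rules):
    """Try each rule set in order and return the first non-empty result."""
    for rule in rules:
        items = parse_with_rule(text, rule)
        if items:
            return items
    return []


# Invented sample invoices, one per layout.
INVOICE_A = "Item    Discount    Price\nWidget    5.00    20.00\nGadget    0.00    15.50\n    Total    35.50\n"
INVOICE_B = "Item    Price\nWidget    20.00\nGadget    15.50\n    Total    35.50\n"

items_a = parse_lines(INVOICE_A, RULES)
items_b = parse_lines(INVOICE_B, RULES)
```

Returning the first rule set that yields any items keeps the behaviour predictable, but the order of the rules then matters: more specific layouts should come first.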