invoice2data
Error unhashable type
Description of the problem
I'm having trouble with the invoice2data package: I get an error I can't solve.
When I use this template for an invoice:
issuer: My Template
keywords:
  - www.webok.com
  - 123 4567 89
fields:
  amount: TOTAL\s+.(\d+\.\d+)
  date: Date:\s+(\d{1,2}\/\d{1,2}\/\d{4}\s+\d{1,2}:\d{1,2})
  invoice_number: Reference:\s(\w+)
  operator: Operators:\s(\w+)
options:
  currency: USD
  date_formats:
    - '%d/%m/%Y %G:%i'
  languages:
    - en
  decimal_separator: '.'
lines:
  start: Your Reference:+\s+\w+\n_+
  end: \s+_+\n+\s+TOTAL\s+.(\d+\.\d+)
  line: (?P<description>.+)\s+\((?P<quantity>.+)\)\s+.(?P<price>\d+\.\d+)
I don't get any error. But if I add the following at the end of fields, I get an unhashable type error:
fields:
  ...
  friendly_name:
    parser: static
    value: Amazon
The unhashable type error:
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/2.7/bin/invoice2data", line 11, in <module>
    load_entry_point('invoice2data==0.3.5', 'console_scripts', 'invoice2data')()
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/invoice2data/main.py", line 201, in main
    res = extract_data(f.name, templates=templates, input_module=input_module)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/invoice2data/main.py", line 93, in extract_data
    return t.extract(optimized_str)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/invoice2data/extract/invoice_template.py", line 174, in extract
    res_find = re.findall(v, optimized_str)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/re.py", line 181, in findall
    return _compile(pattern, flags).findall(string)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/re.py", line 237, in _compile
    p, loc = _cache[cachekey]
TypeError: unhashable type: 'OrderedDict'
Can anyone help me with this, please? I believe the error occurs when the software tries to unpack the options, but I don't know how to solve it.
Here's what I get in debug mode:
...
DEBUG:invoice2data.extract.invoice_template:field=vat_lines | regexp=OrderedDict([('parser', 'lines'), ('start', 'PAYMENT TYPE\\s+AMOUNT\\s+_+'), ('end', '\\s_+\\s+PLEASE KEEP THIS RECEIPT SAFE'), ('line', '(?P<type_paiment>\\w+)\\s+.(?P<montant>\\d+\\.\\d+)'), ('types', OrderedDict([('montant', 'float')]))])
Thanks for your help !
Minimal example to reproduce the error:
script.py
import pprint
from invoice2data import extract_data
from invoice2data.extract.loader import read_templates
templates = read_templates('templates/')
result = extract_data("invoice.pdf", templates=templates)
pprint.pprint(result)
and this template (in the templates folder):
templates/fr/fr.error.yml
issuer: My Template
keywords:
  - www.webok.com
  - 123 4567 89
fields:
  amount: TOTAL\s+.(\d+\.\d+)
  date: Date:\s+(\d{1,2}\/\d{1,2}\/\d{4}\s+\d{1,2}:\d{1,2})
  invoice_number: Reference:\s(\w+)
  operator: Operators:\s(\w+)
  vat_lines:
    parser: lines
    start: PAYMENT TYPE\s+AMOUNT\s+_+
    end: \s_+\s+PLEASE KEEP THIS RECEIPT SAFE
    line: (?P<type_paiment>\w+)\s+.(?P<montant>\d+\.\d+)
    types:
      montant: float
options:
  currency: USD
  date_formats:
    - '%d/%m/%Y %G:%i'
  languages:
    - en
  decimal_separator: '.'
lines:
  start: Your Reference:+\s+\w+\n_+
  end: \s+_+\n+\s+TOTAL\s+.(\d+\.\d+)
  line: (?P<description>.+)\s+\((?P<quantity>.+)\)\s+.(?P<price>\d+\.\d+)
invoice.txt
(needs to be converted to .pdf)
__________________________________________
My Template
__________________________________________
Date: 03/12/2020 11:23
Operators: Me
Reference: ABC123
__________________________________________
First product (1) €12.93
Second product (3) €22.93
Third product (1) €12.95
Last product (1) €12.93
_________
TOTAL €61.74
VAT/CODE NET VAT
_____________________________
20% S €93.27 €18.66
PAYMENT TYPE AMOUNT
_____________________________
CASH €61.74
CARD €0.00
CHANGE GIVEN €3.07
__________________________________________
PLEASE KEEP THIS RECEIPT SAFE
FOR GUARANTEE PURPOSES
__________________________________________
Thanks for shopping with us!
VAT Number : 123 4567 89
www.webok.com
Debug output
And finally, the full invoice2data debug output:
DEBUG:invoice2data.main:START pdftotext result ===========================
DEBUG:invoice2data.main:__________________________________________
My Template
__________________________________________
Date: 03/12/2020 11:23
Operators: Me
Reference: ABC123
__________________________________________
First product (1) €12.93
Second product (3) €22.93
Third product (1) €12.95
Last product (1) €12.93
_________
TOTAL €61.74
VAT/CODE NET VAT
_____________________________
20% S €93.27 €18.66
PAYMENT TYPE AMOUNT
_____________________________
CASH €61.74
CARD €0.00
CHANGE GIVEN €3.07
__________________________________________
PLEASE KEEP THIS RECEIPT SAFE
FOR GUARANTEE PURPOSES
__________________________________________
Thanks for shopping with us!
VAT Number : 123 4567 89
www.webok.com
DEBUG:invoice2data.main:END pdftotext result =============================
DEBUG:invoice2data.main:Testing 254 template files
DEBUG:invoice2data.extract.invoice_template:Matched template fr.error.yml
DEBUG:invoice2data.extract.invoice_template:START optimized_str ========================
DEBUG:invoice2data.extract.invoice_template:__________________________________________
My Template
__________________________________________
Date: 03/12/2020 11:23
Operators: Me
Reference: ABC123
__________________________________________
First product (1) €12.93
Second product (3) €22.93
Third product (1) €12.95
Last product (1) €12.93
_________
TOTAL €61.74
VAT/CODE NET VAT
_____________________________
20% S €93.27 €18.66
PAYMENT TYPE AMOUNT
_____________________________
CASH €61.74
CARD €0.00
CHANGE GIVEN €3.07
__________________________________________
PLEASE KEEP THIS RECEIPT SAFE
FOR GUARANTEE PURPOSES
__________________________________________
Thanks for shopping with us!
VAT Number : 123 4567 89
www.webok.com
DEBUG:invoice2data.extract.invoice_template:END optimized_str ==========================
DEBUG:invoice2data.extract.invoice_template:Date parsing: languages=['en'] date_formats=['%d/%m/%Y %G:%i']
DEBUG:invoice2data.extract.invoice_template:Float parsing: decimal separator=.
DEBUG:invoice2data.extract.invoice_template:keywords=['www.webok.com', '123 4567 89']
DEBUG:invoice2data.extract.invoice_template:{'date_formats': ['%d/%m/%Y %G:%i'], 'lowercase': False, 'decimal_separator': '.', 'currency': 'USD', 'replace': [], 'languages': ['en'], 'remove_whitespace': False, 'remove_accents': False}
DEBUG:invoice2data.extract.invoice_template:field=amount | regexp=TOTAL\s+.(\d+\.\d+)
DEBUG:invoice2data.extract.invoice_template:res_find=[u'61.74']
DEBUG:invoice2data.extract.invoice_template:field=date | regexp=Date:\s+(\d{1,2}\/\d{1,2}\/\d{4}\s+\d{1,2}:\d{1,2})
DEBUG:invoice2data.extract.invoice_template:res_find=[u'03/12/2020 11:23']
DEBUG:invoice2data.extract.invoice_template:result of date parsing=2020-03-12 11:23:00
DEBUG:invoice2data.extract.invoice_template:field=invoice_number | regexp=Reference:\s(\w+)
DEBUG:invoice2data.extract.invoice_template:res_find=[u'ABC123']
DEBUG:invoice2data.extract.invoice_template:field=operator | regexp=Operators:\s(\w+)
DEBUG:invoice2data.extract.invoice_template:res_find=[u'Me']
DEBUG:invoice2data.extract.invoice_template:field=vat_lines | regexp=OrderedDict([('parser', 'lines'), ('start', 'PAYMENT TYPE\\s+AMOUNT\\s+_+'), ('end', '\\s_+\\s+PLEASE KEEP THIS RECEIPT SAFE'), ('line', '(?P<type_paiment>\\w+)\\s+.(?P<montant>\\d+\\.\\d+)'), ('types', OrderedDict([('montant', 'float')]))])
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/2.7/bin/invoice2data", line 11, in <module>
    load_entry_point('invoice2data==0.3.5', 'console_scripts', 'invoice2data')()
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/invoice2data/main.py", line 201, in main
    res = extract_data(f.name, templates=templates, input_module=input_module)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/invoice2data/main.py", line 93, in extract_data
    return t.extract(optimized_str)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/invoice2data/extract/invoice_template.py", line 174, in extract
    res_find = re.findall(v, optimized_str)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/re.py", line 181, in findall
    return _compile(pattern, flags).findall(string)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/re.py", line 237, in _compile
    p, loc = _cache[cachekey]
TypeError: unhashable type: 'OrderedDict'
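The traceback ends inside re.findall, and the last debug line shows that the value passed as the regexp for vat_lines is an OrderedDict rather than a string. The sketch below is independent of invoice2data and only illustrates that class of failure: it hands re.findall a mapping instead of a pattern string. The helper name try_findall is made up for this illustration, and the exact wording of the error message may differ between Python versions.

```python
import re
from collections import OrderedDict

# A field written as a YAML mapping is loaded as a mapping object,
# like the OrderedDict shown in the debug line above.
field_spec = OrderedDict([("parser", "static"), ("value", "Amazon")])


def try_findall(pattern, text):
    """Hypothetical helper: return the matches, or the TypeError that was raised."""
    try:
        return re.findall(pattern, text)
    except TypeError as exc:
        return exc


# A plain string pattern works as usual.
string_result = try_findall(r"Reference:\s(\w+)", "Reference: ABC123")

# A mapping cannot be used as a regex pattern, so re raises a TypeError.
mapping_result = try_findall(field_spec, "Reference: ABC123")

print(string_result)
print(type(mapping_result).__name__, mapping_result)
```

So the failure appears to come from the field definition itself being treated as a regex, not from the options block.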
I am struggling with the same problem. Every time I try to write new rules in this form:
fields:
  dok_type:
    parser: static
    value: Amazon
I get the same error message. The same happens when I rewrite existing rules, like
ordernr: 'Your order number\s+(\d{9})'
in the format
ordernr:
  parser: regex
  regex: 'Your order number\s+(\d{9})'
  type: int
Each time the message "TypeError: unhashable type: 'OrderedDict'" appears.
From TUTORIAL.md: "Each field can be defined as:
- an associative array with parser, specifying parsing method ..."
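To make the quoted rule concrete, here is a rough sketch of how a field definition could be dispatched depending on whether it is a plain string or an associative array with a parser key. This is not invoice2data's actual code: the function extract_field and its handling of type: int are assumptions made for illustration, and only the static and regex parsers mentioned in this thread are covered.

```python
import re


def extract_field(spec, text):
    """Hypothetical dispatcher; not the library's implementation."""
    if isinstance(spec, str):
        # Short form: the field value is the regex itself.
        return re.findall(spec, text)

    parser = spec.get("parser")
    if parser == "static":
        # Static fields ignore the text and return a fixed value.
        return spec["value"]
    if parser == "regex":
        matches = re.findall(spec["regex"], text)
        if spec.get("type") == "int":
            return [int(m) for m in matches]
        return matches
    raise ValueError("unsupported parser: %r" % (parser,))


text = "Your order number 123456789"
short_form = extract_field(r"Your order number\s+(\d{9})", text)
long_form = extract_field(
    {"parser": "regex", "regex": r"Your order number\s+(\d{9})", "type": "int"},
    text,
)
static_form = extract_field({"parser": "static", "value": "Amazon"}, text)

print(short_form)   # ['123456789']
print(long_form)    # [123456789]
print(static_form)  # Amazon
```

A version that only knows the short form would pass the mapping straight to re.findall, which matches the error reported above.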
Solution
Therefore, the following syntax works for me:
fields:
  total: {
    parser: regex,
    regex: 'Total.*\$(\d+\.?\d+)',
    type: float
  }
This doesn't work for me; I have the same problem.
@RossK1 @m3nu could you help us?
@gitaddgitpush I followed the steps to reproduce the error but couldn't. Can you give me more information about your working environment, such as OS and Python version?
Here is the output I got:
{'amount': 61.74,
 'currency': 'USD',
 'date': datetime.datetime(2020, 3, 12, 11, 23),
 'desc': 'Invoice from My Template',
 'friendly_name': 'Amazon',
 'invoice_number': 'ABC123',
 'issuer': 'My Template',
 'operator': 'Me',
 'vat_lines': [{'montant': 61.74, 'type_paiment': 'CASH'},
               {'montant': 0.0, 'type_paiment': 'CARD'},
               {'montant': 3.07, 'type_paiment': 'GIVEN'}]}
You need release 0.3.6 or newer for the fields: syntax support in YAML. Can you re-test with 0.3.6, please?
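The traceback in the first post shows invoice2data==0.3.5, which is older than the release named here. A quick way to check which version is installed is sketched below. It assumes Python 3.8 or newer (for importlib.metadata); the helper names parse_version and supports_mapping_fields are made up for this example, and the 0.3.6 threshold is taken from the comment above.

```python
import re
from importlib import metadata

MIN_VERSION = (0, 3, 6)  # release named in the comment above


def parse_version(text):
    """Turn the leading dotted-integer part of a version into a tuple, e.g. '0.3.5' -> (0, 3, 5)."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError("unrecognised version string: %r" % (text,))
    return tuple(int(part) for part in match.group(0).split("."))


def supports_mapping_fields(version_text):
    """Hypothetical check: True when the version is at least MIN_VERSION."""
    return parse_version(version_text) >= MIN_VERSION


try:
    installed = metadata.version("invoice2data")
except metadata.PackageNotFoundError:
    installed = None

if installed is None:
    print("invoice2data is not installed in this environment")
else:
    print(installed, supports_mapping_fields(installed))
```

If the check reports an older version, upgrading with pip (for example `pip install --upgrade invoice2data`) and re-running the reproduction script is the next step.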
I used the template and invoice content provided by @gitaddgitpush in the first comment. It was parsed without any error:
[
  {
    "issuer": "My Template",
    "amount": 61.74,
    "date": "2020-03-12",
    "invoice_number": "ABC123",
    "operator": "Me",
    "vat_lines": [
      {
        "type_paiment": "CASH",
        "montant": 61.74
      },
      {
        "type_paiment": "CARD",
        "montant": 0.0
      },
      {
        "type_paiment": "GIVEN",
        "montant": 3.07
      }
    ],
    "currency": "USD",
    "lines": [],
    "desc": "Invoice from My Template"
  }
]