Address does not parse correctly
Here's an example of a valid US address which is not parsed correctly by this package:
import json
import usaddress
parsed = usaddress.parse("1509 Via Christina, Vista, CA 92084")
# Map each label to its token (a repeated label would keep only the last token)
components = {label: token for token, label in parsed}
print(json.dumps(components, indent=2))
What happens here is that usaddress misinterprets "Vista" as the StreetNamePostType instead of the PlaceName, so we end up with this:
{
  "AddressNumber": "1509",
  "StreetNamePreType": "Via",
  "StreetName": "Christina,",
  "StreetNamePostType": "Vista,",
  "StateName": "CA",
  "ZipCode": "92084"
}
I would obviously expect it to handle addresses like this correctly.
Hi @soapergem, thanks for raising this issue! It seems even with our recent updates, this is still not getting parsed correctly.
Do you happen to have any more examples you can share of addresses where the PlaceName gets mistaken for a StreetNamePostType? If not, have you noticed a pattern in these kinds of addresses? I imagine it's because "Vista" tends to be part of a street name, but I'm not quite sure.
Unfortunately, I don't have any other examples to provide. If you had asked me last year when I posted the comment, I might have been able to find some, as I was working for an employer that used this package to parse every address in the US. However, I've changed jobs and no longer have access to the database I was using at the time.
Interesting find for sure:
>>> import pprint, usaddress
>>> for name in ['christ', 'christi', 'christin', 'christina']:
...     pprint.pp(usaddress.parse(f'1509 via {name}, vista, CA 92084'))
...
[('1509', 'AddressNumber'),
('via', 'StreetName'),
('christ,', 'StreetName'),
('vista,', 'PlaceName'),
('CA', 'StateName'),
('92084', 'ZipCode')]
[('1509', 'AddressNumber'),
('via', 'StreetName'),
('christi,', 'StreetName'),
('vista,', 'PlaceName'),
('CA', 'StateName'),
('92084', 'ZipCode')]
[('1509', 'AddressNumber'),
('via', 'StreetName'),
('christin,', 'StreetName'),
('vista,', 'PlaceName'),
('CA', 'StateName'),
('92084', 'ZipCode')]
[('1509', 'AddressNumber'),
('via', 'StreetNamePreType'),
('christina,', 'StreetName'),
('vista,', 'StreetNamePostType'),
('CA', 'StateName'),
('92084', 'ZipCode')]
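This shows that the labels flip only once the street name is spelled out in full as "christina": at that point "via" changes from StreetName to StreetNamePreType and "vista" changes from PlaceName to StreetNamePostType. The shorter prefixes ("christ", "christi", "christin") all leave "vista" tagged as PlaceName.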
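Until the underlying model is fixed, one possible stopgap is to post-process the parsed tokens. The sketch below is only an illustration: `relabel_missing_place` is a hypothetical helper, not part of usaddress, and the heuristic is an assumption drawn from the single example in this thread. It relabels a StreetNamePostType token as PlaceName only when no PlaceName was found, the token sits directly before the StateName, and the token before it ends with a comma (suggesting the street portion had already ended). The input is the last parse result above, hard-coded so the snippet runs without usaddress installed.

```python
def relabel_missing_place(tagged):
    """Return a copy of `tagged` with a likely-mislabeled city relabeled.

    Hypothetical heuristic (an assumption, not usaddress behavior):
    if nothing is labeled PlaceName, the token right before the first
    StateName is labeled StreetNamePostType, and the token before that
    ends with a comma, treat the StreetNamePostType token as the PlaceName.
    """
    fixed = list(tagged)
    labels = [label for _, label in fixed]

    # Nothing to do if a city was already found or there is no state.
    if "PlaceName" in labels or "StateName" not in labels:
        return fixed

    state_idx = labels.index("StateName")
    if state_idx < 2 or labels[state_idx - 1] != "StreetNamePostType":
        return fixed

    # Require a comma on the preceding token, so that a plain
    # "123 Main St, CA 90000" (no city at all) is left untouched.
    if not fixed[state_idx - 2][0].endswith(","):
        return fixed

    token, _ = fixed[state_idx - 1]
    fixed[state_idx - 1] = (token, "PlaceName")
    return fixed


# The problematic result from the REPL session above.
parsed = [
    ("1509", "AddressNumber"),
    ("via", "StreetNamePreType"),
    ("christina,", "StreetName"),
    ("vista,", "StreetNamePostType"),
    ("CA", "StateName"),
    ("92084", "ZipCode"),
]

fixed = relabel_missing_place(parsed)
for token, label in fixed:
    print(f"{label}: {token}")
```

This depends on the commas in the input, so it would not help with addresses written without them, and it has only been reasoned through against the one address reported here.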