
Address does not parse correctly

Open soapergem opened this issue 1 year ago • 3 comments

Here's an example of a valid US address which is not parsed correctly by this package:

import json
import usaddress

parsed = usaddress.parse("1509 Via Christina, Vista, CA 92084")
components = {x[1]: x[0] for x in parsed}
print(json.dumps(components, indent=2))

What happens here is that usaddress misinterprets "Vista" as the StreetNamePostType instead of the PlaceName, so we end up with this:

{
  "AddressNumber": "1509",
  "StreetNamePreType": "Via",
  "StreetName": "Christina,",
  "StreetNamePostType": "Vista,",
  "StateName": "CA",
  "ZipCode": "92084"
}

I would obviously expect it to handle addresses like this correctly.
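A side note on the snippet above: the dict comprehension keeps only the last token for each label, so it silently drops tokens whenever the parser emits the same label twice (for example, two consecutive StreetName tokens). Here is a minimal sketch of a safer way to collapse the `(token, label)` pairs. The helper name `components_from_parse` is hypothetical and not part of usaddress, and stripping trailing commas is an assumption about the output you want.

```python
from collections import defaultdict


def components_from_parse(parsed):
    """Collapse (token, label) pairs into a dict, joining repeated labels.

    `parsed` is a list of (token, label) tuples, the shape shown in this
    thread for usaddress.parse(). Hypothetical helper, not a usaddress API.
    """
    grouped = defaultdict(list)
    for token, label in parsed:
        # Assumption: trailing commas are punctuation, not part of the value.
        grouped[label].append(token.rstrip(","))
    # Join tokens that share a label, keeping first-seen label order.
    return {label: " ".join(tokens) for label, tokens in grouped.items()}


if __name__ == "__main__":
    sample = [
        ("1509", "AddressNumber"),
        ("via", "StreetName"),
        ("christ,", "StreetName"),
        ("vista,", "PlaceName"),
        ("CA", "StateName"),
        ("92084", "ZipCode"),
    ]
    print(components_from_parse(sample)["StreetName"])  # via christ
```

This only changes how the parser output is summarized; it does not affect which label each token receives.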

soapergem avatar May 24 '24 20:05 soapergem

Hi @soapergem, thanks for raising this issue! It seems even with our recent updates, this is still not getting parsed correctly.

Do you happen to have any more examples you can share of addresses where the PlaceName is mistaken for a StreetNamePostType? If not, have you noticed a pattern in these kinds of addresses? I imagine it's because "Vista" tends to be part of a street name, but I'm not quite sure.

xmedr avatar Jun 12 '25 13:06 xmedr

Unfortunately, I don't have any other examples to provide. If you had asked me last year when I opened this issue, I might have been able to find some, as I was working for an employer that was using this package to parse every address in the US. However, I've changed jobs and no longer have access to the database I was using at the time.

soapergem avatar Jun 14 '25 02:06 soapergem

Interesting find for sure:

>>> import pprint
>>> import usaddress
>>> for name in ['christ', 'christi', 'christin', 'christina']:
...     pprint.pp(usaddress.parse(f'1509 via {name}, vista, CA 92084'))
...
[('1509', 'AddressNumber'),
 ('via', 'StreetName'),
 ('christ,', 'StreetName'),
 ('vista,', 'PlaceName'),
 ('CA', 'StateName'),
 ('92084', 'ZipCode')]
[('1509', 'AddressNumber'),
 ('via', 'StreetName'),
 ('christi,', 'StreetName'),
 ('vista,', 'PlaceName'),
 ('CA', 'StateName'),
 ('92084', 'ZipCode')]
[('1509', 'AddressNumber'),
 ('via', 'StreetName'),
 ('christin,', 'StreetName'),
 ('vista,', 'PlaceName'),
 ('CA', 'StateName'),
 ('92084', 'ZipCode')]
[('1509', 'AddressNumber'),
 ('via', 'StreetNamePreType'),
 ('christina,', 'StreetName'),
 ('vista,', 'StreetNamePostType'),
 ('CA', 'StateName'),
 ('92084', 'ZipCode')]

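This shows that as soon as the street name becomes "christina", "via" is tagged as a StreetNamePreType and "vista" as a StreetNamePostType. With the shorter spellings, "via" is tagged as a StreetName and "vista" is correctly tagged as the PlaceName.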
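Until the model handles this case, one possible stopgap is to post-process the parser output. The sketch below is only a heuristic and not something usaddress does: it assumes that a token tagged StreetNamePostType that directly follows a token ending in a comma is really the PlaceName, since the comma usually closes the street line. The function name `relabel_post_type_after_comma` is hypothetical, and the rule could mislabel addresses that use commas differently.

```python
def relabel_post_type_after_comma(parsed):
    """Relabel a StreetNamePostType that follows a comma as PlaceName.

    `parsed` is a list of (token, label) tuples in the shape shown above.
    Heuristic workaround only; hypothetical helper, not a usaddress API.
    """
    fixed = []
    previous_token = ""
    for token, label in parsed:
        # A comma on the previous token suggests the street line has ended,
        # so a "post type" here is more likely the start of the city name.
        if label == "StreetNamePostType" and previous_token.endswith(","):
            label = "PlaceName"
        fixed.append((token, label))
        previous_token = token
    return fixed


if __name__ == "__main__":
    # The mislabeled result reported in this thread.
    reported = [
        ("1509", "AddressNumber"),
        ("via", "StreetNamePreType"),
        ("christina,", "StreetName"),
        ("vista,", "StreetNamePostType"),
        ("CA", "StateName"),
        ("92084", "ZipCode"),
    ]
    for pair in relabel_post_type_after_comma(reported):
        print(pair)
```

A post type that follows a street name without a comma (for example "Main St,") is left untouched, because the preceding token does not end in a comma.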

aavilla-riparian avatar Aug 26 '25 21:08 aavilla-riparian