Question about dual address with PO Box

Open altmank opened this issue 6 years ago • 1 comments

Hello,

I'm trying to understand the parsing behavior in a scenario with a specific address: "741 N Main St Po Box 246, Cedarville, CA 96104".

This is an address string that contains <Line1> <Line2>, <City>, <State> <Zip> and <Line2> is a PO Box.

Here is the output: [image: address]

I've highlighted the parts of the output that seem incorrect.

I entered the freeform address on https://smartystreets.com/ and it calls it a "Dual Address".

Is this output correct? If not, what is the correct output you'd expect based on your knowledge of the USPS specifications? Is having a dual address a valid scenario for this parser to process?

altmank avatar Sep 06 '18 19:09 altmank

Hi there, thanks for posting this example.

  • Is this output correct? As an American familiar with addresses in the United States, I'd say no. If I were to manually break this address into fields, this is not what I would do.

  • What is the correct output you'd expect... That's a tough one. I'd probably put PO BOX as the Secondary Unit and 246 as the Secondary Number.

  • ...based on your knowledge of the USPS specifications? I have no such knowledge.

  • Is having a dual address a valid scenario for this parser to process? It is a bit unusual, but people put all kinds of crazy things into addresses, and the post office can generally handle that because in the end they will have a human being figure it out.

This library makes a best effort to parse text that looks like addresses. (See the "What is this for" section of the readme.) It may be possible to support this fairly easily. If you look at RangedUnits:

https://github.com/jamesrcounts/usaddress/blob/70e271801c0bfe9468add46fd08c74de4b0cddbe/src/AddressParser/AddressParser.cs#L257

You can see that Box is already included in the list, which is why Box is showing up as your SecondaryUnit. If you left Box in place and added "PO\sBox" then your address might parse correctly.
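To illustrate the idea outside the library, here is a minimal Python sketch of what adding "PO\sBox" to a unit alternation does. The unit lists, the `find_secondary` helper, and the surrounding pattern are hypothetical and much simpler than the real RangedUnits pattern in AddressParser.cs, so treat this as a demonstration of the regex behavior only, not the parser's actual code.

```python
import re

# Hypothetical, simplified stand-ins for a ranged-unit alternation.
# These are NOT the library's real patterns.
RANGED_UNITS_BEFORE = r"(?:Box|Suite|Apt)"
RANGED_UNITS_AFTER = r"(?:PO\sBox|Box|Suite|Apt)"


def find_secondary(address, units):
    """Return (unit, number) for the first '<unit> <digits>' match, or None."""
    pattern = re.compile(
        rf"\b(?P<unit>{units})\s+(?P<number>\d+)\b",
        re.IGNORECASE,
    )
    match = pattern.search(address)
    if match is None:
        return None
    return match.group("unit"), match.group("number")


address = "741 N Main St Po Box 246, Cedarville, CA 96104"

# Without the extra alternative, only "Box" is captured as the unit.
print(find_secondary(address, RANGED_UNITS_BEFORE))  # ('Box', '246')

# With "PO\sBox" added, the whole "Po Box" is captured as the unit.
print(find_secondary(address, RANGED_UNITS_AFTER))   # ('Po Box', '246')
```

Keeping plain "Box" in the list means inputs such as "Box 246" still match; the new alternative only changes the result when "PO" directly precedes "Box".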

Want to add a unit test with your example, try the update and submit a PR?

jamesrcounts avatar Sep 07 '18 03:09 jamesrcounts