
Demo assumes "U" is abbreviation for unit

Open orangejulius opened this issue 6 years ago • 5 comments

Here's a fun one. It looks like the abbreviation handling is too naive, at least in the demo. Here, Washington, D.C.'s U Street is incorrectly written as "unit street" in the upper left.

image

orangejulius avatar Jul 11 '18 04:07 orangejulius

libpostal seems to expand it this way:

./libpostal "U Street Northwest"
unit street northwest
u street northwest

I'm not sure why both variants are not being saved in the database:

sqlite3 /data/interpolation/street.db 'SELECT * FROM names WHERE id = 21452588'
34489668|21452588|unit street northwest

If we were to save both, I'm not sure how we could know to pick the second one for label generation.

The good news is the conflation matching is working great! :) ... so searching for U st returns the correct result.

The bad news is the result is returning the wrong label :(

For Pelias specifically, this issue can be avoided by using the name returned from the layer=street elasticsearch hit for label generation.

missinglink avatar Jul 11 '18 10:07 missinglink

Looks like we already do this for Pelias:

source_result.name.default = `${interpolation_result.properties.number} ${source_result.name.default}`;

https://github.com/pelias/api/blob/master/middleware/interpolate.js

missinglink avatar Jul 11 '18 10:07 missinglink

Just found another funny and confusing case of this: apparently libpostal always expands "SE" to "european company".

In the Portland metro area, lots of street names start with Southeast, and I was very, very confused about why every street seemed to be "european company"...

orangejulius avatar Mar 07 '20 02:03 orangejulius

This is the same issue as https://github.com/pelias/interpolation/issues/234

In Germany I'm seeing this issue with compound street names such as foostraße: they are being expanded and then can't be found using the original form.

We should revisit this and make sure all versions are indexed and the original form is preserved for display.
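
One possible shape for this, as a minimal sketch only: index every expansion for matching, and carry the untouched original name on each row for label generation. The `expand` function below is a hypothetical stand-in for libpostal (its table contains only the output quoted earlier in this thread), and the `display` field and `buildNameRows` helper are assumptions for illustration, not the existing schema or code.

```javascript
// Hypothetical stand-in for libpostal expansion. The table holds only the
// expansions quoted in this thread; anything else is just lowercased.
const EXPANSIONS = {
  'U Street Northwest': ['unit street northwest', 'u street northwest'],
};

function expand(name) {
  return EXPANSIONS[name] || [name.toLowerCase()];
}

// Build rows for a hypothetical names table: one row per expansion, each
// carrying the original name so labels never depend on the expanded form.
function buildNameRows(streetId, originalName) {
  return expand(originalName).map((variant) => ({
    id: streetId,
    name: variant,          // used for matching only
    display: originalName,  // used for label generation
  }));
}

const rows = buildNameRows(21452588, 'U Street Northwest');
```

With this layout both "unit street northwest" and "u street northwest" remain searchable, and the label comes from `display`, so there is no need to guess which expansion is the right one.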

missinglink avatar Mar 07 '20 07:03 missinglink

IIRC the original assumption was that since the analysis was symmetrical (i.e. libpostal for both indexing and search), it would be OK. It seems that assumption might not be true, or we're not using libpostal at search time?
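
To illustrate why the symmetry matters, here is a rough sketch, not the actual Pelias or interpolation code. The `expand` function is again a hypothetical stand-in for libpostal, and the expansion of "foostraße" to "foo straße" is an assumed output used purely for illustration; `buildIndex` and `search` are likewise made-up helpers.

```javascript
// Hypothetical stand-in for libpostal. The single entry is an ASSUMED
// expansion for a German compound street name, not verified output.
const EXPANSIONS = {
  'foostraße': ['foo straße'],
};

function expand(text) {
  const key = text.toLowerCase();
  return EXPANSIONS[key] || [key];
}

// Index time: only the expanded forms are stored, as in the report above.
function buildIndex(names) {
  const index = new Set();
  for (const name of names) {
    for (const variant of expand(name)) index.add(variant);
  }
  return index;
}

// Search time: analyzeQuery toggles whether the query gets the same expansion.
function search(index, query, { analyzeQuery }) {
  const candidates = analyzeQuery ? expand(query) : [query.toLowerCase()];
  return candidates.some((candidate) => index.has(candidate));
}

const index = buildIndex(['Foostraße']);
const symmetric = search(index, 'foostraße', { analyzeQuery: true });
const asymmetric = search(index, 'foostraße', { analyzeQuery: false });
```

Under these assumptions the symmetric lookup matches, while the asymmetric one misses because the original form was never indexed.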

missinglink avatar Mar 07 '20 07:03 missinglink