WIP: skip street parse when it's also the subject

Open orangejulius opened this issue 4 years ago • 3 comments

In past work (https://github.com/pelias/api/pull/1469 and https://github.com/pelias/api/pull/1468), we've discovered that searching across both the name and address fields can be problematic. Records that are a poor name match but also match the address fields can easily outscore a record with an excellent name match.

This PR explores another general variant of that situation: times when the Pelias Parser returns the same value as both the subject parse and the street parse. By removing the street parse in this case, it looks like there are at least several cases where better POI results are returned.
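As a rough illustration of the idea (not the actual diff in this PR), the check could look something like the sketch below. It assumes the parsed text is a plain object with optional `subject` and `street` string fields; the helper name `dropStreetWhenSameAsSubject` and the case-insensitive, whitespace-trimmed comparison are assumptions made for the example.

```javascript
// Hypothetical helper, for illustration only: not the code from this PR.
// Assumes parsedText is a plain object that may carry `subject` and `street` strings.
function dropStreetWhenSameAsSubject(parsedText) {
  const { subject, street } = parsedText;

  // Nothing to compare unless both fields are present as strings.
  if (typeof subject !== 'string' || typeof street !== 'string') {
    return parsedText;
  }

  // Assumption: compare case-insensitively and ignore surrounding whitespace.
  const normalize = (s) => s.trim().toLowerCase();
  if (normalize(subject) !== normalize(street)) {
    return parsedText;
  }

  // Return a copy without the street field, leaving the input untouched.
  const { street: _removed, ...rest } = parsedText;
  return rest;
}

// Street and subject are the same text, so the street parse is dropped.
const venueLike = dropStreetWhenSameAsSubject({
  subject: 'Wrigley Field',
  street: 'Wrigley Field',
});

// Street differs from the subject, so the parse is returned unchanged.
const addressLike = dropStreetWhenSameAsSubject({
  subject: '10 Main St',
  street: 'Main St',
  housenumber: '10',
});
```

With the street field gone, a query built from `venueLike` would only match on the name, which is the behaviour this PR is experimenting with.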

Here are two examples from the acceptance tests, but more investigation into the effects is needed: [image]

orangejulius, Feb 04 '21 17:02

Sounds like a reasonable change. It's always a tradeoff: this approach will of course eliminate some legitimate street results, such as street: Wrigley Field or street: Union Square in other regions not associated with those popular venues.

Having said that I think it's still a good change.

Could you please add some code comments? I had to read it twice to see what's going on, even with a 10-line diff ;)

missinglink, Feb 04 '21 23:02

Sure, will definitely add comments and tests for this code. Just wanted to float the idea early to see what everyone thought.

orangejulius, Feb 05 '21 00:02

One other thought I had was that it might introduce jitter, although on reflection it's probably not an issue, since the parser likely detects place and street classifications on the same keypress.

You might have to move the code closer to where the solutions are returned in order to detect it: https://github.com/pelias/api/blob/master/sanitizer/_text_pelias_parser.js

I don't have my head fully wrapped around this issue, but it might be more powerful to look at all the solutions returned from the parser in the t.solution Array and make changes there.
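One possible reading of that suggestion, sketched very loosely: walk every solution and strip any street classification whose text is identical to the subject. The solution shape used here (`{ score, pairs: [{ label, body }] }`) and the function name `dropStreetPairsMatchingSubject` are simplified assumptions for illustration; the objects in the parser's solution array may well be structured differently.

```javascript
// Hypothetical, simplified solution shape for illustration only:
//   { score: number, pairs: [{ label: string, body: string }] }
// The real parser's solution objects may be structured differently.
function dropStreetPairsMatchingSubject(solutions, subject) {
  const normalize = (s) => s.trim().toLowerCase();
  const target = normalize(subject);

  return solutions
    .map((solution) => ({
      ...solution,
      // Remove only 'street' pairs whose text equals the subject.
      pairs: solution.pairs.filter(
        (pair) => !(pair.label === 'street' && normalize(pair.body) === target)
      ),
    }))
    // Discard solutions that have nothing left after the street pair is removed.
    .filter((solution) => solution.pairs.length > 0);
}

const solutions = [
  { score: 0.9, pairs: [{ label: 'venue', body: 'Wrigley Field' }] },
  { score: 0.8, pairs: [{ label: 'street', body: 'Wrigley Field' }] },
];

// Only the venue solution survives; the street-only solution is dropped.
const remaining = dropStreetPairsMatchingSubject(solutions, 'Wrigley Field');
```

Working at this level would let the change consider every interpretation the parser produced, rather than only the fields copied into the final parse.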

missinglink, Feb 05 '21 01:02