parser
venue work: changes to how ampersands are parsed
This PR changes how ampersands are handled, with a preference for venues over intersections in some cases.
The work is motivated by the parser having poor support for things like "Bar & Grill" (and venue names containing ampersands in general).
the diff looks like more changes than it really is 😄
the main difference is in parser/AddressParser.js
- change the order in which the "phrase classifiers" are listed, putting "PlaceClassifier" above "IntersectionClassifier".
this allows us to change the behaviour of "IntersectionClassifier" to allow this exception:
- do not classify 'and' sandwiched by two 'PlaceClassification' as an 'IntersectionClassification' (eg. 'Bar & Restaurant')
making this change means that it's possible to see odd classifications such as '& grill' as a street; to resolve this I've added not: 'PunctuationClassification' in many places to differentiate from an AlphaClassification.
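As a rough illustration of the exception described above, here is a minimal standalone sketch. It does not use the real parser classes: the token shape ({ body, classifications }), the JOINERS set and the isIntersectionJoiner helper are all hypothetical names made up for this example, and the actual logic in "IntersectionClassifier" may well differ.

```javascript
// Hypothetical token shape: { body: string, classifications: string[] }.
// This is NOT the pelias/parser API, only a sketch of the rule.
const JOINERS = new Set(['&', 'and']);

// true when the token exists and carries the named classification
function has(token, name) {
  return Boolean(token) && token.classifications.includes(name);
}

// Decide whether the token at index i may be treated as an intersection joiner.
function isIntersectionJoiner(tokens, i) {
  const token = tokens[i];
  if (!token || !JOINERS.has(token.body.toLowerCase())) {
    return false;
  }

  // Exception: a joiner sandwiched by two place tokens (eg. 'Bar & Restaurant')
  // is left alone so the whole phrase can be picked up as a venue.
  if (has(tokens[i - 1], 'PlaceClassification') &&
      has(tokens[i + 1], 'PlaceClassification')) {
    return false;
  }

  return true;
}
```

With tokens for 'Bar', '&', 'Grill' where both outer tokens carry 'PlaceClassification', the helper returns false for the ampersand; with two plain alpha tokens around it, it returns true. This only works if place classification has already run, which is why the classifier order matters.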
still a DRAFT PR for now, needs more testing before opening up for merging.
one of the issues with this method is that there will be 'jitter' for partially complete inputs, eg:
'foo & bar'
(0.80) ➜ [ { venue: 'foo & bar' } ]
(0.70) ➜ [ { street: 'foo' }, { street: 'bar' } ]
'foo & ba'
(0.68) ➜ [ { street: 'foo' }, { street: 'ba' } ]
... although this might not be an issue in pelias/api, depending on how it's converted to an ES query in https://github.com/pelias/api/pull/1487
[edit] https://github.com/pelias/api/pull/1487/commits/50c15db0d971cb46813711295e5973929950a940 shows it's not an issue 🎉