parser
natural language classification engine for geocoding
Today we are merging https://github.com/pelias/api/pull/1565 which brings a bunch of `pelias/parser` changes into `pelias/api`. As part of this process we did some wider acceptance test checks and diff'd them against...
A nice complex testcase from Berlin: > Onion Space, ExRotaprint, Gottschedstraße 4, Aufgang 4, 1. OG rechts, 13357 Berlin
The prefix 'Lange' (= long) should be treated just like 'Korte' (= short), but is currently also classified as an Area/Locality: - Lange Koestraat, Utrecht - Lange Hezelstraat, Nijmegen -...
The regex works properly. However, further upstream, the space in the middle breaks the postal code into two sections. This also happens for NL postcodes. Examples: - SW11 6NU - BA14 7LY...
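The splitting problem can be illustrated with a minimal sketch. It is not the `pelias/parser` implementation: the simplified `UK_POSTCODE` pattern and the `find_postcodes` helper are hypothetical names chosen for illustration, and the whitespace tokenizer is an assumption. It shows that neither half of a spaced postcode matches on its own, while a check over adjacent token pairs recovers it.

```python
import re

# Simplified, assumed UK postcode shape (uppercase only): outward code,
# optional single space, inward code. Not an exhaustive validator.
UK_POSTCODE = re.compile(r"[A-Z]{1,2}[0-9][A-Z0-9]? ?[0-9][A-Z]{2}")


def find_postcodes(text: str) -> list[str]:
    """Return postcodes found by testing adjacent token pairs, then single tokens."""
    tokens = text.split()  # whitespace tokenization splits "SW11 6NU" in two
    found = []
    i = 0
    while i < len(tokens):
        pair = " ".join(tokens[i:i + 2])
        if i + 1 < len(tokens) and UK_POSTCODE.fullmatch(pair):
            found.append(pair)  # the two halves only match when rejoined
            i += 2
        elif UK_POSTCODE.fullmatch(tokens[i]):
            found.append(tokens[i])  # postcode written without a space
            i += 1
        else:
            i += 1
    return found
```

Calling `find_postcodes("London SW11 6NU")` under these assumptions yields the rejoined postcode, whereas testing `"SW11"` or `"6NU"` alone against the pattern fails.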
Typically, postcodes in the Netherlands have 4 digits, followed by 0 or 1 space, followed by 2 alphanumeric characters, e.g. "7512EC" or "7512 EC". The alphanumeric characters should not be...
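The format described above can be written as a small pattern. This is a sketch under stated assumptions, not code from `pelias/parser`: `NL_POSTCODE` and `looks_like_nl_postcode` are hypothetical names, and the final two characters are accepted as letters or digits because the description says "alphanumeric".

```python
import re

# 4 digits, then 0 or 1 space, then 2 alphanumeric characters.
# Assumption: ASCII letters and digits are accepted for the last two characters.
NL_POSTCODE = re.compile(r"[0-9]{4} ?[A-Za-z0-9]{2}")


def looks_like_nl_postcode(text: str) -> bool:
    """Return True if the whole (trimmed) string has the described shape."""
    return NL_POSTCODE.fullmatch(text.strip()) is not None
```

Both "7512EC" and "7512 EC" pass this check, while a string with two spaces between the parts does not.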
I attempted to update the WOF resources today but lost enthusiasm halfway through due to a bunch of different changes popping up. This PR makes subsequent attempts at updating...
### Use-cases I want to select a specific country format to increase solution quality. ### Attempted Solutions For example, some countries put the house number before the name, others after the name. ### Proposal use...
This PR changes how ampersands are handled, with **a preference for venues over intersections** in some cases. The work is motivated by the parser having poor support for things...
There are some streets near https://www.openstreetmap.org/way/17414828#map=16/41.6311/-85.4226 that have fairly difficult names to parse such as: - South 00 Ew - West South Street (we actually parse this one already) -...
This library has the concepts of `word`, `phrase` and `section`. I'm not sure whether the `word` concept is required, as it can be represented as a single-token `phrase`. In...