Nominatim
Query containing ICAO code KAEG returning incorrect results before correct result
Query url:
https://nominatim.openstreetmap.org/search.php?q=kaeg&polygon_geojson=1&viewbox=
Incorrect results listed before correct result:
https://nominatim.openstreetmap.org/details.php?place_id=198797199
https://nominatim.openstreetmap.org/details.php?place_id=4255614
Correct result:
https://nominatim.openstreetmap.org/details.php?place_id=129491536
It seems like it's treating kaeg as a misspelling and returning a "close" match before the actual match.
The geocoder normalizes ae to a and then searches the place database for kag. The same normalization is applied to place names, so it's a race between KAEG (the airport), Kaeg (a village in the Netherlands) and a river in Myanmar. (The river's Khmer (Cambodian) name កិ is normalized to kag.)
Folding ae to a is useful in some European languages (German ä is written ae, for example), but I see how it hurts for Khmer and for abbreviations like this one. I'll label it a transliteration bug: https://github.com/openstreetmap/Nominatim/labels/Transliteration
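To make the collision concrete, here is a toy normalizer. It is only an illustration: the `normalize` function is hypothetical and applies just two rules (lowercasing and folding ae to a), which is far less than what Nominatim's real normalization does.

```shell
# Toy normalizer (hypothetical, not Nominatim's actual code):
# lowercase the input, then fold "ae" to "a".
normalize() {
    printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | sed 's/ae/a/g'
}

normalize KAEG   # prints: kag
normalize Kaeg   # prints: kag
```

Both inputs end up as the same search term, so nothing in the normalized form tells the ICAO code apart from a place name.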
I don't know your use case or whether you search in a browser or use the API. You could add a viewbox parameter to prefer local (USA) results (see http://nominatim.org/release-docs/latest/api/Search/), or add &bounded=1 as well to make it a strict filter. Setting the language in the URL (&accept-language=en) doesn't seem to make a difference. Nominatim doesn't have parameters to filter by type of result.
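A rough sketch of what such a request could look like from the command line. The `build_search_url` helper is hypothetical, and the bounding box is only an approximate rectangle around the continental USA that I picked for illustration; adjust it to your area. The query is inserted without URL encoding, which is fine for a plain word like kaeg but not for arbitrary input.

```shell
# Hypothetical helper: build a Nominatim search URL limited to a bounding box.
# $1 = query (assumed to need no URL encoding)
# $2 = viewbox as "<lon1>,<lat1>,<lon2>,<lat2>" (two opposite corners)
build_search_url() {
    printf 'https://nominatim.openstreetmap.org/search.php?q=%s&format=jsonv2&viewbox=%s&bounded=1\n' "$1" "$2"
}

# Approximate continental-USA box (an assumption, not an official boundary).
url=$(build_search_url kaeg '-125.0,49.5,-66.0,24.0')
echo "$url"

# To actually run the query (needs network access and jq):
# curl -s "$url" | jq -c '.[] | [.category, .type, .osm_type, .osm_id, .display_name]'
```

Dropping `&bounded=1` from the URL turns the box back into a preference rather than a hard filter.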
For reference, here are the results with osm_type and osm_id (place_id is not permanent and would change on a database import or between servers):
curl -s 'https://nominatim.openstreetmap.org/search.php?q=kaeg&format=jsonv2&accept-language=en' | jq -c '.[] | [.category, .type, .osm_type, .osm_id, .display_name]'
["waterway","river","relation",231955,"Irrawaddy River, Magway District, Magway, 10261, Myanmar"]
["place","village","node",459390549,"Kaag, South Holland, Netherlands, 2159, Netherlands"]
["aeroway","aerodrome","way",225728941,"Double Eagle II Airport, Aerospace Parkway Road Northwest, Albuquerque, Bernalillo County, New Mexico, 87125, USA"]
["aeroway","aerodrome","node",1042056384,"Gangneung Airport, 월호평길, Dusan-dong, Gangneung-si, Gangwon-do, 25548, South Korea"]
["railway","station","node",3074998903,"Kodaganuru, SH 47, Neralagi, Davanagere taluku, Davanagere district, Karnataka, India"]
Thanks for the explanation; I was wondering whether it might have been a language-related issue. I just used the browser search for submitting the issue here. An acquaintance of mine is using Nominatim as a geolocation provider for a weather applet. I'll take a look at the way he's constructing API requests. Thanks again!
This has fixed itself with the new ICU tokenizer.