Query containing ICAO code KAEG returning incorrect results before correct result

Open m0unds opened this issue 6 years ago • 3 comments

Query url: https://nominatim.openstreetmap.org/search.php?q=kaeg&polygon_geojson=1&viewbox=

Incorrect results listed before correct result: https://nominatim.openstreetmap.org/details.php?place_id=198797199 https://nominatim.openstreetmap.org/details.php?place_id=4255614

Correct result: https://nominatim.openstreetmap.org/details.php?place_id=129491536

It seems like it's treating kaeg as a misspelling and returning a "close" match before the actual match.

m0unds avatar May 15 '19 15:05 m0unds

The geocoder normalizes ae to a and then searches the place database for kag. The same normalization is applied to place names, so it's a race between KAEG (the airport), Kaag (a village in the Netherlands) and a river in Myanmar. (The river's Khmer (Cambodian) name កិ is normalized to kag.)

Normalizing ae to a is useful in some European languages (German ä is written ae, for example), but I see how it gets in the way for Khmer and for abbreviations, as in this case. I'll label it a transliteration bug: https://github.com/openstreetmap/Nominatim/labels/Transliteration
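To make the collision concrete, here is a toy sketch in Python. It is not Nominatim's actual normalization code: the function name `toy_normalize` is made up, and the step that collapses repeated letters is an assumption added only so that Kaag also ends up as kag, as described above.

```python
import re


def toy_normalize(name: str) -> str:
    """Toy stand-in for the normalization described above (NOT Nominatim's code)."""
    s = name.lower()
    s = s.replace("ä", "ae")  # German umlaut spelled out as "ae"
    s = s.replace("ae", "a")  # the "ae" -> "a" step mentioned in this thread
    # Assumption for illustration only: runs of the same letter collapse to one.
    s = re.sub(r"(.)\1+", r"\1", s)
    return s


# Both the ICAO code and the Dutch village collapse to the same search token,
# so the geocoder cannot tell them apart by name alone.
tokens = {name: toy_normalize(name) for name in ("KAEG", "Kaag", "Käg")}
print(tokens)
```

Once all candidates share one token, the order of results depends on ranking rather than on how closely the name matches the query.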

I don't know your use case, or whether you search in a browser or use the API. You could add a viewbox filter to prefer local (USA) results (see http://nominatim.org/release-docs/latest/api/Search/), or add &bounded=1 to make it a strict filter. Setting the language in the URL (&accept-language=en) doesn't seem to make a difference. Nominatim doesn't have parameters to filter by type of result.
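If you go through the API, a request with a viewbox could be built roughly like this. This is only a sketch: the helper name `build_search_url` is made up, and the bounding box for the contiguous USA is an approximate guess, not an official value. The viewbox parameter takes two corner points as longitude,latitude pairs.

```python
from urllib.parse import urlencode

BASE_URL = "https://nominatim.openstreetmap.org/search.php"

# Assumption: a rough box around the contiguous USA (min lon, min lat, max lon, max lat).
US_VIEWBOX = (-125.0, 24.0, -66.0, 50.0)


def build_search_url(query, viewbox=None, bounded=False):
    """Build a search URL; only builds the string, does not send a request."""
    params = {"q": query, "format": "jsonv2"}
    if viewbox is not None:
        # viewbox=<x1>,<y1>,<x2>,<y2> where x is longitude and y is latitude
        params["viewbox"] = ",".join(str(v) for v in viewbox)
        if bounded:
            # bounded=1 turns the viewbox from a preference into a strict filter
            params["bounded"] = "1"
    return BASE_URL + "?" + urlencode(params)


url = build_search_url("kaeg", US_VIEWBOX, bounded=True)
print(url)
```

Without `bounded=True` the viewbox only boosts results inside the box, so results from elsewhere can still appear.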

mtmail avatar May 15 '19 18:05 mtmail

For reference, here are the results with osm_type and osm_id (place_id is not permanent and will change on database import or between servers):

curl -s 'https://nominatim.openstreetmap.org/search.php?q=kaeg&format=jsonv2&accept-language=en' | jq -c '.[] | [.category, .type, .osm_type, .osm_id, .display_name]'
["waterway","river","relation",231955,"Irrawaddy River, Magway District, Magway, 10261, Myanmar"]
["place","village","node",459390549,"Kaag, South Holland, Netherlands, 2159, Netherlands"]
["aeroway","aerodrome","way",225728941,"Double Eagle II Airport, Aerospace Parkway Road Northwest, Albuquerque, Bernalillo County, New Mexico, 87125, USA"]
["aeroway","aerodrome","node",1042056384,"Gangneung Airport, 월호평길, Dusan-dong, Gangneung-si, Gangwon-do, 25548, South Korea"]
["railway","station","node",3074998903,"Kodaganuru, SH 47, Neralagi, Davanagere taluku, Davanagere district, Karnataka, India"]
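Since there is no server-side filter by result type, one workaround is to filter on the client. The sketch below does that in Python on the five results listed above, copied into a literal list so it runs without a network call. The helper name `filter_results` is made up for illustration; the `category` and `type` keys are the ones the jsonv2 output above contains.

```python
# The five results from the jq output above, as jsonv2-style dictionaries.
RESULTS = [
    {"category": "waterway", "type": "river", "osm_type": "relation", "osm_id": 231955,
     "display_name": "Irrawaddy River, Magway District, Magway, 10261, Myanmar"},
    {"category": "place", "type": "village", "osm_type": "node", "osm_id": 459390549,
     "display_name": "Kaag, South Holland, Netherlands, 2159, Netherlands"},
    {"category": "aeroway", "type": "aerodrome", "osm_type": "way", "osm_id": 225728941,
     "display_name": "Double Eagle II Airport, Aerospace Parkway Road Northwest, "
                     "Albuquerque, Bernalillo County, New Mexico, 87125, USA"},
    {"category": "aeroway", "type": "aerodrome", "osm_type": "node", "osm_id": 1042056384,
     "display_name": "Gangneung Airport, 월호평길, Dusan-dong, Gangneung-si, "
                     "Gangwon-do, 25548, South Korea"},
    {"category": "railway", "type": "station", "osm_type": "node", "osm_id": 3074998903,
     "display_name": "Kodaganuru, SH 47, Neralagi, Davanagere taluku, "
                     "Davanagere district, Karnataka, India"},
]


def filter_results(results, category, type_=None):
    """Keep results matching a category and, optionally, a type (client-side only)."""
    return [
        r for r in results
        if r.get("category") == category and (type_ is None or r.get("type") == type_)
    ]


airports = filter_results(RESULTS, "aeroway", "aerodrome")
for r in airports:
    print(r["osm_type"], r["osm_id"], r["display_name"])
```

This still leaves two airports for this query, so a weather applet would probably want to combine it with a viewbox as well.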

mtmail avatar May 15 '19 18:05 mtmail

Thanks for the explanation; I was wondering whether it might have been a language-related issue. I just used the browser search for submitting the issue here. An acquaintance of mine is using Nominatim as a geolocation provider for a weather applet. I'll take a look at the way he's constructing API requests. Thanks again!

m0unds avatar May 15 '19 19:05 m0unds

This has fixed itself with the new ICU tokenizer.

lonvia avatar Nov 16 '22 16:11 lonvia