Nominatim

Wrong results for "независимости 100" (housenumber not found with common street names)

Open Komzpa opened this issue 4 years ago • 5 comments

"независимости 100" https://www.openstreetmap.org/search?query=%D0%BD%D0%B5%D0%B7%D0%B0%D0%B2%D0%B8%D1%81%D0%B8%D0%BC%D0%BE%D1%81%D1%82%D0%B8%20100

Expected: https://www.openstreetmap.org/way/25427789 (building on проспект Независимости with housenumber 100)
Actual: https://www.openstreetmap.org/way/228413707 (street named Независимости, not building)

"проспект независимости 100" https://www.openstreetmap.org/search?query=%D0%BF%D1%80%D0%BE%D1%81%D0%BF%D0%B5%D0%BA%D1%82%20%D0%BD%D0%B5%D0%B7%D0%B0%D0%B2%D0%B8%D1%81%D0%B8%D0%BC%D0%BE%D1%81%D1%82%D0%B8%20100

Expected: https://www.openstreetmap.org/way/25427789 (building on проспект Независимости with housenumber 100)
Actual: https://www.openstreetmap.org/way/33911444 (pieces of street "проспект Независимости")

Komzpa avatar May 09 '20 09:05 Komzpa

Thanks for the specific example. We'll check.

'street + housenumber' without a city name (Минск, Minsk) or country name (Беларусь, Belarus) is always hard on a global level. Adding one of them anywhere in the query would help.

mtmail avatar May 09 '20 10:05 mtmail

It gets worse.

"минск независимости 100" https://www.openstreetmap.org/search?query=%D0%BC%D0%B8%D0%BD%D1%81%D0%BA%20%D0%BD%D0%B5%D0%B7%D0%B0%D0%B2%D0%B8%D1%81%D0%B8%D0%BC%D0%BE%D1%81%D1%82%D0%B8%20100

Expected: https://www.openstreetmap.org/way/25427789 (building on проспект Независимости with housenumber 100)
Actual: https://www.openstreetmap.org/node/281570459 (hotel Минск on проспект Независимости)

"беларусь независимости 100" works as expected, but nobody would ever write the query this way.

Komzpa avatar May 09 '20 13:05 Komzpa

You have to give the whole street name to get to the exact match. 'проспект Независимости 100 Минск' finds the address. You've run into two combined issues here.

First, there is the problem that countries have different ways of abbreviating streets when referring to addresses. Russian seems to like to leave out street-type words like 'проспект'. This is something you could not do, for example, in German. So we can't just add a catch-all to Nominatim; it would eventually make results a lot worse. We need to know accepted abbreviations. You can always add them in OSM directly using name:short_name, but I see how that does not scale to adding one for every street in Russia. That means that we need country-specific generic abbreviation rules. There are open issues about this: #94 for partial searches and #679 about the peculiarities of abbreviating street names.
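As a rough illustration of what "country-specific generic abbreviation rules" could look like, here is a minimal sketch in Python. It is not how Nominatim implements anything: the rule table `OMITTABLE_STREET_TYPES`, its contents, and the function `expand_query` are all hypothetical names made up for this example.

```python
# Hypothetical rule table: street-type words that users commonly omit,
# keyed by country code. The entries are illustrative assumptions only.
OMITTABLE_STREET_TYPES = {
    "by": ["проспект", "улица"],
    "de": [],  # in German the street type cannot simply be dropped
}


def expand_query(query, country_code):
    """Return the query plus variants with an omitted street type re-added."""
    words = OMITTABLE_STREET_TYPES.get(country_code, [])
    tokens = query.lower().split()
    # If the user already typed a street type, do not add another one.
    if any(word in tokens for word in words):
        return [query]
    return [query] + [f"{word} {query}" for word in words]


for candidate in expand_query("независимости 100", "by"):
    print(candidate)
```

Keeping the rules per country means a language where the street type is mandatory simply gets an empty list and sees no extra candidates.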

The second problem is that Nominatim can't find the house number 100 without adding Minsk to the query because there are simply too many streets Независимости and due to the internal optimisation, we can't go through all of them to find the one that has a house number 100 attached. I thought we had a bunch of bug reports about that already but I can't find one right now, so leaving this issue open as a reminder of that. I'll adapt the title slightly to that end.

lonvia avatar May 10 '20 09:05 lonvia

> The second problem is that Nominatim can't find the house number 100 without adding Minsk to the query because there are simply too many streets Независимости

Maybe add an optional input parameter to the search engine that gives the "context" of the query. In the case of the OSM main page, that would be the current viewport. It's obvious that a user who is looking at city X in Mapnik and types Y in the search box is looking for object Y in city X. Search for that first, then fall back to the parent country, then the whole planet.
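The fallback order described here (viewport, then country, then planet) could be sketched roughly as follows. This is only an illustration of the idea, not an existing Nominatim feature: `search_with_context`, the scope labels, and the toy backend `toy_search` with its in-memory index are all hypothetical.

```python
def search_with_context(query, scopes, search):
    """Try each scope from narrowest to widest; stop at the first hit.

    `search` is any callable taking (query, scope) and returning a list.
    """
    for scope in scopes:
        results = search(query, scope)
        if results:
            return scope, results
    return None, []


# Toy stand-in for a real search backend (made-up data for illustration).
TOY_INDEX = {
    ("независимости 100", "country"): ["way/25427789"],
}


def toy_search(query, scope):
    return TOY_INDEX.get((query, scope), [])


scope, results = search_with_context(
    "независимости 100", ["viewport", "country", "planet"], toy_search
)
print(scope, results)
```

Returning the scope along with the results lets the caller tell the user how far the search had to widen before it found something.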

shrddr avatar Jun 07 '20 09:06 shrddr

It doesn't work for "независимости 100" even if we restrict the results with a bbox: https://nominatim.openstreetmap.org/search?q=%D0%BD%D0%B5%D0%B7%D0%B0%D0%B2%D0%B8%D1%81%D0%B8%D0%BC%D0%BE%D1%81%D1%82%D0%B8+100&limit=50&format=geojson&viewbox=27.33330%2C53.98395%2C28.10165%2C53.78159&bounded=1
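For anyone who wants to reproduce the bounded query, a small sketch that builds the same kind of URL with the Python standard library. The parameter names (`q`, `limit`, `format`, `viewbox`, `bounded`) are taken from the URL above; the helper name `build_search_url` is made up for this example.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

SEARCH_ENDPOINT = "https://nominatim.openstreetmap.org/search"


def build_search_url(query, viewbox, limit=50):
    """Build a search URL restricted to a viewbox.

    `viewbox` is four coordinates in the same order as in the URL above.
    """
    params = {
        "q": query,
        "limit": limit,
        "format": "geojson",
        "viewbox": ",".join(f"{value:.5f}" for value in viewbox),
        "bounded": 1,  # only return results inside the viewbox
    }
    return SEARCH_ENDPOINT + "?" + urlencode(params)


url = build_search_url(
    "независимости 100", (27.33330, 53.98395, 28.10165, 53.78159)
)
print(url)

# Decode the query string again to check what is actually being sent.
print(parse_qs(urlsplit(url).query))
```

The script only constructs and decodes the URL; it does not send a request, so it can be run without touching the public server.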

kshmidman avatar Jun 09 '20 14:06 kshmidman

Should be fixed with the improved ranking algorithm in the new Python frontend.

lonvia avatar Sep 21 '23 08:09 lonvia