Ignore apostrophes when searching

Open harry-wg opened this issue 8 months ago • 3 comments

Is your feature request related to a problem? Please describe.

In the UK, many shops that are named after their founders end with 's (e.g. McDonald's, Sainsbury's), but many others end with just an s (e.g. Morrisons). There is no obvious logic to it, and both forms are pronounced exactly the same. Consequently, many people will use the wrong form when searching.

The search engines used in most browsers understand that this is a common problem and return results regardless of whether the search word or the name ends in s or 's. Nominatim performs a similar function to web search engines, so users will generally expect it to behave the same way.

Describe the solution you'd like

  • A search word ending in s (e.g. Peters) should match the corresponding word ending in either s or 's (i.e. Peters or Peter's)
  • A search word ending in 's (e.g. Peter's) should also match the corresponding word ending in either s or 's (i.e. Peters or Peter's)

Describe alternatives you've considered

An alternative, especially for longer names, would be to permit substring matches. So, search sainsb holborn and it will find the relevant Sainsbury's branches. Try that search in DuckDuckGo -- that's exactly what it does. Similarly, donalds holborn in DuckDuckGo returns all the McDonalds branches without the user having to guess whether it's spelled Mc or Mac.
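The substring idea can be sketched in a few lines. This is only an illustration of the matching rule, not how Nominatim or DuckDuckGo work internally; `fold` and `substring_match` are hypothetical helpers and the labels in the usage note are made up. Note that apostrophes still have to be dropped first, otherwise donalds is not a substring of McDonald's.

```python
def fold(text: str) -> str:
    """Lower-case and drop apostrophes so "McDonald's" contains "donalds"."""
    for mark in ("'", "\u2019"):  # straight and typographic apostrophe
        text = text.replace(mark, "")
    return text.casefold()


def substring_match(query: str, label: str) -> bool:
    """True when every query word occurs somewhere inside the label."""
    haystack = fold(label)
    return all(word in haystack for word in fold(query).split())
```

With an illustrative label such as "Sainsbury's, Holborn, London", `substring_match("sainsb holborn", ...)` is true. A plain scan like this does not scale to a full database, which is why real search engines rely on an index instead.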

Additional context

This could probably be implemented easily by stripping out any apostrophes in the search word and the searched names. This would also catch names like DeAth and De'ath. Again, both are pronounced the same and there is no obvious logic why one person capitalises and another uses an apostrophe.
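A minimal sketch of that stripping idea, assuming both the query and the stored name pass through the same folding step. `fold`, `name_matches` and the set of apostrophe characters are assumptions made for illustration, not Nominatim code.

```python
# Straight and typographic apostrophes; the exact set is an assumption.
APOSTROPHES = "'\u2019\u02bc"
_STRIP = str.maketrans("", "", APOSTROPHES)


def fold(text: str) -> str:
    """Delete apostrophes and lower-case the text (hypothetical helper)."""
    return text.translate(_STRIP).casefold()


def name_matches(query: str, name: str) -> bool:
    """True when every folded query word equals some folded word of the name."""
    name_words = set(fold(name).split())
    return all(word in name_words for word in fold(query).split())
```

With this folding, Sainsburys matches Sainsbury's, Peter's matches Peters, and DeAth matches De'ath. The bare stem Sainsbury no longer equals the folded word sainsburys, so a whole-word comparison like this one would miss it.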

harry-wg, Apr 03 '25 00:04

I'm adding a specific example:

Sainsbury's, Brighton

"Sainsbury's, 361-367, Old Shoreham Road, Hangleton, Hove, Brighton and Hove, England, BN3 7GD, United Kingdom"
"Sainsbury's, Lewes Road, Hollingdean, Brighton, Brighton and Hove, England, BN2 3QB, United Kingdom"
"Sainsbury's, 93, Lewes Road, Hollingdean, Brighton, Brighton and Hove, England, BN2 3QA, United Kingdom"
"Sainsbury's, 27, New England Street, Round Hill, Brighton, Brighton and Hove, England, BN1 4GQ, United Kingdom"
"Sainsbury's Petrol, Old Shoreham Road, Hangleton, Hove, Brighton and Hove, England, BN3 7GD, United Kingdom"
"Sainsbury's, 100-104, Church Road, Brunswick, Hove, Brighton and Hove, England, BN3 2AE, United Kingdom"
"Sushi Gourmet, 27, New England Street, Round Hill, Brighton, Brighton and Hove, England, BN1 4GQ, United Kingdom"

Sainsburys, Brighton

no result

I'd see this as duplicate of https://github.com/osm-search/Nominatim/issues/3578

mtmail, Apr 03 '25 11:04

First question to ask: is there a one-size-fits-all solution for apostrophes, or do we need to consider each usage separately? The other prominent use that comes to mind is French (D'Artagnan, L'abbey, etc.). Not sure it can be handled the same way.

Currently the apostrophe is already removed during normalization and replaced with a "token space". That somewhat limits our options.
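The effect of that "token space" can be illustrated with a simplified model. This is a sketch under the assumption that an apostrophe simply becomes a word break; `tokenize_with_break` is a hypothetical helper and not Nominatim's actual tokenizer.

```python
def tokenize_with_break(text: str) -> list[str]:
    """Treat each apostrophe as a token break, then lower-case and split."""
    for mark in ("'", "\u2019"):  # straight and typographic apostrophe
        text = text.replace(mark, " ")
    return text.casefold().split()
```

Under this model "Sainsbury's" becomes the two tokens sainsbury and s, while "Sainsburys" stays the single token sainsburys, so the two spellings share no full token. French "D'Artagnan" becomes d and artagnan.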

Option 1: contract 's to s during normalization at indexing and query time. That means "Sainsbury's" and "Sainsburys" will work but "Sainsbury" will stop working. I'm sure there are cases where that is inconvenient. It surely wouldn't be a good solution for French articles.

Option 2: create variants at indexing time. We'd have both "Sainsbury's" and "Sainsburys" and could find all three variants, at the expense of making our already huge word list even larger. Similar to the suggestions in #3578, just the reverse variant creation process.

Option 3: create variants during search time. That's tricky because to have a general approach for the genitive s, you'd have to create variants for every word that ends in s.
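The three options can be compared with a toy model. This is a rough sketch only: it assumes apostrophes become token breaks as described above, and every function name here is hypothetical rather than part of Nominatim.

```python
def split_tokens(text: str) -> list[str]:
    """Simplified normalization: apostrophes become token breaks."""
    for mark in ("'", "\u2019"):
        text = text.replace(mark, " ")
    return text.casefold().split()


def contract_genitive(text: str) -> list[str]:
    """Option 1: glue a standalone s back onto the preceding word."""
    out: list[str] = []
    for tok in split_tokens(text):
        if tok == "s" and out:
            out[-1] += "s"
        else:
            out.append(tok)
    return out


def index_terms(name: str) -> set[str]:
    """Option 2: index the split tokens plus the contracted spelling."""
    return set(split_tokens(name)) | set(contract_genitive(name))


def query_variants(query: str) -> list[list[str]]:
    """Option 3: at search time, also try splitting every word ending in s."""
    variants: list[list[str]] = [[]]
    for tok in split_tokens(query):
        choices = [[tok]]
        if tok.endswith("s") and len(tok) > 1:
            choices.append([tok[:-1], "s"])
        variants = [v + c for v in variants for c in choices]
    return variants
```

In this model Option 1 maps both "Sainsbury's" and "Sainsburys" to sainsburys but leaves "Sainsbury" as sainsbury, which no longer matches. Option 2 stores three terms for "Sainsbury's" instead of two. Option 3 doubles the number of query variants for every word ending in s, so a query with three such words already yields eight variants.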

lonvia, Apr 22 '25 15:04

Fun Dutch edge case: St. John's Cathedral in 's-Hertogenbosch

mtmail, Apr 22 '25 17:04

As said, it works very fast in OrganicMaps/CoMaps. Can’t their code be copied?

I tried it with the command-line tool fzf, searching for mc donalds. It works, but it is way too slow and needs optimization, and this is only for Paris.

erik55, Jul 15 '25 13:07

If you need a fuzzy search, you're better off using Photon

otbutz, Jul 15 '25 13:07

> If you need a fuzzy search, you're better off using Photon

Thanks for the idea, but even Photon only finds a few mcdonalds in and around Paris, not all the others such as mcdonald’s, mc donald’s, mcdonald's, mc donald's, mcdonalds, mc donalds, or any other variation. OrganicMaps and CoMaps just work. Sad that not even Photon does.

erik55, Jul 15 '25 20:07