Support for numeric streets with and without ordinals

Open colinreilly opened this issue 10 years ago • 6 comments

Currently Nominatim only returns a result (checked in NYC) when a numeric street name includes an ordinal (e.g. 59th St). It would be great if Nominatim could support numeric streets both with and without an ordinal (e.g. 59 Street). From experience in NYC, many users enter numeric streets without an ordinal.

colinreilly avatar Jul 16 '15 21:07 colinreilly

We'd need this to work for any language, not only English. It needs some investigation into which languages actually make use of those ordinal suffixes.

lonvia avatar Aug 08 '15 11:08 lonvia

I was poking around in the Carmen repo and found these JSON configs which may be useful: https://github.com/mapbox/geocoder-abbreviations/tree/master/tokens

The relevant entries are the ones marked as "type": "ordinal".

missinglink avatar Nov 05 '20 22:11 missinglink

Libpostal also has cardinal and ordinal in https://github.com/openvenues/libpostal/tree/master/resources/numex

mtmail avatar Nov 05 '20 22:11 mtmail

We've been discussing this for Pelias for a very long time. One of the problems we're having with Elasticsearch is that tokenising something like 152 -> ['one', 'hundred', 'fifty', 'two'] can cause false positives when searching for 'one', 'two hundred', 'fifty' or 'two'. Ordinals are hard 😿
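
To make the false-positive problem concrete, here is a minimal sketch. It assumes a deliberately simplified bag-of-tokens matcher; `number_to_words` and `naive_match` are hypothetical helpers written for illustration and are not how Pelias or Elasticsearch are implemented.

```python
ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def number_to_words(n):
    """Spell out 0 <= n < 1000 as a list of English word tokens."""
    if not 0 <= n < 1000:
        raise ValueError("only 0..999 supported in this sketch")
    if n == 0:
        return ["zero"]
    tokens = []
    hundreds, rest = divmod(n, 100)
    if hundreds:
        tokens += [ONES[hundreds], "hundred"]
    if rest >= 20:
        tokens.append(TENS[rest // 10])
        if rest % 10:
            tokens.append(ONES[rest % 10])
    elif rest:
        tokens.append(ONES[rest])
    return tokens


def naive_match(query_tokens, indexed_tokens):
    """Assumed matcher: a hit if every query token occurs in the indexed name."""
    return set(query_tokens) <= set(indexed_tokens)


# Index "152 street" as word tokens, then search for unrelated numbers.
indexed = number_to_words(152) + ["street"]
for wrong in (1, 2, 50, 200):
    query = number_to_words(wrong) + ["street"]
    print(wrong, naive_match(query, indexed))
```

Under this simplified model, searches for 1, 2, 50 and 200 all match the street indexed as 152, because each of their word tokens also appears in the expansion of 152.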

missinglink avatar Nov 05 '20 22:11 missinglink

Indeed, compiling the lists for ordinals is not too complicated. We can enlist the help of the community if necessary. The difficult part is in applying them correctly during data import and search. Nominatim's hard-coded replace list cannot handle the job. We need something more flexible and preferably something where we don't need to reimport the entire database when changing/adding to the list of ordinals. That's why it is on the 5.0.0 planning board only.
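
One way to picture "more flexible, without a reimport" is to expand the query at search time from a rule table kept outside the database. The following is only a sketch under that assumption: `ORDINAL_RULES` and `query_variants` are hypothetical names, the German rule is purely illustrative, and none of this reflects Nominatim's actual tokenizer.

```python
import re


def english_ordinal(n):
    """Return n with its English ordinal suffix, e.g. 59 -> '59th'."""
    if 11 <= n % 100 <= 13:
        suffix = "th"
    else:
        suffix = ["th", "st", "nd", "rd", "th"][min(n % 10, 4)]
    return f"{n}{suffix}"


# Hypothetical rule table: language code -> function building the ordinal.
# Changing it affects only query handling, not the imported data.
ORDINAL_RULES = {
    "en": english_ordinal,
    "de": lambda n: f"{n}.",  # illustrative: German writes ordinals as "5."
}


def query_variants(query, lang):
    """Return the query plus one variant per bare number turned into an ordinal."""
    variants = [query]
    rule = ORDINAL_RULES.get(lang)
    if rule is None:
        return variants
    tokens = query.split()
    for i, tok in enumerate(tokens):
        if re.fullmatch(r"[0-9]+", tok):
            changed = tokens[:i] + [rule(int(tok))] + tokens[i + 1:]
            variants.append(" ".join(changed))
    return variants


print(query_variants("59 street", "en"))
print(query_variants("123 45 st", "en"))
```

The sketch does not try to tell house numbers from street numbers, so "123 45 st" also yields a "123rd 45 st" variant; a real implementation would need to rank or filter the candidates.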

lonvia avatar Nov 06 '20 09:11 lonvia

This is a crude workaround for English installs: using this function from Stack Overflow, I replace the second term of the address with its ordinal form when that term is a number.

def get_ordinal_suffix(n):
    '''
    Convert an integer into its ordinal representation::

        get_ordinal_suffix(0)   => '0th'
        get_ordinal_suffix(3)   => '3rd'
        get_ordinal_suffix(122) => '122nd'
        get_ordinal_suffix(213) => '213th'
    '''
    n = int(n)
    if 11 <= (n % 100) <= 13:
        suffix = 'th'
    else:
        suffix = ['th', 'st', 'nd', 'rd', 'th'][min(n % 10, 4)]
    return str(n) + suffix

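# main code to execute

address = "123+45+st"
addrs = address.split('+')
# Only rewrite when a second term exists and consists solely of decimal digits.
if len(addrs) > 1 and addrs[1].isdecimal():
    addrs[1] = get_ordinal_suffix(addrs[1])
    address = '+'.join(addrs)  # "123+45+st" becomes "123+45th+st"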

littleprincefox avatar Aug 15 '22 17:08 littleprincefox