Nominatim
Support for numeric streets with and without ordinals
Currently Nominatim only returns a result (checked in NYC) when a numeric street name includes an ordinal suffix (59th St). It would be great if Nominatim could support streets both with and without an ordinal (e.g. 59 Street). From experience in NYC, many users enter numeric streets without an ordinal.
We'd need this to work for any language, not only English. It needs some investigation into which languages actually use ordinal suffixes.
I was poking around in the Carmen repo and found these JSON configs which may be useful: https://github.com/mapbox/geocoder-abbreviations/tree/master/tokens
The relevant entries are the ones marked as "type": "ordinal".
Libpostal also has cardinal and ordinal number data in https://github.com/openvenues/libpostal/tree/master/resources/numex
We've been discussing this for Pelias for a very long time. One of the problems we're having with Elasticsearch is that tokenising something like 152 -> ['one', 'hundred', 'fifty', 'two'] can cause false positives when searching for 'one', 'two hundred', 'fifty' or 'two'. Ordinals are hard 😿
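To make the false-positive problem concrete, here is a minimal sketch of a toy inverted index that expands numbers into words. It is not how Pelias or Elasticsearch work internally; `NUMBER_WORDS`, `tokenize`, `build_index` and `search` are hypothetical names made up for this illustration.

```python
from collections import defaultdict

# Hypothetical, deliberately tiny number-to-words table for the demo.
NUMBER_WORDS = {
    152: ["one", "hundred", "fifty", "two"],
    2: ["two"],
}

def tokenize(name):
    """Lowercase, split on whitespace, and expand known numbers into words."""
    tokens = []
    for part in name.lower().split():
        if part.isdecimal() and int(part) in NUMBER_WORDS:
            tokens.extend(NUMBER_WORDS[int(part)])
        else:
            tokens.append(part)
    return tokens

def build_index(names):
    """Map each token to the set of names that contain it."""
    index = defaultdict(set)
    for name in names:
        for token in tokenize(name):
            index[token].add(name)
    return index

def search(index, query):
    """Return names containing every query token (simple AND match)."""
    token_sets = [index.get(t, set()) for t in tokenize(query)]
    return set.intersection(*token_sets) if token_sets else set()

index = build_index(["152 Street", "2 Street"])
# "2 street" becomes ['two', 'street'], and 'two' is also a token of "152 Street"
print(sorted(search(index, "2 street")))  # ['152 Street', '2 Street']
```

The query for "2 street" also returns "152 Street" because the expanded token 'two' no longer remembers which number it came from.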
Indeed, compiling the lists for ordinals is not too complicated. We can enlist the help of the community if necessary. The difficult part is in applying them correctly during data import and search. Nominatim's hard-coded replace list cannot handle the job. We need something more flexible and preferably something where we don't need to reimport the entire database when changing/adding to the list of ordinals. That's why it is on the 5.0.0 planning board only.
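One way to picture "no reimport needed" is to apply the ordinal rules to the query rather than to the indexed data. The following is only a sketch under that assumption, not Nominatim's design; `ORDINAL_RULES`, `english_ordinal` and `query_variants` are hypothetical names, and the two rules shown stand in for a community-maintained list.

```python
def english_ordinal(n):
    """English suffix rule: 11-13 take 'th', otherwise it depends on the last digit."""
    if 11 <= n % 100 <= 13:
        return f"{n}th"
    return f"{n}" + {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")

# Hypothetical per-language rule table; editing it changes search behaviour
# without touching already imported data.
ORDINAL_RULES = {
    "en": english_ordinal,
    "de": lambda n: f"{n}.",  # German writes ordinals with a trailing period
}

def query_variants(query, lang="en"):
    """Return the query plus a variant with bare numbers turned into ordinals."""
    rule = ORDINAL_RULES.get(lang)
    if rule is None:
        return [query]
    words = query.split()
    # Simplification: every bare number is converted, house numbers included.
    converted = [rule(int(w)) if w.isdecimal() else w for w in words]
    variant = " ".join(converted)
    return [query] if variant == query else [query, variant]

print(query_variants("59 street"))  # ['59 street', '59th street']
```

A real implementation would have to tell house numbers apart from street numbers, which this sketch does not attempt.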
This is a crude workaround for English installs: using this function from Stack Overflow, I just replace the second term of the address with its ordinal form when it is a number.
def get_ordinal_suffix(n):
    '''
    Convert an integer into its ordinal representation::
        get_ordinal_suffix(0) => '0th'
        get_ordinal_suffix(3) => '3rd'
        get_ordinal_suffix(122) => '122nd'
        get_ordinal_suffix(213) => '213th'
    '''
    n = int(n)
    # 11, 12 and 13 always take 'th' (also 111th, 212th, ...)
    if 11 <= (n % 100) <= 13:
        suffix = 'th'
    else:
        suffix = ['th', 'st', 'nd', 'rd', 'th'][min(n % 10, 4)]
    return str(n) + suffix

# main code to execute
address = "123+45+st"
addrs = address.split('+')
# only rewrite when a second term exists and is a plain decimal number
if len(addrs) > 1 and addrs[1].isdecimal():
    addrs[1] = get_ordinal_suffix(addrs[1])
address = '+'.join(addrs)  # address is now "123+45th+st"