libpostal
Road not detected for abbreviated street types (Russian addresses)
Hi!
I was checking out libpostal, and saw something that could be improved.
My country is
RU
Here's how I'm using libpostal
REST API
Here's what I did - case 1
query=б-р Победы, д. 10
Here's what I got - case 1
[
{
"label": "house",
"value": "б-р"
},
{
"label": "road",
"value": "победы"
},
{
"label": "house_number",
"value": "д. 10"
}
]
Here's what I was expecting - case 1
[
{
"label": "road",
"value": "б-р победы"
},
{
"label": "house_number",
"value": "д. 10"
}
]
Here's what I did - case 2
query=б-р Солнечный 2
Here's what I got - case 2
[
{
"label": "house",
"value": "б-р солнечный"
},
{
"label": "house_number",
"value": "2"
}
]
Here's what I was expecting - case 2
[
{
"label": "road",
"value": "б-р солнечный"
},
{
"label": "house_number",
"value": "2"
}
]
Here's what I did - case 3
query=Савелкинский пр-д, д. 4
Here's what I got - case 3
[
{
"label": "house",
"value": "савелкинский"
},
{
"label": "road",
"value": "пр-д д."
},
{
"label": "house_number",
"value": "4"
}
]
Here's what I was expecting - case 3
[
{
"label": "road",
"value": "савелкинский пр-д"
},
{
"label": "house_number",
"value": "д. 4"
}
]
For parsing issues, please answer "yes" or "no" to all that apply.
- Does the input address exist in OpenStreetMap?
yes, but I don't have the city context
- Do all the toponyms exist in OSM (city, state, region names, etc.)?
yes
- If the address uses a rare/uncommon format, does changing the order of the fields yield the correct result?
yes, expanding the abbreviations:
  * б-р to бульвар
  * пр-д to проезд
helped and gave the correct results in the expected form (but only with the expanded values).
- If the address does not contain city, region, etc., does adding those fields to the input improve the result?
no
- If the address contains apartment/floor/sub-building information or uncommon formatting, does removing that help? Is there any minimum form of the address that gets the right parse?
no, it doesn't contain such info
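As a stopgap until the dictionary is updated, the expansion described in the answers above can be applied to the query before it is sent to the parser. The following is a minimal sketch, not part of libpostal: the `EXPANSIONS` table and the `expand_street_types` helper are hypothetical names, and the table only covers the two abbreviations from this report.

```python
import re

# Hypothetical mapping, limited to the two abbreviations reported in this issue.
EXPANSIONS = {
    "б-р": "бульвар",
    "пр-д": "проезд",
}

# Match an abbreviation only when it is not glued to other word characters.
_PATTERN = re.compile(
    r"(?<!\w)(" + "|".join(re.escape(k) for k in EXPANSIONS) + r")(?!\w)",
    re.IGNORECASE,
)


def expand_street_types(query: str) -> str:
    """Replace abbreviated street types with their full forms."""
    return _PATTERN.sub(lambda m: EXPANSIONS[m.group(1).lower()], query)


for q in ("б-р Победы, д. 10", "б-р Солнечный 2", "Савелкинский пр-д, д. 4"):
    print(expand_street_types(q))
```

The expanded string is then used as the `query` value in place of the original one.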
Here's what I think could be improved
- add б-р as an alias/synonym to бульвар
- add пр-д as an alias/synonym to проезд
Could probably be solved by changing the lines here: https://github.com/openvenues/libpostal/blob/master/resources/dictionaries/ru/street_types.txt#L3-L4 to
бульвар|бул|б-р
bulvar|bul|b-r
Same for "пр-д".
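To illustrate what the proposed line would express, here is a small sketch that reads dictionary text in the pipe-separated form shown above and maps each phrase to the first entry on its line. It assumes the first entry is treated as the canonical form; `load_phrases` is a hypothetical helper, and the sample text is the proposed line from this issue, not the current file contents.

```python
def load_phrases(dictionary_text: str) -> dict:
    """Map every phrase on a line to the first (assumed canonical) entry."""
    mapping = {}
    for line in dictionary_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        phrases = [p.strip() for p in line.split("|")]
        for phrase in phrases:
            mapping[phrase] = phrases[0]
    return mapping


# Proposed lines from this issue (sample input, not the real file).
proposed = "бульвар|бул|б-р\nbulvar|bul|b-r\n"
mapping = load_phrases(proposed)
print(mapping["б-р"], mapping["b-r"])
```

With the proposed lines, both the Cyrillic and the transliterated abbreviations resolve to their full forms, which is the behaviour the expanded queries already get.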