mimirsbrunn
Handle OSM Addresses
BANO and OpenAddresses are pretty cool, but we need to get addresses from OSM too.
Here are a few things to keep in mind :thought_balloon:
Theory
An address is composed of 2 parts:
- name of the street
- housenumber in the street
Other elements may appear such as postal code, name of the city, ...
3 models coexist for the housenumber:
- as an isolated node
- as a building
- as a facade (entrance, gate...)
In any case, an OSM object with an addr:housenumber tag will be used.
The name of the street can be found:
- in addr:street on the same object
- as a member of a relation with type associatedStreet that has the name of the street in its name tag
- or both (in that case, the relation is better)
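The lookup order above could be sketched as follows. This is only an illustration: resolve_street_name and its two tag dictionaries are hypothetical names, and it assumes the caller has already found the associatedStreet relation (if any) that the object belongs to.

```python
from typing import Dict, Optional


def resolve_street_name(
    address_tags: Dict[str, str],
    relation_tags: Optional[Dict[str, str]] = None,
) -> Optional[str]:
    """Return the street name for an address object, or None if unknown.

    address_tags: tags of the OSM object carrying addr:housenumber.
    relation_tags: tags of the associatedStreet relation the object is a
    member of, if any (hypothetical input, looked up by the caller).
    """
    # When both sources exist, the relation is preferred.
    if relation_tags and relation_tags.get("type") == "associatedStreet":
        name = relation_tags.get("name")
        if name:
            return name
    # Otherwise fall back to addr:street on the object itself.
    return address_tags.get("addr:street")
```

With both sources present, the relation name wins; with neither, the function returns None and the address cannot be attached to a street.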
Among other interesting tags:
- addr:housename: used on residences and apartment blocks, when the same address points to several distinct identifiable buildings
- addr:suburb: used to distinguish between duplicate addresses within a city
For more details, see https://wiki.openstreetmap.org/wiki/Key:addr
Postal codes
In most cases the postal codes are not zones, but sets of points.
The right place to find the postal code is thus the address (or the POIs), with the tag addr:postcode.
If no postal code is provided on the address, we can look for the city's postal code.
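A minimal sketch of that fallback, assuming the city's postal codes are already known from elsewhere. The function name and the city_postcodes parameter are hypothetical; a list is returned because a city may have several postal codes.

```python
from typing import Dict, Iterable, List


def resolve_postcodes(
    address_tags: Dict[str, str],
    city_postcodes: Iterable[str],
) -> List[str]:
    """Return the postal code(s) to attach to an address."""
    # The address's own tag is the most precise source.
    postcode = address_tags.get("addr:postcode")
    if postcode:
        return [postcode]
    # Fall back to whatever is known for the enclosing city (may be empty).
    return list(city_postcodes)
```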
Cities
The city is not often provided in the address, so it's better to use a geographic lookup.
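As an illustration of what a geographic lookup means here, the sketch below tests an address point against city boundaries with a plain ray-casting point-in-polygon check. It is a simplification under stated assumptions: each city is a single closed ring of (lon, lat) pairs, holes and multipolygons are ignored, and find_city with its cities mapping is a hypothetical helper. A real importer would more likely rely on a geometry library and a spatial index.

```python
from typing import Dict, List, Optional, Tuple

Ring = List[Tuple[float, float]]  # (lon, lat) vertices of a boundary ring


def point_in_polygon(lon: float, lat: float, ring: Ring) -> bool:
    """Ray casting: count how many edges a ray going east from the point crosses."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def find_city(lon: float, lat: float, cities: Dict[str, Ring]) -> Optional[str]:
    """Return the name of the first city whose boundary contains the point."""
    for name, ring in cities.items():
        if point_in_polygon(lon, lat, ring):
            return name
    return None
```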
Real life
Special cases
Addr:interpolation
This is an alternative model for addresses.
Addresses are represented by a way, parallel to the street. A few nodes on this way have housenumbers. The way contains interpolation parameters: odd/even numbers or both, use of letters, etc. E.g.: https://www.openstreetmap.org/way/42230144#map=17/50.05627/9.65791
That may be ignored in our first version: we could import only the nodes, as isolated points. (Does Mimir implement interpolation?)
Keep in mind though, that this model is often used in Germany :de: , where OSM is the main data source. And in Sherbrooke :canada: too :stuck_out_tongue:
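For reference, here is a rough sketch of how the housenumbers implied by an interpolation way could be expanded between two numbered end nodes. It only covers the odd, even and all values of addr:interpolation, ignores letters, and produces numbers only, not positions along the way. The function name is hypothetical, and the endpoints are assumed to already have the right parity.

```python
from typing import List


def interpolate_housenumbers(start: int, end: int, mode: str) -> List[int]:
    """Expand the housenumbers between two numbered nodes of an interpolation way."""
    if mode == "all":
        step = 1
    elif mode in ("odd", "even"):
        # Assumes both endpoints already match the requested parity.
        step = 2
    else:
        raise ValueError(f"unsupported addr:interpolation value: {mode!r}")
    # The way may be drawn in either direction, so normalise the order.
    low, high = sorted((start, end))
    return list(range(low, high + 1, step))
```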
Others
Czechia :czech_republic: has pretty good addresses but uses its own mapping system, with a bunch of new tags. Russia :ru: can have multiple addresses for the same building, so they also have additional tags. In Japan :jp: , the street is only the space left between the blocks, so addresses are relative to block numbers and not to streets. Oh, and did I mention multilingual addresses :switzerland: ? So, dealing with the whole world is not going to be a piece of cake :cactus:
Duplicates
We will encounter duplicates :dancers: For instance:
- the same address as an isolated node and in a POI
- many POIs with the same address
- the same address mapped twice: as an isolated node and as a building (this is an error though)
So we will need to find a clever process to handle that. For instance, prefer node addresses to way ones, isolated nodes to POIs, etc.
We will also have duplicates with OpenAddresses. See issue 200.
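The preference order suggested above (nodes before ways, isolated nodes before POIs) could be sketched like this. The candidate dictionaries and their keys are hypothetical, and the duplicate key is deliberately naive: it compares street and housenumber only, whereas a real process would also need the city or postal code.

```python
from typing import Dict, List, Tuple


def preference(candidate: dict) -> Tuple[bool, bool]:
    """Lower sorts first: nodes before ways, then isolated nodes before POIs."""
    return (candidate["osm_type"] != "node", candidate["is_poi"])


def deduplicate(candidates: List[dict]) -> List[dict]:
    """Keep one candidate per (street, housenumber), chosen by preference()."""
    best: Dict[Tuple[str, str], dict] = {}
    for candidate in candidates:
        # Naive key (assumption): case-insensitive street and housenumber only.
        key = (candidate["street"].casefold(), candidate["housenumber"].casefold())
        if key not in best or preference(candidate) < preference(best[key]):
            best[key] = candidate
    return list(best.values())
```

Doing this at import time discards the losing candidates, which is a design choice: it keeps the index small but loses the ability to filter by source later.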
What the others do
Pelias
Pelias reads the objects with addr:housenumber and addr:street tags at the same time as it reads the POIs.
Then, before giving a category to a POI, Pelias duplicates the objects that are both a POI and an address.
The tags addr:housename, addr:housenumber, addr:street and addr:postcode are read. The administrative hierarchy is found by geographic lookup.
They do not handle the associatedStreet relation.
They have a tool to handle duplicates at import time but it seems deprecated. They remove the duplicates at query time. They do this in part because they want to be able to filter by data source and not lose data. They still have many duplicates, and an issue has been opened about it.
OSMNames
Here is the OSMNames housenumber process: