Nominatim
Nominatim copied to clipboard
Better filtering for irrelevant address parts for POIs
#2082 has implemented filtering of address parts, where the POI is outside the area of the address part. This solves the issue for bad addresses when the parent street goes through multiple administrative area in most cases. For performance reasons we do this only, when multiple areas of the same address level are part of the address list of a street. This leaves some corner cases, where there is only one area of a certain address level that is partially touched by the street, see discussion in #2649.
Instead of guessing, which areas need rechecking, it would also be possible to explicitly mark areas that need rechecking in the place_addressline
table. This would catch that case and make the rechecking a bit easier in general.
@lonvia I would be interested in working on this issue. Do you think someone new to the Nominatim project could handle this issue?
It's challenging but doable if you are not afraid to dive head-deep into pl/PgSQL code.
Here are some pointers to get you started:
- The basic idea would be to add a boolean column to place_addressline which is set to true when the area of the parent place does not cover a street completely. A good approximation for being a street object is probably: it is of type LineString.
- place_addressline is filled during indexing of places. The function insert_addressline is the relevant code. The tricky part here is that the parent objects are searched in the
location_area_large_*
tables. These tables contain chopped up geometries, so you can't use them. There is already a hack to load full geometries in the function here. That's the way to go. - The recheck if a POI point is really inside the area is hidden in this insane order statement.
- Tests for the dynamic address computation are in test/bdd starting here.
I recommend to start with writing a BDD test that describes the problem from #2649 and initially fails. Feel free to make a PR with the test only, if you want to have a second opinion if the test tests the right thing.
@lonvia I have written a test case, but I'm having trouble running some of the tests - please see #2697
^ @lonvia See the PR above for my first crack at the test case
Tried this in https://github.com/lonvia/Nominatim/tree/drop-outside-address-parts-backup and it turns out to be far too slow to check for containment. This first needs a clever idea how to reduce the number of geometry checks that have to be made.