
Allow multiple sources in osm2pgsql-replication

Open pinaraf opened this issue 3 years ago • 1 comments

Instead of assuming the replication_status table can contain only one source, make it possible to add multiple sources and iterate over the table when running update. When replicating, for instance, Europe from Geofabrik, this is a must-have if you want all French territories: the europe.osm.pbf file does not include French territories like Réunion, Guyane or Guadeloupe, and likewise its replication stream does not include them.

pinaraf avatar Sep 20 '22 15:09 pinaraf

I've tested this successfully, so I'm marking it ready for review, but there is one pending question in the change: should I add an osm2pgsql-replication drop command that would remove one or all of the replication sources?

pinaraf avatar Sep 21 '22 06:09 pinaraf

I'm still undecided what to do with this. I certainly see the demand for importing/updating from multiple extracts, and the changes in this PR are simple enough and look okay. The problem is with the details. Updating multiple extracts only works when the diffs are all ultimately from the same state of the planet. It happens to work with the Geofabrik extracts if the diffs are all from the same day. So there are all kinds of corner cases which have a way of making the code more and more complicated.

So the better approach might be here to have a geofabrik replication script that takes care of proper synchronisation.
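
As a rough illustration of what "proper synchronisation" could mean, the sketch below checks that every replication source is at the same state before any diffs are applied. It is not code from this PR, osm2pgsql or pyosmium: the `SourceState` record, its fields, and `sources_in_sync` are hypothetical names chosen for the example, and the URLs are placeholders.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class SourceState:
    """Hypothetical record of where one replication source currently stands."""
    url: str             # replication base URL (placeholder values below)
    sequence: int        # last applied sequence number
    timestamp: datetime  # timestamp of the planet state that diff was cut from


def sources_in_sync(states, tolerance=timedelta(0)):
    """Return True if all sources refer to the same state, within `tolerance`."""
    if not states:
        return True
    stamps = [s.timestamp for s in states]
    return max(stamps) - min(stamps) <= tolerance


same_day = datetime(2022, 10, 4, 20, 21, tzinfo=timezone.utc)
states = [
    SourceState("https://example.org/europe-updates", 3470, same_day),
    SourceState("https://example.org/reunion-updates", 3465, same_day),
]
print(sources_in_sync(states))  # all timestamps are identical here
```

An update script built this way would refuse to apply diffs, or wait, whenever the check fails, instead of mixing diffs from different states.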

lonvia avatar Oct 04 '22 15:10 lonvia

> So the better approach might be here to have a geofabrik replication script that takes care of proper synchronisation.

I thought about it, but I did not see the issue as long as imports are from distinct areas. What issue do you see with having diffs from different states?

pinaraf avatar Oct 04 '22 18:10 pinaraf

If you have data from adjoining areas there will be some overlap. There can also be overlap from relation members even if the areas aren't that close. And if you get different versions of the same object from the different input files, all sorts of problems can result.
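
To make the overlap problem concrete, here is a minimal sketch that finds objects appearing in more than one input with different versions. It assumes each extract has already been reduced to `(type, id, version)` tuples; that representation, the function name `find_version_conflicts`, and the sample data are all made up for the example and are not part of osm2pgsql or pyosmium.

```python
from collections import defaultdict


def find_version_conflicts(extracts):
    """Find objects that occur in several extracts with differing versions.

    `extracts` maps a source name to an iterable of (type, id, version)
    tuples. This input format is an assumption made for the illustration.
    """
    seen = defaultdict(dict)  # (type, id) -> {source name: version}
    for source, objects in extracts.items():
        for obj_type, obj_id, version in objects:
            seen[(obj_type, obj_id)][source] = version
    # Keep only objects for which the sources disagree on the version.
    return {
        key: versions
        for key, versions in seen.items()
        if len(set(versions.values())) > 1
    }


extracts = {
    "europe": [("w", 1, 3), ("n", 2, 1)],
    "reunion": [("w", 1, 4), ("n", 2, 1)],
}
print(find_version_conflicts(extracts))
# {('w', 1): {'europe': 3, 'reunion': 4}}
```

Way 1 is shared by both inputs, for example as a relation member, but each input carries a different version of it, which is exactly the situation that breaks updates.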

joto avatar Oct 04 '22 19:10 joto

I'm closing this here. I'd rather see support for this on the pyosmium side to make sure it is implemented properly once and also works with the other pyosmium update scripts. I've opened https://github.com/osmcode/pyosmium/issues/214.

lonvia avatar Nov 03 '22 15:11 lonvia