
[FEATURE] OSM change file handling

Open msbarry opened this issue 3 years ago • 2 comments

We've been using pyosmium-up-to-date to bring a planet.osm.pbf file up to date with changes to OSM. That process takes about 1 minute per day of updates to download, plus 40 minutes to rewrite planet.osm.pbf, which is almost as long as rendering all tiles for the planet.

In order to produce live planet.mbtiles as quickly as possible, planetiler should at least be able to read OSM change files as input and apply them to the PBF file being read. Even better would be if it could automatically download the necessary change files to bring an input planet up-to-date without external dependencies.
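As a rough illustration of the "download the necessary change files" part: the OSM replication servers lay out change files by sequence number, zero-padded to 9 digits and split into three directory levels. The sketch below only builds the URLs. The function names and the inclusive `first`/`last` range are hypothetical, and working out which sequence numbers a given planet file still needs is left to the caller.

```python
# Hypothetical helpers, not part of planetiler.
DAY_BASE = "https://planet.openstreetmap.org/replication/day"


def replication_url(sequence: int, base: str = DAY_BASE) -> str:
    """Return the .osc.gz URL for one replication sequence number."""
    if sequence < 0 or sequence > 999_999_999:
        raise ValueError("sequence number must fit in 9 digits")
    padded = f"{sequence:09d}"  # e.g. 3456 -> "000003456"
    return f"{base}/{padded[0:3]}/{padded[3:6]}/{padded[6:9]}.osc.gz"


def diff_urls(first: int, last: int, base: str = DAY_BASE) -> list[str]:
    """URLs for every sequence number from first to last, inclusive."""
    return [replication_url(seq, base) for seq in range(first, last + 1)]
```

Since the URLs are independent of each other, they could be fetched in parallel.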

msbarry avatar Mar 06 '22 12:03 msbarry

I did some prototyping here. The rough approach was:

  • create a local sqlite DB with nodes, ways, and relations tables, each with the schema: id integer primary key, version integer, data blob, where data is a binary encoding of the element info, metadata, and tags
  • find all OSM replication diffs required to bring a planet file up to date, and download/parse them in parallel, inserting into the appropriate table using version column to decide whether a diff should replace what's there or not (so we don't need to process change files in order)
  • then, when iterating through the osm.pbf file, also iterate through the entries in each diff table (both ordered by ID) and apply any diff that exists for an element. When you reach the end of the osm.pbf entries, iterate through the remaining entries in the diff table, since they are all new.
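A minimal sketch of the diff tables and the version-based insert described above, using Python's sqlite3 module rather than the prototype's actual code. The function names are hypothetical, and storing a deletion as a NULL `data` blob is an assumption of this sketch. It relies on SQLite's `ON CONFLICT ... DO UPDATE` upsert syntax (SQLite 3.24 or later).

```python
import sqlite3

TABLES = ("nodes", "ways", "relations")

SCHEMA = """
CREATE TABLE IF NOT EXISTS {table} (
    id      INTEGER PRIMARY KEY,
    version INTEGER NOT NULL,
    data    BLOB  -- assumption: NULL marks a deleted element
)
"""

# Insert a row, or overwrite the stored row only when the incoming
# version is newer. This makes the result independent of the order
# in which change files are processed.
UPSERT = """
INSERT INTO {table} (id, version, data) VALUES (?, ?, ?)
ON CONFLICT(id) DO UPDATE SET
    version = excluded.version,
    data    = excluded.data
WHERE excluded.version > {table}.version
"""


def open_diff_db(path=":memory:"):
    """Open (or create) the diff database with one table per element type."""
    conn = sqlite3.connect(path)
    for table in TABLES:
        conn.execute(SCHEMA.format(table=table))
    return conn


def apply_change(conn, table, element_id, version, data):
    """Record one change, keeping whichever version is highest."""
    if table not in TABLES:
        raise ValueError(f"unknown table: {table}")
    conn.execute(UPSERT.format(table=table), (element_id, version, data))
```

For example, applying version 3 of a node and then version 2 leaves version 3 in the table, so diffs parsed out of order still converge on the latest state.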

The sqlite DB takes 1 minute to build and 1.5GB of storage per week of diffs (~2x the size of the compressed change files). It adds 1-2 minutes to each pass through the osm.pbf file, but incurs no memory overhead since it's just cursoring through the sqlite result set while applying diffs.
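The reason there is no memory overhead is that both inputs are sorted by ID, so they can be merged in a single streaming pass. Here is a minimal sketch of that merge, not the prototype's code: `merge_with_diffs` is a hypothetical name, elements are simplified to `(id, payload)` pairs, a `None` payload in the diff stream is assumed to mean deletion, and every diff is assumed to be newer than the planet file.

```python
def merge_with_diffs(base, diffs):
    """Merge two ID-ordered streams, letting diffs override the base.

    base:  iterable of (id, payload), sorted by id (the osm.pbf elements)
    diffs: iterable of (id, version, data), sorted by id (e.g. a sqlite
           cursor over "SELECT id, version, data FROM nodes ORDER BY id")
    Yields (id, payload) in id order; only one diff row is held at a time.
    """
    diffs = iter(diffs)
    pending = next(diffs, None)
    for element_id, payload in base:
        # Diff rows with smaller ids have no base element: they are new.
        while pending is not None and pending[0] < element_id:
            if pending[2] is not None:
                yield pending[0], pending[2]
            pending = next(diffs, None)
        if pending is not None and pending[0] == element_id:
            # The diff replaces the base element, or deletes it if None.
            if pending[2] is not None:
                yield element_id, pending[2]
            pending = next(diffs, None)
        else:
            yield element_id, payload
    # Whatever is left in the diff stream comes after the last base id.
    while pending is not None:
        if pending[2] is not None:
            yield pending[0], pending[2]
        pending = next(diffs, None)
```

Because a sqlite cursor is itself an iterator, it can be passed directly as `diffs` and rows are fetched only as the merge advances.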

It got a bit messy though, since multiple threads read through the diffs without communicating, so it would be a good chunk of work to rewrite. I think the priority is pretty low given there's an existing workaround.

msbarry avatar Aug 02 '22 11:08 msbarry

Another use case here would be applying Daylight Map Distribution sidecars: https://daylightmap.org/

msbarry avatar Feb 21 '23 21:02 msbarry