tippecanoe
Determinism of result
I have a large set of data sources coming from hundreds of different GeoJSON files. Some of the resulting mbtiles files combine several sources, but many do not. We store the output in a git repository, which makes local development and deployment easy.
Unfortunately, it is difficult to figure out which files have actually changed (git can't tell), because the output appears not to be deterministic. That generates huge (and unnecessary) changes in the git repository.
Generally, we're using a command like this to generate the result:
tippecanoe -o output.mbtiles -f --detect-shared-borders --base-zoom=6 --maximum-zoom=10 --simplification=8 -n="file description" input.json
Is there anything that can be changed to make this generate the same result, given identical inputs, each time?
Thanks!
For anyone running into the same issue, I've found a workaround: keep the source files under version control and only generate new mbtiles where git says that the sources have changed within the last n commits.
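That workaround could be sketched roughly like this. It is only an illustration under assumptions that are not in the thread: the sources are `.json` files tracked in the same repository, each source maps to an mbtiles file with the same stem, and `HEAD~n` exists. The helper names are hypothetical, and combined mbtiles (several sources in one output) would need their own mapping.

```python
import subprocess
from pathlib import PurePosixPath


def changed_sources(n, repo="."):
    """Return .json paths that git reports as changed in the last n commits."""
    out = subprocess.run(
        # --diff-filter=d leaves out files that were deleted
        ["git", "-C", repo, "diff", "--name-only", "--diff-filter=d",
         f"HEAD~{n}", "HEAD"],
        check=True, capture_output=True, text=True,
    ).stdout
    return [line for line in out.splitlines() if line.endswith(".json")]


def rebuild_command(source):
    """Build the tippecanoe command from the question for one source file.

    Assumption: foo.json is rendered to foo.mbtiles next to it.
    """
    output = str(PurePosixPath(source).with_suffix(".mbtiles"))
    return ["tippecanoe", "-o", output, "-f", "--detect-shared-borders",
            "--base-zoom=6", "--maximum-zoom=10", "--simplification=8",
            "-n", "file description", source]


if __name__ == "__main__":
    # Regenerate only the tilesets whose sources changed in the last 5 commits.
    for src in changed_sources(n=5):
        subprocess.run(rebuild_command(src), check=True)
```

Untouched sources keep their previously committed mbtiles, so git only sees a diff for tilesets that were actually rebuilt.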
What makes the mbtiles nondeterministic from one run to the next is that different threads can complete in different orders, so the tiles can be added to the tiles table in a different sequence. I don't know of a way to guarantee that two sqlite files will be identical even when rows are added in the same order. But if you run tippecanoe-decode on the two versions of the file and compare the output, you should be able to safely revert the new mbtiles file if it decodes exactly the same as the previous one.
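A minimal sketch of that comparison, assuming `tippecanoe-decode` is on the PATH and that the rebuild is written to a separate path before it replaces the committed file. The function names are hypothetical, and the whole decoded output is held in memory, which may be impractical for very large tilesets.

```python
import os
import subprocess


def decode_mbtiles(path):
    """Return the tippecanoe-decode output (GeoJSON text) for an mbtiles file."""
    return subprocess.run(
        ["tippecanoe-decode", path],
        check=True, capture_output=True, text=True,
    ).stdout


def replace_if_changed(old_path, new_path, decode=decode_mbtiles):
    """Keep old_path unless new_path decodes to different text.

    Returns True when old_path was replaced by new_path.
    """
    if os.path.exists(old_path) and decode(old_path) == decode(new_path):
        os.remove(new_path)  # same decoded content: discard the rebuild
        return False
    os.replace(new_path, old_path)  # content differs (or no old file): keep the rebuild
    return True
```

Calling `replace_if_changed("output.mbtiles", "output.new.mbtiles")` after each rebuild leaves the committed file byte-for-byte untouched whenever the decoded tiles match, so git reports no change for it.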