pmtiles root/leaf index size issue ?
I've compared a full planet pmtiles file generated by pmtiles and one generated directly by tilemaker.
The nginx logs show regular multi-megabyte range requests on the tilemaker version, and nothing like that on the pmtiles version.
I suspect the tile index is not hierarchical enough.
You can query both files here if you want to have a look:
- https://panoramax.openstreetmap.fr/pmtiles/tilemaker.pmtiles
- https://panoramax.openstreetmap.fr/pmtiles/planet.pmtiles
I think this is probably because tilemaker doesn't generate clustered pmtiles archives:
Setting the clustered property of a PMTiles archive means that the ordering of the tile data on disk matches the directories
Because tilemaker's tile generation is multi-threaded, and some tiles may take much longer to generate than others (due to complex geometries), we can't guarantee that tiles will be output in any particular order. Therefore we don't get the efficiency gains that a clustered archive would give.
For a clustered archive, you'd need to create an .mbtiles with tilemaker, then use go-pmtiles to turn that into a clustered .pmtiles.
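To check which archive is clustered and how large its directories are, the fixed-size header can be read directly. This is a minimal sketch that assumes the PMTiles v3 header layout (127 bytes, little-endian, eleven 8-byte fields starting at byte 8, clustered flag at byte 96); `parse_header` and `fetch_header` are hypothetical helper names, not part of any library.

```python
import struct
import urllib.request

HEADER_LEN = 127  # assumed fixed header size in the PMTiles v3 layout


def parse_header(buf: bytes) -> dict:
    """Decode the directory-related fields of a PMTiles v3 header."""
    if len(buf) < HEADER_LEN or buf[:7] != b"PMTiles":
        raise ValueError("not a PMTiles archive")
    # Eleven little-endian uint64 fields occupy bytes 8..95.
    (root_off, root_len, meta_off, meta_len, leaf_off, leaf_len,
     data_off, data_len, addressed, entries, contents) = struct.unpack_from("<11Q", buf, 8)
    return {
        "version": buf[7],
        "root_dir_length": root_len,
        "leaf_dirs_length": leaf_len,
        "addressed_tiles": addressed,
        "tile_entries": entries,
        "tile_contents": contents,
        "clustered": buf[96] == 1,
    }


def fetch_header(url: str) -> dict:
    """Fetch only the header bytes with an HTTP range request."""
    req = urllib.request.Request(url, headers={"Range": f"bytes=0-{HEADER_LEN - 1}"})
    with urllib.request.urlopen(req) as resp:
        return parse_header(resp.read())
```

Calling `fetch_header` on each of the two URLs above should show whether the clustered flag differs and how the leaf directory sizes compare.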
There's some discussion of this in the original PMTiles PR, #620, in particular:
Threading means tilemaker's tile output order isn't sequential, so we can't set .clustered. The only consequences here should be that the directories take up more bytes, and you can't use pmtiles extract on an output. For cloud storage there isn't a huge locality advantage in accessing nearby parts of the same file.
Thanks Richard, I'll go that way, mbtiles + pmtiles conversion.
One way to deal with unordered generation is to add an in-between queue. Worker threads fill the queue, and another thread takes whatever is available and in order from the queue and writes it to the final pmtiles file. (easy to say, more work to implement it)
(easy to say, more work to implement it)
😁 Yes, you're right. I'm slightly anxious about a queue getting blocked on a tile with a really horrible multipolygon geometry (Saimaa or the US National Forests, that sort of thing) but there are possibilities for the future.
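The queue idea can be sketched as a reorder buffer. This is an illustration only, not tilemaker's code: it assumes every tile has been assigned a sequence number (`seq`) in the desired output order, and `ordered_writer` is a hypothetical name. The returned peak shows the concern above, since one slow tile forces every later tile to wait in memory.

```python
import heapq
import queue
import time
from concurrent.futures import ThreadPoolExecutor


def ordered_writer(results: queue.Queue, total: int, sink: list) -> int:
    """Drain (seq, payload) pairs arriving in any order; append them to sink in seq order.

    Returns the peak number of tiles held back while waiting for an earlier one.
    """
    pending = []      # min-heap keyed on seq
    next_seq = 0      # next sequence number allowed to be written
    max_pending = 0
    while next_seq < total:
        heapq.heappush(pending, results.get())
        max_pending = max(max_pending, len(pending))
        # Flush every tile that is now contiguous with what was already written.
        while pending and pending[0][0] == next_seq:
            sink.append(heapq.heappop(pending))
            next_seq += 1
    return max_pending


if __name__ == "__main__":
    results = queue.Queue()

    def render(seq: int) -> None:
        # Pretend tile 0 has a very complex geometry and is slow to generate.
        time.sleep(0.05 if seq == 0 else 0)
        results.put((seq, b"tile-%d" % seq))

    out = []
    with ThreadPoolExecutor(max_workers=4) as pool:
        for seq in range(8):
            pool.submit(render, seq)
        peak = ordered_writer(results, 8, out)
    print([seq for seq, _ in out], "peak buffered:", peak)
```

A real implementation would also need a bound on the buffer, for example spilling to disk, which is where most of the extra work lies.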
I created a WIP branch of a new pmtiles cluster CLI command: https://github.com/protomaps/go-pmtiles/pull/207 and printed out some diagnostics:
(Note: for small archives you will need the tilemaker main branch to get the fix from #795)
For canada.pmtiles, a 7 GB archive in total:
go run main.go cluster canada.pmtiles
total directory size 54945551
100% |██████████████████████████████████████████████| (14237172/14237172, 107585 it/s)
2025/01/31 12:17:48 convert.go:350: # of addressed tiles: 14237172
2025/01/31 12:17:48 convert.go:351: # of tile entries (after RLE): 14187464
2025/01/31 12:17:48 convert.go:352: # of tile contents: 14071412
2025/01/31 12:17:51 convert.go:367: Root dir bytes: 9538
2025/01/31 12:17:51 convert.go:368: Leaves dir bytes: 11472102
2025/01/31 12:17:51 convert.go:369: Num leaf dirs: 3464
2025/01/31 12:17:51 convert.go:370: Total dir bytes: 11481640
2025/01/31 12:17:51 convert.go:371: Average leaf dir bytes: 3311
2025/01/31 12:17:51 convert.go:372: Average bytes per addressed tile: 0.81
total directory size 11481640 (20.896396% of original)
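The derived figures in this log can be re-checked from the raw counts. The short calculation below only uses numbers printed above; the variable names are mine.

```python
# Raw values copied from the diagnostics above.
original_dir_bytes = 54945551
root_dir_bytes = 9538
leaf_dir_bytes = 11472102
num_leaf_dirs = 3464
addressed_tiles = 14237172

total_dir_bytes = root_dir_bytes + leaf_dir_bytes        # root plus all leaf directories
avg_leaf_bytes = leaf_dir_bytes // num_leaf_dirs          # integer average, as in the log
bytes_per_tile = total_dir_bytes / addressed_tiles
pct_of_original = 100 * total_dir_bytes / original_dir_bytes

print(f"total dir bytes:     {total_dir_bytes}")
print(f"average leaf bytes:  {avg_leaf_bytes}")
print(f"bytes per tile:      {bytes_per_tile:.2f}")
print(f"percent of original: {pct_of_original:.3f}%")
print(f"savings:             {100 - pct_of_original:.1f}%")
```

The average leaf directory of about 3.3 KB is the size of a typical directory fetch once the archive is clustered, compared with the multi-megabyte ranges seen in the nginx logs.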
So the savings are almost 80% in total for a large country (~55 MB -> 11 MB), which should also mean partial directory downloads are much smaller. You're welcome to experiment with the PR on different areas.
The pmtiles cluster command is released in v1.26: https://github.com/protomaps/go-pmtiles/releases/tag/v1.26.0