tilemaker
Relation reading performance
I'm trying with the full planet right now, including #292 to reduce memory usage a bit, and node/way read performance is generally encouraging. (144GB RAM, 8GB swap, store on SSD.)
However, relation reading is glacially slow, probably unusably so. Nodes/ways were read overnight, but it looks like relations will take several days.
Relation performance is never going to be fast because we have to assemble, and correct, complex multipolygon geometries. Even so, we should be able to improve on this.
`htop` shows most CPUs running at around 1%. I'm not sure how much of this is IO-bound: the status column is mostly S (interruptible sleep) rather than D (uninterruptible sleep), which would suggest we're not IO-bound, but IO_RATE shows we're shipping 100M/s of data. The progress display often 'slips' back to an earlier block, which might indicate that it's getting stuck on particularly complex geometries.
A few thoughts:
- https://github.com/systemed/tilemaker/commit/432f39492253d975a909ce5f80c600cce95a716f might help - this uses a `std::deque` when adding member ways together rather than a `std::vector`, because often they need to be added to the front of the current linestring
- Worth resurrecting #258?
- In verbose mode, we could log relations that take more than _n_ ms to parse, which might help pin down issues
I think it is the same thing you see while reading the ways: there isn't much RAM available any more after loading the ways. Loading the relations needs the ways, but they are being swapped in and out.
Trying with #292 applied, generating only -60° to 75°, and with all correct/make_valid calls commented out, it completes reading the .pbf overnight and begins generating tiles. Next step is to investigate whether this is because of reduced memory pressure or because of commenting out the calls.
With #292, #294 and #296 applied (for -60° to 75°) then reading the planet completes overnight even with all correct/make_valid calls.
At 144GB?
Yep; 8GB swap, `--compact`, and `--store` on SSD. Currently (slowly) generating the tiles.
But how much difference does restricting generation to -60° to +75° make? Are there really that many tiles outside this region at z14, due to the projection?
Yes - as per #293, cutting out the polar regions means we save 13GB RAM on the OutputObjectRefs where the projection gets really stretched. Effectively I think these three PRs give us enough headroom that we can complete reading in 144GB without swapping too much.
No, I was referring to how the map looks. Is it obvious that parts are missing, or does it look fine to the user?
I don't know yet - it's still generating the tiles (it was only the .pbf reading that finished overnight). I'll let you know when it's done :)
But does #292 really lower memory consumption? Since the ways are already stored in the mmap file, I don't see why it would reduce memory consumption. It's possible, because there is less fragmentation in the mmap file, but you would have to measure to know for sure.
Also, I noticed some memory gets deallocated from the mmap file. This indicates some temporary memory is allocated and deallocated, which also causes fragmentation. Possibly this can be optimized by removing temporaries inside the mmap file, using std::move. There is no real need to deallocate anything stored in the mmap file.
See #300, it removes the temporary being allocated in the mmap.
I do need to do some measurements for #292 - different size .pbfs, and both with and without `--store`. Without `--store` it should provide a useful improvement, but it's possible that on the smallest .pbfs we should disable it (or use a Set rather than a vector) with `--store` - well, let's see!