
Relation reading performance

systemed opened this issue 3 years ago • 12 comments

I'm trying with the full planet right now, including #292 to reduce memory usage a bit, and node/way read performance is generally encouraging. (144GB RAM, 8GB swap, store on SSD.)

However, relation reading is glacially slow, probably unusably so. Nodes/ways were read overnight, but it looks like relations will take several days.

Relation performance is never going to be fast because we have to assemble, and correct, complex multipolygon geometries. Even so, we should be able to improve on this.

htop shows most CPUs running at around 1%. I'm not sure how much of this is IO-bound: the status column is mostly S (interruptible sleep) rather than D (uninterruptible), which would suggest we're not IO-bound, but IO_RATE shows we're shipping 100M/s of data. The progress display often 'slips' back to an earlier block, which might indicate that it's getting stuck on particularly complex geometries.

A few thoughts:

  • https://github.com/systemed/tilemaker/commit/432f39492253d975a909ce5f80c600cce95a716f might help - it uses a std::deque rather than a std::vector when joining member ways, because ways often need to be prepended to the front of the linestring being assembled (a rough sketch follows after this comment)
  • Worth resurrecting #258?
  • In verbose mode, we could log relations that take more than n ms to assemble, which might help pin down problem geometries (also sketched below)

systemed · Aug 24 '21 13:08
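A minimal sketch of the deque idea and the verbose-mode timing from the list above. The types here (Point, Way, joinWays, the 50ms threshold) are illustrative stand-ins, not tilemaker's actual classes:

```cpp
#include <chrono>
#include <cstdint>
#include <deque>
#include <iostream>
#include <vector>

// Hypothetical stand-ins for tilemaker's real geometry types.
struct Point { double x, y; };
using Way = std::vector<Point>;

static bool samePoint(const Point &a, const Point &b) {
    return a.x == b.x && a.y == b.y;
}

// Join member ways end-to-end. A std::deque accepts insertion at either
// end in O(k) for k new points; prepending to a std::vector would also
// shift every point already collected.
std::deque<Point> joinWays(const std::vector<Way> &members) {
    std::deque<Point> line;
    for (const Way &w : members) {
        if (w.empty()) continue;
        if (line.empty())
            line.assign(w.begin(), w.end());
        else if (samePoint(line.back(), w.front()))
            line.insert(line.end(), w.begin() + 1, w.end());   // append
        else if (samePoint(line.front(), w.back()))
            line.insert(line.begin(), w.begin(), w.end() - 1); // prepend
        // (real code must also handle reversed ways and gaps)
    }
    return line;
}

// Verbose-mode idea: log any relation whose assembly exceeds a threshold.
void assembleTimed(std::uint64_t relId, const std::vector<Way> &members) {
    auto t0 = std::chrono::steady_clock::now();
    auto line = joinWays(members);
    auto ms = std::chrono::duration_cast<std::chrono::milliseconds>(
                  std::chrono::steady_clock::now() - t0).count();
    if (ms > 50)
        std::cerr << "Relation " << relId << " took " << ms << "ms ("
                  << line.size() << " points)\n";
}
```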

I think it's the same thing you see while reading the ways: there's not much RAM available any more after loading the ways. Loading the relations needs the ways, but they are being swapped in and out.

kleunen · Aug 24 '21 13:08

Trying with #292 applied, generating only -60° to 75°, and with all correct/make_valid calls commented out, it completes reading the .pbf overnight and begins generating tiles. Next step is to investigate whether this is because of reduced memory pressure or because of commenting out the calls.

systemed · Aug 25 '21 07:08
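For context, the correct/make_valid step being commented out looks roughly like this - a simplified Boost.Geometry sketch (tilemaker's actual make_valid pass is more involved than shown here):

```cpp
#include <boost/geometry.hpp>
#include <boost/geometry/geometries/point_xy.hpp>
#include <boost/geometry/geometries/polygon.hpp>
#include <boost/geometry/geometries/multi_polygon.hpp>
#include <boost/geometry/io/wkt/wkt.hpp>
#include <iostream>
#include <string>

namespace bg = boost::geometry;
using Point = bg::model::d2::point_xy<double>;
using MultiPolygon = bg::model::multi_polygon<bg::model::polygon<Point>>;

// correct() is cheap: it closes rings and fixes winding order.
// is_valid() (and any subsequent repair) is where the time goes on
// big, self-intersecting multipolygons.
void fixGeometry(MultiPolygon &mp) {
    bg::correct(mp);
    std::string reason;
    if (!bg::is_valid(mp, reason))
        std::cerr << "geometry still invalid: " << reason << "\n";
}

int main() {
    MultiPolygon mp;
    // A self-intersecting "bowtie" ring: correct() alone can't repair it.
    bg::read_wkt("MULTIPOLYGON(((0 0,2 2,2 0,0 2,0 0)))", mp);
    fixGeometry(mp);
}
```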

With #292, #294 and #296 applied (for -60° to 75°) then reading the planet completes overnight even with all correct/make_valid calls.

systemed · Aug 27 '21 09:08

At 144GB?

kleunen · Aug 27 '21 10:08

Yep; 8GB swap, --compact, and --store on SSD. Currently (slowly) generating the tiles.

systemed · Aug 27 '21 10:08

But how much difference does generating only -60° to +75° make? Are there really that many tiles in the polar regions at z14, due to the projection?

kleunen · Aug 27 '21 10:08

Yes - as per #293, cutting out the polar regions, where the projection gets really stretched, saves 13GB of RAM on the OutputObjectRefs. Effectively I think these three PRs give us enough headroom to complete reading in 144GB without swapping too much.

systemed · Aug 27 '21 10:08
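To make the projection point concrete, a quick back-of-the-envelope check of how many z14 tile rows the crop removes (the 13GB figure itself comes from #293):

```cpp
#include <cmath>
#include <cstdio>

// Web Mercator: latitude (degrees) -> tile row at zoom z.
int latToTileY(double latDeg, int z) {
    const double PI = std::acos(-1.0);
    double latRad = latDeg * PI / 180.0;
    double n = (1.0 - std::asinh(std::tan(latRad)) / PI) / 2.0;
    return static_cast<int>(n * (1 << z));
}

int main() {
    const int z = 14, totalRows = 1 << z;   // 16384 rows at z14
    int top = latToTileY(75.0, z);          // ≈ 2904
    int bottom = latToTileY(-60.0, z);      // ≈ 11625
    std::printf("rows kept: %d of %d (%.0f%%)\n",
                bottom - top + 1, totalRows,
                100.0 * (bottom - top + 1) / totalRows);  // ~53%
}
```

So nearly half the z14 rows lie in the stretched polar bands outside -60° to 75°, which squares with a large saving in per-tile bookkeeping even though those tiles hold little data.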

No, I was referring to how the map looks. Is it obvious that it's missing parts, or does it look fine to the user?

kleunen · Aug 27 '21 11:08

I don't know yet - it's still generating the tiles (it was only the .pbf reading that finished overnight). I'll let you know when it's done :)

systemed · Aug 27 '21 11:08

But does #292 really lower memory consumption? Since the ways are stored in the mmap file already, I don't see why it would reduce memory consumption. It's possible, because there is less fragmentation in the mmap file, but you would have to measure to know for sure.

I also noticed that some memory gets deallocated from the mmap file. That indicates some temporary is being allocated and deallocated, which causes fragmentation too. Possibly this can be optimized by using std::move to avoid temporaries inside the mmap file - there is no real need to deallocate anything stored there.

kleunen · Aug 27 '21 18:08
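A minimal sketch of the churn being described, using Boost.Interprocess (which backs --store); the file name, element type, and object names here are illustrative, not tilemaker's actual layout:

```cpp
#include <boost/interprocess/managed_mapped_file.hpp>
#include <boost/interprocess/containers/vector.hpp>
#include <boost/interprocess/allocators/allocator.hpp>
#include <utility>

namespace bi = boost::interprocess;
using Alloc = bi::allocator<int, bi::managed_mapped_file::segment_manager>;
using MmapVector = bi::vector<int, Alloc>;

int main() {
    bi::managed_mapped_file file(bi::open_or_create, "store.dat", 1 << 20);
    Alloc alloc(file.get_segment_manager());
    MmapVector *store = file.find_or_construct<MmapVector>("ways")(alloc);

    // This temporary uses the same mmap allocator, so its buffer is
    // allocated inside the mapped file.
    MmapVector tmp(alloc);
    tmp.push_back(42);

    // Copy-assigning would allocate a second buffer in the mmap and then
    // free the temporary's buffer on destruction - the allocate/deallocate
    // pair that fragments the file. Moving steals the buffer instead:
    *store = std::move(tmp);   // no fresh allocation, nothing freed later
}
```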

See #300 - it removes the temporary being allocated in the mmap.

kleunen · Aug 27 '21 18:08

I do need to do some measurements for #292 - different-sized .pbfs, both with and without --store. Without --store it should provide a useful improvement, but it's possible that on the smallest .pbfs we should disable it, or use a Set rather than a vector (the trade-off is sketched below). With --store - well, let's see!

systemed · Aug 28 '21 11:08
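For reference, the Set-vs-vector trade-off mentioned above, as a generic sketch (not tilemaker code): a std::set supports cheap incremental insert but pays tens of bytes of node overhead per entry, while a sorted std::vector stores only the 8-byte IDs but must be sorted and deduplicated before lookups work:

```cpp
#include <algorithm>
#include <cstdint>
#include <set>
#include <vector>

// Membership test on a sorted, deduplicated vector: 8 bytes per ID,
// O(log n) lookups via binary search.
bool contains(const std::vector<std::uint64_t> &sortedIds, std::uint64_t id) {
    return std::binary_search(sortedIds.begin(), sortedIds.end(), id);
}

int main() {
    std::vector<std::uint64_t> ids = {5, 1, 3, 3};
    std::sort(ids.begin(), ids.end());                        // one-off cost
    ids.erase(std::unique(ids.begin(), ids.end()), ids.end());

    std::set<std::uint64_t> asSet(ids.begin(), ids.end());    // per-node overhead,
                                                              // but insert-anytime
    return (contains(ids, 3) && asSet.count(3)) ? 0 : 1;
}
```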