Planet generation experiences

Open systemed opened this issue 3 years ago • 24 comments

I've successfully generated an .mbtiles from planet.osm.pbf. 🎉

Using current master plus #292. Planet renumbered first with osmium renumber. Command line:

tilemaker --input planet.osm.pbf --output planet.mbtiles --bbox -180,-60,180,75 \
    --compact --store /media/ssd/store

Execution time c. 37 hours (with the exception of deallocating, see below). This is a 16-threaded HP machine, 2x X5650, 144GB RAM.

Maximum memory usage c. 131GB, plus 267GB store. mbtiles filesize is 65GB.

What needs fixing?

  • [x] Deallocating is very slow, first in the .clear() method after reading the pbf, but more significantly at the end of the run (where it takes hours). The .mbtiles is complete and usable but tilemaker is still nominally running after its "Filled the tileset with good things" message.
  • [x] Ijsselmeer and the Great Lakes are not showing. [screenshot]
  • [ ] Roundabouts don't show at lower zoom levels, causing a gap in the road. I suspect that they are simplified to 2-point lines and ultimately disappear entirely. We probably need to ensure that closed linestrings are never simplified below 4 points. [screenshot]
  • [x] Boundary lines are often discontinuous. This is because we don't yet support type=boundary relations, so any way that is a member of a type=boundary relation but not otherwise tagged will get dropped. [screenshot]
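
A minimal sketch of the roundabout guard suggested in the list above, assuming plain structs rather than tilemaker's real geometry types. `Point`, `Linestring`, `isClosed` and `simplifyKeepingRings` are hypothetical names used only for illustration; the simplifier itself is passed in, so this shows only the "never below 4 points" check and not any particular simplification algorithm.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-ins for the real geometry types.
struct Point { double x, y; };
using Linestring = std::vector<Point>;

// A linestring counts as closed when it has at least 4 points
// (a triangle plus the repeated start point) and ends where it starts.
bool isClosed(const Linestring& ls) {
    return ls.size() >= 4 &&
           ls.front().x == ls.back().x &&
           ls.front().y == ls.back().y;
}

// Run any simplifier, but reject its result if a closed input
// would come back with fewer than 4 points.
Linestring simplifyKeepingRings(
        const Linestring& input,
        const std::function<Linestring(const Linestring&)>& simplify) {
    Linestring out = simplify(input);
    if (isClosed(input) && out.size() < 4) {
        return input;  // fall back to the unsimplified ring
    }
    return out;
}
```

Falling back to the unsimplified ring is the simplest choice for a sketch; retrying with a smaller tolerance would be another option and would keep tile sizes down.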

I suspect we might be able to get the running time down further too and optimise tile sizes, but the above issues are the most important.

systemed avatar Sep 13 '21 00:09 systemed

Yes, we had the issue with the IJsselmeer before. The default style only renders water from OSM at zoom level 8 and above, but large lakes should be visible at lower zoom levels too, some even down to zoom level 0.

kleunen avatar Sep 13 '21 06:09 kleunen

The cleanup at the end is not even needed. The data structures inside the mmap file get deallocated, and then, when all this is done, the complete file gets deleted anyway.

kleunen avatar Sep 13 '21 06:09 kleunen

There's something very wrong at the end - the "Filled the tileset with good things" line is literally the last line of tilemaker.cpp, but it's been hung for 12 hours now since that was output.

Ijsselmeer and the Great Lakes are missing entirely - not even showing at z8+. I'll see if I can work out what's going on there (need to check it wasn't introduced by #292).

systemed avatar Sep 13 '21 07:09 systemed

"Filled the tileset with good things" is the last line in tilemaker.cpp, but after this all the destructors are run. I also ran a planet conversion, and it did not hang or take a long time to clean up. I generated the planet by merging 4 extracts of the planet. So probably there is now a deadlock somewhere when cleaning up?

I do have IJsselmeer in my extract: [image]

kleunen avatar Sep 13 '21 07:09 kleunen

That's interesting. I don't get Ijsselmeer at all when generating from netherlands-latest.osm.pbf, either with #292 or current master.

systemed avatar Sep 13 '21 08:09 systemed

I am not sure if I now used my adapted style for this.

kleunen avatar Sep 13 '21 08:09 kleunen

I've gone through commit-by-commit, and the problem starts with #307 (a9a54f9). Previous versions of the code rendered Ijsselmeer fine. Not sure yet what it might be in #307 causing it.

Edit: within #307, fe5229e is fine so it's something after that. IJsselmeer is read correctly by Lua which writes it to the water layer.

systemed avatar Sep 13 '21 10:09 systemed

Maybe something with 'Use OsmID to lookup generated geometry'.

That would be the most likely candidate to cause this issue.

kleunen avatar Sep 13 '21 17:09 kleunen

I think I actually see the issue with this commit.

Try #317; I have not tested it yet.

kleunen avatar Sep 13 '21 17:09 kleunen

Changing that line to

osmID = (relationId & OSMID_MASK) | OSMID_RELATION;

does seem to work. We previously had --newWayId because way and relation IDs were sharing the same ID space, but if we're separating them bitwise anyway with OSMID_RELATION and OSMID_MASK, I don't think we need newWayId any more. Does that make sense or have I misunderstood?

systemed avatar Sep 13 '21 19:09 systemed

Yes, it does seem to be a collision between the OSM IDs. That is why I added the extra two bits and the | OSMID_RELATION: to make the OSM ID unique. So yes, that does make sense.

The & OSMID_MASK is just to make sure the id only takes 40 bits max.
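
To illustrate the bitwise separation discussed above, here is a small sketch. Only the names OSMID_MASK and OSMID_RELATION and the 40-bit width come from this thread; OSMID_TYPE_OFFSET, OSMID_WAY, the concrete tag values, and the helper functions `tagWay` and `tagRelation` are assumptions made for the example and may not match the actual source.

```cpp
#include <cstdint>

// Assumed layout: the low 40 bits hold the OSM ID, the bits above hold a type tag.
constexpr int OSMID_TYPE_OFFSET = 40;                                   // assumption
constexpr std::uint64_t OSMID_MASK = (1ULL << OSMID_TYPE_OFFSET) - 1;   // low 40 bits
constexpr std::uint64_t OSMID_WAY = 1ULL << OSMID_TYPE_OFFSET;          // assumed tag value
constexpr std::uint64_t OSMID_RELATION = 2ULL << OSMID_TYPE_OFFSET;     // assumed tag value

// Hypothetical helpers: clamp the ID to 40 bits, then stamp the type tag on top.
constexpr std::uint64_t tagWay(std::uint64_t wayId) {
    return (wayId & OSMID_MASK) | OSMID_WAY;
}
constexpr std::uint64_t tagRelation(std::uint64_t relationId) {
    return (relationId & OSMID_MASK) | OSMID_RELATION;
}

// A way and a relation with the same numeric ID no longer collide...
static_assert(tagWay(1234) != tagRelation(1234), "type tags keep IDs distinct");
// ...and the original ID is still recoverable from the low bits.
static_assert((tagRelation(1234) & OSMID_MASK) == 1234, "ID survives tagging");
```

With tags like these, a separate remapping of relation IDs into unused way IDs should not be needed, which is the point made earlier in the thread.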

kleunen avatar Sep 13 '21 20:09 kleunen

Ok - addressed in #318.

systemed avatar Sep 13 '21 20:09 systemed

I wanted to add my results regarding the boundary issue here learned from #322

Boundaries which are tagged as ways and not as relations are not being generated by tilemaker. For example these - https://www.openstreetmap.org/way/707454735 https://www.openstreetmap.org/way/136107215

Also, disputed borders are not being generated. Like this one - https://www.openstreetmap.org/way/133267807

kuwapa avatar Sep 22 '21 17:09 kuwapa

Yep. The issue is that tilemaker doesn't currently have any support for type=boundary relations, so anything that's tagged simply as a boundary relation (without the ways being tagged as boundaries) won't come through. I know how I'm going to fix it - the solution will enable other relation types like routes to be handled too - but haven't yet had the time to do so.

systemed avatar Sep 22 '21 17:09 systemed

Boundaries are addressed in #360.

systemed avatar Jan 05 '22 11:01 systemed

With the same setup as above (--compact, store on SSD, -60° to 75°), plus #386 with a renumbered locations-on-ways planet, the peak drops to 118GB RAM. That's a 10% saving from the previous 131GB even though we're reading more data (boundary relations), and brings it down to within the capacity of a 128GB machine. Store size is 274GB.

It's still processing - will update with timings later.

systemed avatar Feb 20 '22 13:02 systemed

Successfully completed for a 72GB mbtiles.

We have a performance regression in that it took 48hr09 (vs 37hr last time). The slowdown was very clearly in tile generation, and I think probably at higher zoom levels. Reviewing the commits since 13 Sep my best guess is that since #360, we're creating massive boundary (multi-)linestrings which have to be clipped for each output tile. In particular it might be https://github.com/systemed/tilemaker/blob/master/src/output_object.cpp#L112 where we are running geom::intersection over the whole MultiLinestring; we could filter individual segments much as we do for Linestring (above it in the code).
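
A rough sketch of the kind of pre-filter described above, under stated assumptions: it uses plain structs instead of the real geometry types, and it filters whole member linestrings by bounding box rather than individual segments. `Box`, `envelope`, `overlaps` and `candidatesForTile` are hypothetical names, and this is not necessarily how the eventual fix works. The idea is only that the expensive intersection would then run on the few members that can actually touch the tile.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-ins for the real geometry types.
struct Point { double x, y; };
struct Box { double minX, minY, maxX, maxY; };
using Linestring = std::vector<Point>;
using MultiLinestring = std::vector<Linestring>;

// Bounding box of a non-empty linestring.
Box envelope(const Linestring& ls) {
    Box b{ls[0].x, ls[0].y, ls[0].x, ls[0].y};
    for (const Point& p : ls) {
        b.minX = std::min(b.minX, p.x);
        b.minY = std::min(b.minY, p.y);
        b.maxX = std::max(b.maxX, p.x);
        b.maxY = std::max(b.maxY, p.y);
    }
    return b;
}

// True when the two boxes share at least one point.
bool overlaps(const Box& a, const Box& b) {
    return a.minX <= b.maxX && b.minX <= a.maxX &&
           a.minY <= b.maxY && b.minY <= a.maxY;
}

// Keep only the member linestrings whose envelope overlaps the tile box;
// only these would need to be passed on to the real clipping step.
MultiLinestring candidatesForTile(const MultiLinestring& mls, const Box& tile) {
    MultiLinestring out;
    for (const Linestring& ls : mls) {
        if (!ls.empty() && overlaps(envelope(ls), tile)) {
            out.push_back(ls);
        }
    }
    return out;
}
```

A bounding-box test can keep a member that does not really cross the tile, so the result is a candidate set and the exact intersection is still required afterwards.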

systemed avatar Feb 22 '22 11:02 systemed

#387 should hopefully fix the regression - I'll try another planet run in a few days.

systemed avatar Feb 24 '22 13:02 systemed

I recently converted a renumbered planet.osm.pbf using the latest master branch and the stock OSM config and process files (it took about 26 hours on a 32-core system). Large lakes (e.g. the Great Lakes) still don't show up until zoom level 6.

While reading the ocean shape file (obtained at https://osmdata.openstreetmap.de/download/water-polygons-split-4326.zip), there are several invalid shape entities reported as having invalid self-intersections, and others that have too few points. Is there a better/newer ocean shapefile set?

There seem to be "holes" in various places; these disappear at different zoom levels. The attached pictures are from the raw vector data using tileserver-gl around Sri Lanka. I would presume these occur because of the ocean shapefile errors mentioned above?

[images: SriLanka, SriLanka2]

KevenRing avatar Mar 09 '22 16:03 KevenRing

> Large lakes (eg Great Lakes) still don't show up until zoom level 6.

In the resources/process-openmaptiles.lua file there is a function, somewhere around line 617, which is called when processing bodies of water. This function filters water out of the output tiles based on the size of the body of water. You'll find that even the biggest bodies of water are filtered out below zoom level 6. You can change that zoom level, for example to 3, which will render large lakes from zoom level 3 upwards.

> Is there a better/newer ocean shapefile set?

The shapefile is updated every day based on OSM data, so you can probably fix the polygons in OSM and your changes will be reflected in the dataset the next day.

They also mention the following on the osmdata.openstreetmap.de website: The coastline in OpenStreetMap is often broken. The update process will try to repair it, but this does not always work. If the OSM data can't be repaired automatically, the data here will not be updated.

timsluis avatar Mar 23 '22 10:03 timsluis