
render_list does not save any tiles if killed due to running out of memory

Open SystemParadox opened this issue 8 months ago • 7 comments

Attempting to pre-render tiles for the UK using https://github.com/alx77/render_list_geo.pl:

./render_list_geo.pl -n 4 -x -9.5 -X 2.72 -y 49.39 -Y 61.26 -z 8 -Z 16

Output:

Rendering started at: Mon Mar 24 14:14:43 UTC 2025

render_list -a -z 8 -Z 8 -x 121 -X 135 -y 72 -Y 87 -n 4
Rendering client
Starting 4 rendering threads
Rendering all tiles from zoom 8 to zoom 8
Rendering all tiles for zoom 8 from (121, 72) to (135, 87)
Waiting for rendering threads to finish

*****************************************************
*****************************************************
Total for all tiles rendered
Meta tiles rendered: Rendered 0 tiles in 0.00 seconds (0.00 tiles/s)
Total tiles rendered: Rendered 0 tiles in 0.00 seconds (0.00 tiles/s)
Total tiles handled: Rendered 4 tiles in 0.00 seconds (1545.60 tiles/s)
*****************************************************
*****************************************************
*****************************************************

Zoom factor: 8 finished at
Mon Mar 24 14:14:43 UTC 2025
render_list -a -z 9 -Z 9 -x 242 -X 263 -y 145 -Y 175 -n 4
Rendering client
Starting 4 rendering threads
Rendering all tiles from zoom 9 to zoom 9
Rendering all tiles for zoom 9 from (242, 145) to (263, 175)
Waiting for rendering threads to finish

*****************************************************
*****************************************************
Total for all tiles rendered
Meta tiles rendered: Rendered 0 tiles in 0.00 seconds (0.00 tiles/s)
Total tiles rendered: Rendered 0 tiles in 0.00 seconds (0.00 tiles/s)
Total tiles handled: Rendered 12 tiles in 0.00 seconds (4876.07 tiles/s)
*****************************************************
*****************************************************
*****************************************************

Zoom factor: 9 finished at
Mon Mar 24 14:14:43 UTC 2025
render_list -a -z 10 -Z 10 -x 484 -X 519 -y 290 -Y 351 -n 4
Rendering client
Starting 4 rendering threads
Rendering all tiles from zoom 10 to zoom 10
Rendering all tiles for zoom 10 from (484, 290) to (519, 351)
Waiting for rendering threads to finish
connection to renderd lost
sleeping for 30 seconds
connection to renderd lost
sleeping for 30 seconds
connection to renderd lost
sleeping for 30 seconds
connection to renderd lost
sleeping for 30 seconds
Zoom factor: 10 finished at
Mon Mar 24 14:16:56 UTC 2025
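
For reference, the x/y ranges in the log above are consistent with the standard slippy-map tile formula applied to the bounding box passed to render_list_geo.pl. The Python below is a minimal sketch of that calculation, not the script's actual code: the function names are hypothetical, and rounding the upper bounds up to the end of an 8-tile metatile is an assumption inferred from the logged values (for example -X 135 at zoom 8), not confirmed from the script's source.

```python
import math

METATILE = 8  # assumed metatile width in tiles (inferred from the log, not confirmed)


def lonlat_to_tile(lon, lat, zoom):
    """Standard slippy-map conversion from WGS84 lon/lat to tile x/y."""
    n = 2 ** zoom
    x = int(math.floor((lon + 180.0) / 360.0 * n))
    lat_rad = math.radians(lat)
    y = int(math.floor((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n))
    return x, y


def round_up_to_metatile(v):
    """Last tile index of the metatile that contains tile index v."""
    return (v // METATILE + 1) * METATILE - 1


def tile_range(min_lon, max_lon, min_lat, max_lat, zoom):
    """Return (x_min, x_max, y_min, y_max); tile y grows southwards."""
    x_min, y_max = lonlat_to_tile(min_lon, min_lat, zoom)
    x_max, y_min = lonlat_to_tile(max_lon, max_lat, zoom)
    return x_min, round_up_to_metatile(x_max), y_min, round_up_to_metatile(y_max)


if __name__ == "__main__":
    # Bounding box from the render_list_geo.pl call in this issue.
    for z in (8, 9, 10):
        print(z, tile_range(-9.5, 2.72, 49.39, 61.26, z))
```

Under these assumptions the ranges double in each direction with every zoom level, so the number of tiles requested grows roughly fourfold per zoom between z8 and z16.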

I'm using https://github.com/Overv/openstreetmap-tile-server, and the renderd process is being killed due to running out of memory.

I have two issues with this:

  1. It doesn't seem to have saved anything? Why isn't it saving the tiles it managed to render before it gets killed?
  2. Running this for a smaller area works within the memory constraints, so surely it could be a bit smarter about this if there isn't enough memory available to do the whole task in one go?

SystemParadox, Mar 24 '25 16:03

First things first, what does "render_list -V" say for the version number?

It doesn't seem to have saved anything

That seems odd. All "render_list" does is tell "renderd" to render tiles. In the example above I'd expect 12 z8 tiles would have been rendered. If you haven't got enough memory to render 4 z9 tiles concurrently I'd expect lots of other things to fail too. If it helps, I run "render_list_geo.pl/render_list_geo.pl -n 1 -z 3 -Z 12 -x -9.5 -X 2.72 -y 49.39 -Y 61.26 -m ajt" nightly without issues on a UK+Ireland database (one CPU not 4, and zooms up to 12 only).

SomeoneElseOSM, Mar 24 '25 17:03

render_list -V (or -v or --version) all say:

render_list: invalid option -- 'V'
unhandled char '?'

However, dpkg says that libapache-mod-tile and renderd are both 0.6.1.

On our existing server we've been doing this for a long time using just the single render_list_geo.pl call (for everything from z8-z16), but it's possible it was initially rendered without any memory limit (could have used up to 32GB) and obviously it's just using the cached tiles now so it doesn't need much memory. We're trying to set this up on a new server - partly because the tile server container is so out of date that automatic updates are no longer working.

SystemParadox, Mar 24 '25 17:03

That seems odd. All "render_list" does is tell "renderd" to render tiles.

If this is true then I'm concerned that this means renderd risks being OOM killed during normal operation? What is the proper way to limit renderd memory usage?

SystemParadox, Mar 24 '25 18:03

What is the proper way to limit renderd memory usage

I'd suggest:

  • monitoring memory usage using whatever tools are available
  • restricting the number of threads that rendering uses (the switch2osm guide suggests 2 instead of the default of 4)
  • checking that things like "dirty tile processing" are set up sensibly, if you're doing minutely updates.
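
As one possible way to do the monitoring mentioned in the first bullet, the resident memory of a process can be sampled from /proc on Linux. This is only a sketch under that assumption: the function names are hypothetical, the PID of renderd has to be looked up separately (for example with pidof renderd), and inside a container the relevant limit may be the cgroup's rather than the host's.

```python
import re
import time
from pathlib import Path


def parse_vm_rss_kb(status_text):
    """Return the VmRSS value in kB from /proc/<pid>/status text, or None if absent."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def rss_kb(pid):
    """Read the current resident set size of a process (Linux only)."""
    return parse_vm_rss_kb(Path(f"/proc/{pid}/status").read_text())


def watch(pid, interval=5.0, samples=12):
    """Print the resident set size every `interval` seconds and return the peak in kB."""
    peak = 0
    for _ in range(samples):
        current = rss_kb(pid) or 0
        peak = max(peak, current)
        print(f"pid {pid}: {current / 1024:.1f} MiB (peak {peak / 1024:.1f} MiB)")
        time.sleep(interval)
    return peak
```

Running watch() against renderd while a render_list job is in progress, once with -n 4 and once with -n 1, would show whether peak memory scales with the thread count.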

How much memory is on the server (available to the container) and how big's the database?

SomeoneElseOSM, Mar 24 '25 18:03

Limiting the number of threads seems to help a lot. Does this mean there is no shared memory between threads so if two threads try to render tiles for a similar area they'll both load everything into memory?

I still don't understand why this works OK with fewer threads or when splitting it up into smaller chunks, but doesn't work if you call render_list with a large area and many threads. Some more detail about how memory is allocated and freed would be helpful.

At the moment it feels a lot like render_list forces it to do the specified area in one massive request that has to be loaded completely into memory at once. I would expect it to just add all the relevant tiles to the render queue and render them in the same way as when someone loads them from the map?

SystemParadox, Apr 07 '25 14:04

At the moment it feels a lot like render_list forces it to do the specified area in one massive request

render_list does indeed say "I would like all of these tiles to be rerendered" but it is your choice which tiles you ask for and (with the "-n" parameter) how many threads you want running in parallel doing that.
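
To illustrate choosing which tiles to ask for, a large range can be split into smaller render_list calls that run one after another. The sketch below is hypothetical (the function name and the 16-tile chunk size are assumptions); it uses only the render_list flags that appear in the log earlier in this issue, and it aligns chunk edges to multiples of the chunk size so that, assuming 8-tile metatiles, no metatile straddles two chunks. Whether this keeps renderd within a given memory limit is not established here; the thread only reports that smaller areas and fewer threads helped.

```python
def chunked_commands(zoom, x_min, x_max, y_min, y_max, chunk=16, threads=1):
    """Split a tile range into aligned chunks and return one render_list argv per chunk."""
    commands = []
    x_start = (x_min // chunk) * chunk  # align chunk edges to multiples of `chunk`
    y_start = (y_min // chunk) * chunk
    for x0 in range(x_start, x_max + 1, chunk):
        x_lo, x_hi = max(x0, x_min), min(x0 + chunk - 1, x_max)
        for y0 in range(y_start, y_max + 1, chunk):
            y_lo, y_hi = max(y0, y_min), min(y0 + chunk - 1, y_max)
            commands.append([
                "render_list", "-a",
                "-z", str(zoom), "-Z", str(zoom),
                "-x", str(x_lo), "-X", str(x_hi),
                "-y", str(y_lo), "-Y", str(y_hi),
                "-n", str(threads),
            ])
    return commands


if __name__ == "__main__":
    # Zoom 10 range taken from the log in this issue.
    for argv in chunked_commands(10, 484, 519, 290, 351):
        print(" ".join(argv))
```

Each printed line could be run in sequence (for example via subprocess.run), so that a failure part-way through affects only the chunk that was in flight.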

Does this mean there is no shared memory between threads so if two threads try to render tiles for a similar area they'll both load everything into memory?

There's likely some optimisation at the database side of things but in terms of the running threads, I doubt it.

With regard to "render_list -V"

render_list: invalid option -- 'V'
unhandled char '?'

Slightly bizarrely, I can reproduce that (on each of Ubuntu 22.04, Debian 12 and Ubuntu 24.04) with the shipped-with-the-OS versions, but not with versions I've built locally.

Did you get anywhere with monitoring memory use while rendering of tiles (either in response to browsing, or in response to render_list) was happening?

SomeoneElseOSM, Apr 07 '25 15:04

Did you get anywhere with monitoring memory use while rendering of tiles (either in response to browsing, or in response to render_list) was happening?

Well yes I can limit the threads and the area to help keep it within the memory limits. But I still have two issues:

  1. render_list seems to force it to load everything into memory in an abnormal way and cause excessive memory usage unnecessarily.
  2. I'm assuming this is abnormal but I'm not entirely confident that it's not going to suddenly fall over in production if it gets a particularly expensive set of requests at once.

Again, it would be really helpful to have some more information about how memory is allocated and freed, so we can better understand what's going on and have more confidence that it won't run out of memory.

SystemParadox, Apr 08 '25 10:04