
Load test tile servers

Open pnorman opened this issue 2 years ago • 1 comments

We haven't hit the limits of nidhogg and culebre, the AMD-based tile servers purchased last year, and we don't know what those limits are, which makes capacity planning difficult. I would also like more solid numbers on odin and ysera, the older tile servers.

We could load test by either increasing the load to one server at the CDN until it starts building a rendering queue, or pulling a server out and giving it a synthetic load. I'm inclined towards the former since synthetic tile loads are difficult.

Current peak backend traffic is 10k TPS across all servers, with 2k TPS on each European server. The tile servers are in pairs, with odin and nidhogg being one pair and culebre and ysera the other, and traffic is directed to one pair or the other depending on tile coordinate.
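As a rough illustration of what routing by tile coordinate could look like, here is a minimal sketch. It assumes the split is made on the parity of the metatile coordinates and that metatiles are 8x8 tiles; neither the exact rule nor which pair receives which parity is stated above, so `pair_for_tile`, `METATILE` and the `PAIRS` mapping are hypothetical.

```python
# Hypothetical sketch: the real CDN rule is not described in this issue.
METATILE = 8  # assumed metatile size in tiles (8x8)

# Assumed mapping of parity to server pair; the actual assignment may differ.
PAIRS = {
    0: ("odin", "nidhogg"),
    1: ("culebre", "ysera"),
}


def pair_for_tile(z: int, x: int, y: int) -> tuple[str, str]:
    """Return the server pair for tile z/x/y under the assumed parity rule.

    The zoom level is accepted for completeness but not used by this rule.
    """
    parity = ((x // METATILE) + (y // METATILE)) % 2
    return PAIRS[parity]


if __name__ == "__main__":
    for tile in [(10, 0, 0), (10, 7, 7), (10, 8, 0), (10, 8, 8)]:
        print(tile, "->", pair_for_tile(*tile))
```

Under a rule like this, every tile in a metatile lands on the same pair, so each pair only needs to render and store its own share of the metatiles.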

My inclination is to do an initial test by re-weighting the culebre and ysera pair to direct an increasing amount of traffic at one of the servers. This would allow testing up to 4k TPS. If that's insufficient, I could temporarily direct all European traffic to one server, slowly ramping it up because right now it only has half the tiles.
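The arithmetic behind the 4k TPS ceiling can be sketched as follows, assuming the pair's combined peak is 2k + 2k = 4k TPS and that re-weighting simply changes the share of that total sent to the server under test. The function names and the six-step ramp are illustrative assumptions, not part of any existing tooling.

```python
def split_pair_traffic(pair_tps: float, target_share: float) -> tuple[float, float]:
    """Split a pair's total TPS between the server under test and its partner."""
    if not 0.0 <= target_share <= 1.0:
        raise ValueError("target_share must be between 0 and 1")
    target = pair_tps * target_share
    return target, pair_tps - target


def ramp_schedule(pair_tps: float, start_share: float = 0.5,
                  end_share: float = 1.0, steps: int = 5):
    """Return (share, target TPS, partner TPS) rows for a linear ramp."""
    rows = []
    for i in range(steps + 1):
        share = start_share + (end_share - start_share) * i / steps
        target, partner = split_pair_traffic(pair_tps, share)
        rows.append((round(share, 3), round(target), round(partner)))
    return rows


if __name__ == "__main__":
    # Assumed pair peak: two European servers at about 2k TPS each.
    for share, target, partner in ramp_schedule(4000):
        print(f"share={share:.1f} target={target} TPS partner={partner} TPS")
```

Starting from an even split, the server under test sees 2k TPS, and at a share of 1.0 it sees the pair's full 4k TPS, which is the most this approach can deliver without pulling in traffic from the other pair.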

pnorman, Jun 19 '22 03:06

I did a partial test on odin this evening, European time, boosting its traffic from 896 TPS to 1758 TPS. That wasn't enough to cause a measurable change in CPU idle or CPU/IO pressure, or to cause any missed tiles or rendering queues.

This shows the importance of #527: adding 1% of the "odd" metatile traffic to odin, which was for tiles it didn't have freshly rendered, brought idle CPU from about 65% to about 40%. It really matters which tiles are requested.

pnorman, Jun 22 '22 20:06