API Load Testing for Ordinals API
Our goal here is to determine, for our infrastructure:
What is our requests per minute (RPM) limit for the Ordinals API?
When we're ready to load test against staging, ping DevOps to complete this ticket: https://github.com/hirosystems/devops/issues/1241
We are waiting on the environment for this. We expect to start this effort by Monday at the latest.
Ordinals API load test, first run: Aug 22nd, after the environment was upgraded.
While running a test, I noticed a large number of 524 errors under low load: only 10 virtual users (VUsers) were used. After 20 minutes, CPU utilization spiked and response times became extremely slow. I noticed that only 1 cluster was getting all the traffic, so I pulled Charlie in, and he found the following:
I see the database CPU is maxing out. It’s using 3X more than it’s configured for. This is normally fine for short spikes, but not sustained usage. However, the ord-api deployment has very low resource usage, and isn’t even scaling out. These signs are pointing to the database being the bottleneck, most likely the queries.
Is the statement_timeout parameter being set in the Ordinals-API session with the pg db? If not, I can create a ticket for that. The db in staging is being bogged down by queries taking a very long time (~1 hour).
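For reference on the statement_timeout question above: PostgreSQL lets a client set it per session at connect time through the libpq `options` connection parameter. The following is a minimal sketch only. The host, database, and user names and the 30-second value are hypothetical placeholders, and whether the Ordinals API's Postgres client accepts a libpq-style connection string has not been confirmed here.

```python
def build_pg_dsn(host: str, dbname: str, user: str,
                 statement_timeout_ms: int = 30_000, port: int = 5432) -> str:
    """Build a libpq keyword/value connection string that sets
    statement_timeout for the session.

    A bare integer for statement_timeout is interpreted by PostgreSQL as
    milliseconds; 0 disables the timeout. The default of 30 seconds is an
    illustrative assumption, not an agreed value.
    """
    if statement_timeout_ms < 0:
        raise ValueError("statement_timeout_ms must be >= 0")
    # The value contains a space, so it must be single-quoted in the string.
    options = f"-c statement_timeout={statement_timeout_ms}"
    return (
        f"host={host} port={port} dbname={dbname} user={user} "
        f"options='{options}'"
    )


if __name__ == "__main__":
    # Hypothetical staging values, for illustration only.
    print(build_pg_dsn("db.staging.example.internal", "ordinals", "ord_api"))
```

With a timeout in place, a query that runs past the limit is cancelled by the server instead of holding database CPU for an hour. The same effect is available inside an open session with `SET statement_timeout = '30s'`.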
You can follow along in the #team-ordinals channel for more information. We concluded that some optimizations and configuration changes were needed, and Rafael will push a fix to resolve the problem. Load testing will resume once the fix is pushed.
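When testing resumes, the RPM figure and the error breakdown can be derived from per-request samples. The sketch below is an assumption-laden illustration, not the tool used for the first run: it drives a fixed number of virtual users with threads and summarizes status codes, successful RPM, and p95 latency. The function names and defaults (10 users, 20 requests each) are hypothetical; the target URL is left to the caller.

```python
import time
import urllib.error
import urllib.request
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple

Sample = Tuple[int, float]  # (HTTP status code, latency in seconds)


def http_get_status(url: str, timeout: float = 30.0) -> int:
    """Return the HTTP status for a GET, or 0 on a connection-level failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # non-2xx responses such as 524 land here
    except OSError:
        return 0  # DNS failure, refused connection, socket timeout, etc.


def virtual_user(request_fn: Callable[[], int], iterations: int) -> List[Sample]:
    """Issue `iterations` sequential requests, recording status and latency."""
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        status = request_fn()
        samples.append((status, time.perf_counter() - start))
    return samples


def summarize(samples: List[Sample], elapsed_s: float) -> Dict[str, object]:
    """Aggregate samples into status counts, successful RPM, and p95 latency."""
    statuses = Counter(status for status, _ in samples)
    ok = sum(n for code, n in statuses.items() if 200 <= code < 300)
    latencies = sorted(lat for _, lat in samples)
    p95 = latencies[int(0.95 * (len(latencies) - 1))] if latencies else 0.0
    return {
        "total": len(samples),
        "status_counts": dict(statuses),
        "successful_rpm": ok / elapsed_s * 60 if elapsed_s > 0 else 0.0,
        "p95_latency_s": p95,
    }


def run_load_test(request_fn: Callable[[], int], vusers: int = 10,
                  iterations: int = 20) -> Dict[str, object]:
    """Run `vusers` concurrent virtual users and summarize the results."""
    started = time.perf_counter()
    with ThreadPoolExecutor(max_workers=vusers) as pool:
        futures = [pool.submit(virtual_user, request_fn, iterations)
                   for _ in range(vusers)]
        samples = [s for f in futures for s in f.result()]
    return summarize(samples, time.perf_counter() - started)
```

A run against a real endpoint would pass something like `lambda: http_get_status(url)` as `request_fn`. Raising `vusers` step by step until the error rate or p95 latency degrades gives an estimate of the sustainable RPM.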