Horizon Prerequisites Benchmarking
What problem does your feature solve?
We want to update the Horizon prerequisites documentation with the minimum hardware requirements for running Horizon. To do this, we need to perform benchmarking and testing similar to what we've conducted previously.
What would you like to see?
Determine the minimum specifications required for the Horizon compute instance and the Postgres database instance, with a focus on measuring the memory, CPU, disk space, and IOPS requirements for both components.
What alternatives are there?
There are some unknowns to investigate here, notably how much API load we assume users will have. We could potentially use the existing goreplay setup, but filtered or reduced depending on what we decide.
@urvisavla, during verification of compute resources, I wanted to mention that the setup should include ENABLE_CAPTIVE_CORE=true and CAPTIVE_CORE_USE_DB=true; I think those are the defaults at this point. Captive core with on-disk DB usage will dramatically lower the amount of RAM used by captive core. The current prerequisites in the docs mention 32GB of RAM required, but with on-disk usage that should be well under 8GB in almost all cases, if not lower: https://github.com/stellar/go/pull/4092#issuecomment-1029355374
@sreuland We observed RAM usage on the ingestion instance (dev cluster) to remain below 8GB, usually hovering around 6GB. However, during state verification, the RAM usage spikes to 11GB and remains there for the entire duration of state verification. Meanwhile, the memory usage on the API instance (prod cluster) remains consistently below 3GB. I believe that the main contributor to memory usage is the in-memory graph for path payments.
Considering these observations and given that our recommendations are for an instance serving all functions (API + ingestion), 16GB RAM should be adequate. I will update our documentation to reflect this recommendation.
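One way to read the sizing arithmetic above, as a rough sketch: add the observed ingestion peak (11GB during state verification) to the observed API upper bound (3GB), then round up to the next common instance memory size. The size ladder and the `nextInstanceSizeGB` helper are assumptions made for illustration, not something taken from the benchmarking doc.

```go
package main

import "fmt"

// nextInstanceSizeGB returns the smallest size in ladder (ascending, in GB)
// that is at least needGB, or 0 if no size in the ladder is large enough.
// Hypothetical helper, used only to illustrate the rounding step.
func nextInstanceSizeGB(needGB int, ladder []int) int {
	for _, size := range ladder {
		if size >= needGB {
			return size
		}
	}
	return 0
}

func main() {
	const ingestionPeakGB = 11 // observed during state verification (dev cluster)
	const apiPeakGB = 3        // upper bound observed on the API instance (prod cluster)

	// Assumed ladder of instance memory sizes; adjust to the provider in use.
	ladder := []int{4, 8, 16, 32}

	combined := ingestionPeakGB + apiPeakGB
	fmt.Printf("combined peak: %d GB, suggested: %d GB\n",
		combined, nextInstanceSizeGB(combined, ladder))
	// prints "combined peak: 14 GB, suggested: 16 GB"
}
```

Summing the two peaks is a conservative estimate for a single instance serving both roles, since both figures include shared process overhead.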
Update:
- Shared a document with the team detailing observations from EC2 and RDS instances from the dev and prod clusters
- Updated the hardware specifications, including CPU, memory, and disk, in our public docs within the partner-experience branch (to be merged to the main branch soon).
- Unfortunately, couldn't obtain hardware benchmarks for running an API instance due to the absence of API traffic in both staging and dev clusters.
- Explored options like using the 'go-replay' tool for mirroring traffic, but it proved to be unfeasible #2461.
- Next steps: Explore developing a custom tool to simulate requests from prod (using logs from AWS) and replay them on dev cluster. And for that we'd want to use instances with specifications similar to what we plan to recommend in our public docs. Created ops request for provisioning new instances.
Created https://github.com/stellar/ops/issues/2536 request for provisioning new instances.
Hello @urvisavla, I left a comment suggesting we consider using k8s rather than EC2 for provisioning the new instances: https://github.com/stellar/ops/issues/2536#issuecomment-1728517511
@urvisavla, you mentioned a performance benchmarks doc was shared; can it be linked or summarized here as well? Thanks!
Sorry, I missed this earlier. Here is the doc.