stacks-blockchain-api icon indicating copy to clipboard operation
stacks-blockchain-api copied to clipboard

NodeJS cluster workers (multiple cpu core support)

Open zone117x opened this issue 4 years ago • 5 comments

While working on CPU profiling support, it became apparent that various approaches have an impact on how future nodejs cluster support would work. I took a stab at implementing support and it turned out to be pretty straightforward (mostly due to Rafael's readonly mode work).

Clustering is disabled by default, and can be enabled with an env var STACKS_ENABLE_CLUSTER=true.

For context, see nodejs docs: https://nodejs.org/api/cluster.html#cluster_cluster

A single instance of Node.js runs in a single thread. To take advantage of multi-core systems, the user will sometimes want to launch a cluster of Node.js processes to handle the load. The cluster module allows easy creation of child processes that all share server ports.

zone117x avatar Oct 13 '21 18:10 zone117x

This pull request is being automatically deployed with Vercel (learn more).
To see the status of your deployment, click below or on the icon next to each commit.

🔍 Inspect: https://vercel.com/blockstack/stacks-blockchain-api/34PXhm6DACGkcAbNnPuquuBCzATR
✅ Preview: https://stacks-blockchain-api-git-feat-cluster-blockstack.vercel.app

vercel[bot] avatar Oct 13 '21 18:10 vercel[bot]

Codecov Report

Merging #801 (6919032) into develop (9f721c1) will increase coverage by 1.49%. The diff coverage is 21.56%.

:exclamation: Current head 6919032 differs from pull request most recent head e505e44. Consider uploading reports for the commit e505e44 to get more accurate results Impacted file tree graph

@@             Coverage Diff             @@
##           develop     #801      +/-   ##
===========================================
+ Coverage    65.30%   66.79%   +1.49%     
===========================================
  Files           90       90              
  Lines         9101     9947     +846     
  Branches      1456     1688     +232     
===========================================
+ Hits          5943     6644     +701     
- Misses        3153     3298     +145     
  Partials         5        5              
Impacted Files Coverage Δ
src/index.ts 0.00% <0.00%> (ø)
src/shutdown-handler.ts 0.00% <0.00%> (ø)
src/api/init.ts 74.48% <32.35%> (-7.93%) :arrow_down:
src/api/routes/ws-rpc.ts 79.73% <100.00%> (ø)
src/helpers.ts 63.75% <100.00%> (+1.31%) :arrow_up:
src/api/routes/socket-io.ts 68.26% <0.00%> (-7.81%) :arrow_down:
src/datastore/postgres-store.ts 85.81% <0.00%> (+2.46%) :arrow_up:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 9f721c1...e505e44. Read the comment docs.

codecov[bot] avatar Oct 13 '21 19:10 codecov[bot]

Interesting addition! Would you expect this feature to be enabled in scenarios where only 1 or small number of APIs are serving traffic, or also in scenarios where many APIs are serving traffic? If serving traffic from many APIs, we don't typically give an API more than a single CPU. They will often be given a fraction of a single CPU.

Knowing this, do you think we would still benefit from this software clustering ability even when CPU resources are limited?

CharlieC3 avatar Oct 13 '21 19:10 CharlieC3

@CharlieC3

Interesting addition! Would you expect this feature to be enabled in scenarios where only 1 or small number of APIs are serving traffic, or also in scenarios where many APIs are serving traffic? If serving traffic from many APIs, we don't typically give an API more than a single CPU. They will often be given a fraction of a single CPU.

Knowing this, do you think we would still benefit from this software clustering ability even when CPU resources are limited?

I don't think there would be anything to gain by enabling this feature when only 1 (logical) cpu core is available. I do suspect that allocating >1 cores to deployments would end up with higher traffic capacity for the same cost, but I'm not familiar with the costs of different server resources. Seems worth testing out?

zone117x avatar Oct 14 '21 12:10 zone117x

Note: this feature needs testing to ensure the prom metrics are merged and reported correctly between the cluster processes -- right now it uses the default behavior from the prom lib, which from their docs imply a "best effort" implementation.

This feature isn't a high priority right now and not yet worth the testing, so going to icebox the PR.

zone117x avatar Nov 02 '21 11:11 zone117x

Closing this PR. If we want this feature in the future, we'd likely start over rather than trying to merge latest into this branch.

zone117x avatar Apr 20 '23 12:04 zone117x