grpc-node
Critical performance bottleneck with high-throughput GRPC stream handling
Problem description
Our application uses this module in a Node.js backend to process events delivered over a gRPC stream. The volume of these events is substantial: frequently several thousand per second, potentially escalating to 10-15k events per second during peak periods.
Each event carries a timestamp indicating when it was produced by the gRPC server. During our evaluations, we've identified a significant performance limitation with this module: it struggles to process more than approximately 250 events per second. Moreover, a noticeable delay builds up quickly. Logs comparing the current time against the event timestamps show events being processed when they are already significantly outdated, sometimes by several minutes.
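For reference, the lag check reads roughly like this. Note that `ts` is a hypothetical field name and a plain `EventEmitter` stands in for the real stream, since the actual protocol definitions are private:

```typescript
import { EventEmitter } from "events";

// Synthetic stand-in for the real gRPC stream; `ts` is a hypothetical
// millisecond-timestamp field (the real field name is in the private protos).
const stream = new EventEmitter();

let stale = 0;
stream.on("data", (event: { ts: number }) => {
  const lagMs = Date.now() - event.ts;
  if (lagMs > 60_000) stale++; // flag events more than a minute behind
});

stream.emit("data", { ts: Date.now() });              // fresh event
stream.emit("data", { ts: Date.now() - 5 * 60_000 }); // five minutes old
console.log("stale events:", stale); // → stale events: 1
```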
This performance shortfall renders the task of managing such a high-volume stream through a single Node.js process impractical. Fortunately, our infrastructure includes powerful machines equipped with over 150 vcores and substantial RAM, enabling us to consider distributing the workload across multiple "consumer" sub-processes (be it through `child_process`, `worker_threads`, `cluster.fork()`, etc.) in a round-robin configuration.
Node.js's introduction of worker threads and the `cluster` module was a strategic enhancement to address such challenges, facilitating parallel request handling and optimizing multi-core processing capabilities. Given Node.js's proven capability to handle upwards of 20k transactions per second in benchmarks with frameworks like Express and Koa, it stands to reason that this scenario should be well within Node's operational domain.
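As a rough sketch of the fan-out idea we have in mind (not a confirmed working solution for this module), a pool of `worker_threads` could receive events round-robin from the single thread that owns the stream. A synthetic emitter stands in for the real gRPC stream here:

```typescript
import { Worker } from "worker_threads";
import { EventEmitter } from "events";

// Worker body (plain JS so it can be loaded via `eval: true`); it counts
// events and reports its count back when asked.
const WORKER_SOURCE = `
const { parentPort } = require("worker_threads");
let processed = 0;
parentPort.on("message", (msg) => {
  if (msg === "report") parentPort.postMessage(processed);
  else processed++; // a real handler would decode and process the event here
});
`;

const POOL_SIZE = 4;
const workers = Array.from(
  { length: POOL_SIZE },
  () => new Worker(WORKER_SOURCE, { eval: true })
);

let next = 0;
function dispatch(event: unknown): void {
  workers[next].postMessage(event); // structured-clone copy to one worker only
  next = (next + 1) % POOL_SIZE;    // round-robin: no event is duplicated
}

// Synthetic stand-in for the real gRPC stream:
const stream = new EventEmitter();
stream.on("data", dispatch);
for (let i = 0; i < 1000; i++) stream.emit("data", { seq: i });

// Ask each worker how many events it saw, then shut the pool down.
Promise.all(
  workers.map(
    (w) =>
      new Promise<number>((resolve) => {
        w.once("message", resolve);
        w.postMessage("report");
      })
  )
).then((counts) => {
  const total = counts.reduce((a, b) => a + b, 0);
  console.log("per-worker counts:", counts.join(","), "total:", total);
  workers.forEach((w) => w.terminate());
});
```

Note the stream itself is still consumed by one thread; only the per-event processing is spread across the pool.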
However, it appears this module lacks support for such a distributed processing approach.
Inquiry
What is the optimal strategy for leveraging this module to handle thousands of events per second efficiently? Is there a method to employ Node.js's native cluster module to distribute the processing of these event transactions across multiple clustered instances in a round-robin manner, without duplicating events between processes?
Reproduction steps
- Access a high-throughput GRPC stream.
- Attempt to process this high-volume stream.
- Observe significant delays in event processing; events are not processed in a timely manner.
Code used to test throughput:
```typescript
const client = new XClient("endpoint", ChannelCredentials.createInsecure())
const stream = client.SubscribeXUpdates(new SubscribeXUpdatesRequest())

let start: number | undefined
let n = 0

stream.on("data", () => {
  if (!start) start = Date.now()
  n++
})

stream.on("end", () => {
  console.error("Stream ended")
  process.exit(1)
})

stream.on("error", (error) => {
  console.error("Stream error", error)
  process.exit(1)
})

setTimeout(() => {
  if (!start) {
    console.error("No data received from the stream.")
    process.exit(1)
  }
  const msElapsed = Date.now() - start
  const sElapsed = msElapsed / 1000
  const rate = n / sElapsed
  console.log("intake rate:", rate, "transactions/sec")
  process.exit(0)
}, 30 * 1000)
```
Results after 30 seconds:
```
intake rate: 223.4826808496314 transactions/sec
```
While we acknowledge the challenge in replicating this specific scenario due to our event provider's closed-source nature, we can offer private access to our GRPC server endpoint and our protocol definitions for deeper investigation. Unfortunately, our ability to share further details is limited under these circumstances.
Environment
- Operating System: Windows 11 (16c/32t, 64GB RAM)
- Node.js Version: 20.10.1
- Also appears on Ubuntu 22 server (160 cores, 350GB RAM), with same node version.
- Docker is not being used.
Additional context
If this module is inherently incapable of handling such high throughput, we suggest adding a disclaimer to the documentation so that users with similar requirements are forewarned and can avoid running into the same issues.