Kafka trigger poor performance in queue mode
Describe the bug I'm getting poor performance on n8n even with queue mode, because triggers are only processed by the main process.
To Reproduce Steps to reproduce the behavior:
- run
sudo docker run --env-file=.env --rm --name=n8n-main -p 80:5678 registry.gitlab.com/nepuntobiz/nemobile/n8n:develop
- run
sudo docker run --env-file=.env -d --rm --name=n8n-webhook -p 5678:5678 registry.gitlab.com/nepuntobiz/nemobile/n8n:develop n8n webhook
- run
sudo docker run --env-file=.env -d --rm --name=n8n-worker registry.gitlab.com/nepuntobiz/nemobile/n8n:develop n8n worker --concurrency=32
- add a kafka node and push to kafka 50k records
Expected behavior Spread the tasks over the 32 worker processes instead of running everything in the main process (where the UI is).
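For reference, the .env passed via --env-file above would contain the queue-mode settings roughly along these lines (hostnames, credentials and the key below are placeholders, not the real values):

# sketch of the queue-mode settings used above (all values are placeholders)
EXECUTIONS_MODE=queue
QUEUE_BULL_REDIS_HOST=redis
QUEUE_BULL_REDIS_PORT=6379
DB_TYPE=postgresdb
DB_POSTGRESDB_HOST=postgres
DB_POSTGRESDB_DATABASE=n8n
DB_POSTGRESDB_USER=n8n
DB_POSTGRESDB_PASSWORD=changeme
# shared by the main, webhook and worker containers
N8N_ENCRYPTION_KEY=some-shared-key

The same file is passed to the main, webhook and worker containers so they share the Redis queue, the database and the encryption key.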
Hello @mdbetancourt, sorry for the performance issues.
There are actually 2 stages to the Kafka trigger. Because it requires a persistent connection to Kafka, all 50k executions are started from n8n's main process - this is a limitation of "non-HTTP" triggers in n8n.
The main process is the only one that can start those executions. Once started, they should be handed off to worker processes for the remainder of the execution (all other nodes) and then returned to the main process for wrap-up.
Is this the behavior you are seeing? If so, unfortunately this is by design, and we know it is a scaling issue at the moment that we are looking to address in the future.
@krynble yes it is, this issue even makes the main process crash and hang up.
After a while I get this:
execution time: 2381
query is slow: INSERT INTO "public"."execution_entity
.....
....
<--- Last few GCs --->
[7:0x7f0db01553e0] 298278 ms: Mark-sweep 4033.4 (4138.7) -> 4016.2 (4137.2) MB, 2132.1 / 0.2 ms (average mu = 0.165, current mu = 0.125) task scavenge might not succeed
[7:0x7f0db01553e0] 300791 ms: Mark-sweep 4032.4 (4137.2) -> 4018.1 (4138.9) MB, 2332.5 / 0.1 ms (average mu = 0.120, current mu = 0.072) allocation failure scavenge might not succeed
<--- JS stacktrace --->
FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
Ah yes, I am sorry to hear that. You can temporarily tune some of the settings on the Kafka Trigger node, such as the maximum number of requests, as shown below.
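In an exported workflow, that node configuration would look roughly like this sketch (the topic, group ID and values are placeholders, and the exact option keys may differ slightly between n8n versions):

{
  "name": "Kafka Trigger",
  "type": "n8n-nodes-base.kafkaTrigger",
  "typeVersion": 1,
  "parameters": {
    "topic": "my-topic",
    "groupId": "n8n-consumer",
    "options": {
      "maxInFlightRequests": 1,
      "sessionTimeout": 30000
    }
  }
}

Lowering the maximum number of in-flight requests throttles how quickly messages are pulled in, which should ease memory pressure on the main process.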
This might help improve reliability at the cost of reduced throughput.
Hey @mdbetancourt,
As this is something we are aware of and is by design, with improvements planned for the future, I am going to mark this one as closed; if needed, we can reopen it later. For now, I suspect the setting in the node, while not ideal, will provide a way to work around it.
Let me know if you have any questions on this one.