Broker.call waits forever if Redis transporter reconnects while transmitting a message
Prerequisites
Please answer the following questions for yourself before submitting an issue.
- [x] I am running the latest version
- [x] I checked the documentation and found no answer
- [x] I checked to make sure that this issue has not already been filed
- [x] I'm reporting the issue to the correct repository
Current Behavior
`Broker.call` waits forever if the Redis transporter reconnects while transmitting a message. Of course, we have `requestTimeout`, but I don't think this is the correct behavior.
Expected Behavior
The returned `Promise` must be rejected on connection close.
Failure Information
Example here: https://github.com/FFKL/moleculer-redis-freeze
Steps to Reproduce
Please provide detailed steps for reproducing the issue.
- Call a remote action that returns a large buffer via the Redis transporter.
- Redis closes the connection with the message `subscribe scheduled to be closed ASAP for overcoming of output buffer limits`.
- The returned `Promise` will wait forever.
Reproduce code snippet
```js
const { ServiceBroker } = require('moleculer');

const mainBroker = new ServiceBroker({
  logger: console,
  logLevel: 'debug',
  nodeID: 'main',
  transporter: 'redis://localhost:6379',
});

const brokerWithWorker = new ServiceBroker({
  logger: console,
  logLevel: 'debug',
  nodeID: 'with-worker',
  transporter: 'redis://localhost:6379',
});

brokerWithWorker.createService({
  name: 'worker',
  actions: {
    async getBigMessage() {
      return Buffer.alloc(60 * 1024 * 1024);
    },
  },
});

brokerWithWorker.start().then(async () => {
  await mainBroker.start();
  await mainBroker
    .call('worker.getBigMessage')
    // will wait forever!
    .then((res) => mainBroker.logger.info(`Size: ${res.data.length}`))
    .catch((err) => mainBroker.logger.error(`Error: ${err}`));
});
```
Context
Please provide any relevant information about your setup. This is important in case the issue is not reproducible except for under certain conditions.
- Moleculer version: 0.14.18
- Ioredis version: 4.28.0
- NodeJS version: 12.16.3
- Operating System: Ubuntu 20.04
Failure Logs
[2021-11-15T18:53:47.458Z] DEBUG with-worker/TRANSIT: <= Request 'worker.getBigMessage' received from 'main' node.
[2021-11-15T18:53:58.152Z] WARN main/TRANSPORTER: Redis-sub client is disconnected.
[2021-11-15T18:53:58.230Z] INFO main/TRANSPORTER: Redis-sub client is connected.
[2021-11-15T18:53:58.231Z] INFO main/TRANSPORTER: Setting Redis transporter
[2021-11-15T18:53:58.237Z] INFO main/TRANSPORTER: Redis-pub client is connected.
Thanks for the repro repo, it was really helpful to see what's happening under the hood.
Unfortunately, I don't think that it can be fixed on the moleculer side. I'll try to explain what happens:

- The `main` node connects to Redis and calls the `worker`.
- The `worker` sends the "giant message" via the send method https://github.com/moleculerjs/moleculer/blob/b48ec4f3d39951ad433a64c4c36ecdc2af824c24/src/transporters/redis.js#L133-L140
- The message is successfully sent to Redis. Redis does not report any error to the `worker` node, so on the `worker` side everything looks fine.
- Redis receives the message and checks its size. If the message is larger than the `output buffer` limit, Redis closes the connections of all subscribers/consumers.
- The `main` node reconnects to Redis, but it doesn't know anything about the "giant message". Since the message is larger than the `output buffer`, it won't be sent anywhere, and because of this the `main` node gets stuck forever.
The main issue is that Redis does not inform the client about the reason for closing the connection. The reason is only logged on the server side. So even if we "break" the `await mainBroker.call("worker.getBigMessage")`, we don't know the exact cause. What would be the error that moleculer should throw?
I've searched for the `scheduled to be closed ASAP for overcoming of output buffer limits` message. Here's what I found: https://gist.github.com/amgorb/c38a77599c1f30e853cc9babf596d634
The solution is basically to increase the output buffer size.
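For anyone hitting this, here is a minimal sketch of that workaround using `ioredis` (it assumes the server allows runtime `CONFIG` commands; the `256mb 64mb 60` values are purely illustrative and should be sized for your own workload and memory budget):

```js
const Redis = require('ioredis');

async function raisePubsubBufferLimit() {
  const redis = new Redis('redis://localhost:6379');

  // Show the current limits. The pubsub class defaults to "32mb 8mb 60"
  // (hard limit, soft limit, soft limit seconds), so a 60 MB reply
  // immediately trips the hard limit.
  console.log(await redis.config('GET', 'client-output-buffer-limit'));

  // Raise the pub/sub limits so a large payload is not dropped.
  await redis.config('SET', 'client-output-buffer-limit', 'pubsub 256mb 64mb 60');

  redis.disconnect();
}

raisePubsubBufferLimit().catch(console.error);
```

Setting the same value in `redis.conf` instead makes it survive server restarts.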
@AndreMaz Thank you for the explanation. It took me a few hours to figure out that this is a Redis failure, so I decided to open this issue ;) I thought it might be possible to fix this on the moleculer side because the pending `Promise` leaks memory. Can we keep track of such pending Promises and show a warning message, or configure the `Transporter` to reject these Promises?
My solution at the moment (sketched below):
- Use `requestTimeout`
- Increase the output buffer limits
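A minimal sketch of the `requestTimeout` part, as a standalone variant of the `main` broker from the repro above (the 10-second value is arbitrary); with it, a stuck call is rejected with a `RequestTimeoutError` instead of hanging forever:

```js
const { ServiceBroker } = require('moleculer');

const mainBroker = new ServiceBroker({
  nodeID: 'main',
  transporter: 'redis://localhost:6379',
  // Reject pending requests after 10 seconds instead of waiting forever.
  requestTimeout: 10 * 1000,
});

mainBroker.start().then(() =>
  mainBroker
    .call('worker.getBigMessage')
    .then((res) => mainBroker.logger.info(`Size: ${res.data.length}`))
    // A stuck call now ends up here as a RequestTimeoutError.
    .catch((err) => mainBroker.logger.error(`Error: ${err.name} - ${err.message}`))
);
```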
And the main problem is that in my case it is a heisenbug, since Redis also applies a time limit: "a soft limit of 32 megabytes per 10 seconds means that if the client has an output buffer bigger than 32 megabytes for, continuously, 10 seconds, the connection gets closed." So the "giant message" can sometimes be delivered successfully.
> It took me a few hours to figure out that this is a Redis failure so I decided to open this issue ;)

Yeah, it's definitely a tricky situation. Without your repro repo it would have been really difficult to find the issue.

> Can we keep track of such pending Promises and show a warning message or configure Transporter to reject these Promises?

We could track a pending `Promise` by adding a timeout, but that is already done with `requestTimeout`. Do you have any suggestions?
Can you use streams in this case?
@AndreMaz Unfortunately, the transporter sends all chunks to Redis one by one, so in this case nothing changes. I've added the example to the repo.
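For context, a sketch of roughly what the streaming variant looks like (the real example lives in the repro repo; the `stream-worker` service and `getBigStream` action names here are only illustrative), reusing the two brokers from the snippet above. Every chunk still travels through the same Redis pub/sub channel, so the output buffer limit can still be exceeded:

```js
const { Readable } = require('stream');

// Worker side: return a Readable stream instead of one 60 MB Buffer.
brokerWithWorker.createService({
  name: 'stream-worker',
  actions: {
    getBigStream() {
      let sent = 0;
      return new Readable({
        // Emit 60 chunks of 1 MB each, then end the stream.
        read() {
          this.push(sent++ < 60 ? Buffer.alloc(1024 * 1024) : null);
        },
      });
    },
  },
});

// Caller side: the resolved value is a stream that is consumed chunk by chunk.
async function consume() {
  const stream = await mainBroker.call('stream-worker.getBigStream');
  let total = 0;
  stream.on('data', (chunk) => (total += chunk.length));
  stream.on('end', () => mainBroker.logger.info(`Received ${total} bytes`));
}

consume().catch((err) => mainBroker.logger.error(err));
```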
I think `requestTimeout` is not the same as a potential `cutAllPendingRequests` option (it does not reject the pending requests immediately on the `close` event), but it is the only working solution right now.
I'm closing this issue because we don't know how we can solve it inside Moleculer. If you have an idea or a solution, please reopen this issue or open a PR.