ioredis icon indicating copy to clipboard operation
ioredis copied to clipboard

Retrying Redis connection attempts keeps going to infinite when MaxRetriesPerRequestError is handled

Open kapalkat opened this issue 4 years ago • 19 comments

ioredis doesn't emit MaxRetriesPerRequestError when Redis error is down!~

ioredis ver: 4.17.3

This is not working in my production APP so I have prepared small test project to check if I am able to repeat the same and indeed I can. Code:

const Redis = require("ioredis");

function connectToRedis(redisProps) {
  const redisDefaultProps = {
    host: "127.0.0.1",
    port: "6379",
    db: 1,
    maxRetriesPerRequest: 20,
    retryStrategy(times) {
      console.warn(`Retrying redis connection: attempt ${times}`);
      return Math.min(times * 500, 2000);
    },
  };

  const g_redis = new Redis({ ...redisDefaultProps, ...redisProps });

  g_redis.on("connecting", () => {
    console.log("Connecting to Redis.");
  });
  g_redis.on("connect", () => {
    console.log("Success! Redis connection established.");
  });
  g_redis.on("error", (err) => {
    if (err.code === "ECONNREFUSED") {
      console.warn(`Could not connect to Redis: ${err.message}.`);
    } else if (err.name === "MaxRetriesPerRequestError") {
      console.error(`Critical Redis error: ${err.message}. Shutting down.`);
      process.exit(1);
    } else {
      console.error(`Redis encountered an error: ${err.message}.`);
    }
  });
}

connectToRedis();

I have tried to setup different versions of Redis and the problem appears no matter which I use. I have also tried to setup Redis on different hosts and I get the same error.

Scenario:

  1. Start Redis server
  2. Start the app with node index.js
  3. Connection is established
  4. Go to Redis server host and shutdown the Redis process with 'kill -9 Redis_PID' or stop the Redis service with 'sudo systemctl stop redis' (CentOS 7.5).
  5. ioredis detects following and is starting retryStrategy described in above code:
Retrying redis connection: attempt 18
Could not connect to Redis: connect ECONNREFUSED 127.0.0.1:6379.

Bug: The strategy keeps going. No MaxRetriesPerRequestError is emitted. The app is not stopped.

Could not connect to Redis: connect ECONNREFUSED 127.0.0.1:6379.
Retrying redis connection: attempt 27

Expected behavior: retryStrategy reaches maxRetriesPerRequest limit and emits MaxRetriesPerRequestError.

kapalkat avatar Jun 05 '20 09:06 kapalkat

I have found next test when MaxRetriesPerRequestError is not thrown. Start Redis as docker container. To simulate its error stop the container. ioredis is throwing following error: getaddrinfo ENOTFOUND redis redis:6379. RetryStrategy IS started, when it reaches maxRetriesPerRequest + 1 nothing happens. No MaxRetriesPerRequestError error is thrown.

I am starting to wonder if there is ANY scenario when this error is thrown...as I couldn't find one yet.

kapalkat avatar Jun 10 '20 10:06 kapalkat

@kapalkat

Try set the lazyConnect: true,

this.redis = new Redis({ db: 0, host: process.env.REDIS_HOST || 'localhost', port: process.env.REDIS_PORT || 6379, lazyConnect: true, });

after remake your test.

AhCamargo avatar Jul 08 '20 19:07 AhCamargo

Hey @AhCamargo

Tried your connection configuration and found no positive results; still the same issue.

kapalkat avatar Oct 15 '20 12:10 kapalkat

Hey @luin could you please take a look on that issue? It's quite serious as if Redis is essential for your microservices to work and you are not able to discover that the maxRetries is reached, you can't trigger you main app to shut down. It's really easy reproducible; required .js file provided in bug report.

kapalkat avatar Oct 15 '20 12:10 kapalkat

@kapalkat i am not an expert in this but i think maxRetriesPerRequest means requests to redis such as get request etc... I believe connecting to redis is not treated as request and only relies on retryStrategy function so you might want to change that function so that when attempts === 20 you return nothing.

goriunov avatar Dec 03 '20 23:12 goriunov

Hey @goriunov , thanks for your reply but the naming of the connection parameter and the error says something else: We have: maxRetriesPerRequest for a parameter and MaxRetriesPerRequestError for the error which does not appear. So if I clearly see that the counter (maxRetriesPerRequest) is decremented every connection attempt I expect that tthe lib will throw an error when it reaches defined number. Naturally I would expect that this error is: MaxRetriesPerRequestError

kapalkat avatar Dec 04 '20 12:12 kapalkat

There is indeed something fundamentally wrong around ioredis connection.

Following this sample

    try {
        const redisClient = new Redis({
            lazyConnect: true,
            connectTimeout: 5000,
            maxRetriesPerRequest: 3,
        });
        await redisClient.connect();

    } catch (e) {
        console.log('redis connection error', e);
    }

If there is an error with the connection to the redis server then it will fall in the catch statement, then it will keep looping forever throwing unhandled errors. Making an async application looping forever. The only way to quit the app at this point is to call process.exit. That makes error management with this library very hard and unreliable.

gsouf avatar Feb 17 '21 15:02 gsouf

Hey @luin any news on this? Did you have a chance to take a look on this issue?

kapalkat avatar Mar 10 '21 14:03 kapalkat

@kapalkat I'm using the quit method when the maximum number of retries is reached, that cancels the reconnect.

Example:

const connection = new Redis('redis://localhost:6379', { maxRetriesPerRequest: 3 })
connection.on('connect', () => console.log('connected'))
connection.on('disconnect', () => console.log('disconnected'))
connection.on('error', (error) => console.warn('got an error from redis', error))

connection.lpush('some-key', 'some-value').catch((e) => {
  if (e.name === 'MaxRetriesPerRequestError')
    connection.quit().then(() => {
      console.error('stopping due to too many retries on redis connection')
      process.exit(1)
    })
})

example output:

$ node test.js 
got an error from redis Error: connect ECONNREFUSED 127.0.0.1:6379
    at TCPConnectWrap.afterConnect [as oncomplete] (node:net:1138:16) {
  errno: -111,
  code: 'ECONNREFUSED',
  syscall: 'connect',
  address: '127.0.0.1',
  port: 6379
}
got an error from redis Error: connect ECONNREFUSED 127.0.0.1:6379
    at TCPConnectWrap.afterConnect [as oncomplete] (node:net:1138:16) {
  errno: -111,
  code: 'ECONNREFUSED',
  syscall: 'connect',
  address: '127.0.0.1',
  port: 6379
}
got an error from redis Error: connect ECONNREFUSED 127.0.0.1:6379
    at TCPConnectWrap.afterConnect [as oncomplete] (node:net:1138:16) {
  errno: -111,
  code: 'ECONNREFUSED',
  syscall: 'connect',
  address: '127.0.0.1',
  port: 6379
}
got an error from redis Error: connect ECONNREFUSED 127.0.0.1:6379
    at TCPConnectWrap.afterConnect [as oncomplete] (node:net:1138:16) {
  errno: -111,
  code: 'ECONNREFUSED',
  syscall: 'connect',
  address: '127.0.0.1',
  port: 6379
}
stopping due to too many retries on redis connection

pvanagtmaal avatar Mar 17 '21 13:03 pvanagtmaal

Hey @pvanagtmaa, thanks for your reply. So you are saying that this error is not emitted automatically via retry strategy when retry counter reaches 0, but rather retry strategy is ONLY taking care of the retry counter and if it is 0 and then if I try to set/get something from redis, I get this error?

kapalkat avatar Mar 17 '21 13:03 kapalkat

@kapalkat by default yes, see https://github.com/luin/ioredis/blob/279f67eb18e6c20dc7d461fe6890ac22f63cd0fa/lib/redis/RedisOptions.ts#L40-L42

pvanagtmaal avatar Mar 17 '21 13:03 pvanagtmaal

@pvanagtmaal yee sorry I was not precise....

here "is not emitted automatically via retry strategy when retry counter reaches 0" I meant "is not emitted automatically when retry counter reaches 0". It doesn't need to be "via" retry strategy: counter "0" --> emit "MaxRetriesPerRequestError". I never see that error here:

g_redis.on("error", (err) => {
  if (err.name === "MaxRetriesPerRequestError") {
    console.error(`Critical Redis error: ${err.message}. Shutting down.`);
    process.exit(1);
  } else {
    console.error(`Redis encountered an error: ${err.message}.`);
  }
});

kapalkat avatar Mar 17 '21 13:03 kapalkat

I'm having a similar issue, and I've been able to track it down to two causes:

When debugging (either with DEBUG=* or --inspect) in vscode, the library works nearly as expected with incrementing backoffs. Every once in a while, it will trigger a race: image

Note that this is the 124th retry. Using the full power of my workstation, this gets triggered more often than not. Continuing from this point resets the retryAttempt counter.

I'm running BullMQ coupled onto an express application, and if the auth information is incorrect, our logs will get spammed with WRONGPASS invalid username-password pair or user is disabled. messages. My use case is that I would like BullMQ (and by extension ioredis) to gracefully degrade (i.e., warn and not work) if there is an authentication issue, leaving my primary service unaffected.

Regarding the first issue, when there is an error, the stream is ended which triggers a close event. That, in turn, spawns another connection request.

For reference, this is the test code I'm running:

require('dotenv').config();
const util = require('util');
const Redis = require('ioredis');

let count = 0;
const onError = (error) => {
  console.error(util.format('[%d] %s', count++, error));
  // console.log(error);
  // process.exit(1);
};

new Redis({
  host: process.env.REDIS_HOST,
  port: process.env.REDIS_PORT,
  username: process.env.REDIS_USERNAME,
  password: process.env.REDIS_PASSWORD,
  reconnectOnError(e) {
    return !e.message.includes('WRONGPASS');
  },
}).on('error', onError);

phatj avatar Mar 23 '21 22:03 phatj

Hi @phatj, Thanks for the details!

a race condition that will sometimes move from the connect to ready state, thereby resetting the retryAttempts to 0

The link is pointed to https://github.com/luin/ioredis/blob/master/lib/redis/index.ts#L385, which is an error listener. Not sure why that can move the state from connect to ready. Is the link correct?

Also, regarding your use case, I don't think reconnectOnError is necessary given by default if the password is wrong, ioredis will not enter ready state, and will reconnect.

luin avatar Mar 24 '21 04:03 luin

Any updates here? I'm facing the same issue. 😔

barbarafarias avatar Jun 01 '21 15:06 barbarafarias

Me too facing the same issue

lakshmipriyamukundan avatar Jul 29 '21 06:07 lakshmipriyamukundan

There is indeed something fundamentally wrong around ioredis connection.

Following this sample

    try {
        const redisClient = new Redis({
            lazyConnect: true,
            connectTimeout: 5000,
            maxRetriesPerRequest: 3,
        });
        await redisClient.connect();

    } catch (e) {
        console.log('redis connection error', e);
    }

If there is an error with the connection to the redis server then it will fall in the catch statement, then it will keep looping forever throwing unhandled errors. Making an async application looping forever. The only way to quit the app at this point is to call process.exit. That makes error management with this library very hard and unreliable.

@luin are there any plans to fix this? I guess connect should take care of retry, then throw when retry failed?

artur-ma avatar Oct 11 '21 10:10 artur-ma

Use this it worked for me retryStrategy(times) { console.log('retrying redis connection'); if (times > 10) { console.log(lost connection and exhausted attempts : ${times}); // End reconnecting with built in error return undefined; } // reconnect after return Math.min(times * 600, 6000); } };

reetusoodverma avatar Feb 21 '22 19:02 reetusoodverma

Any Solution for this issue, I'm also facing this issue.

pallavsharma avatar May 26 '22 09:05 pallavsharma

Hey guys 👋,

Sorry for the late response. Not sure why I missed the notifications. maxRetriesPerRequest is for commands, so if you set maxRetriesPerRequest to 20, and send a command with redis.set('foo', 'bar') when the server is down. ioredis will retry the command for 20 times until it returns a rejected promise. The option has nothing to do with how many retry attempts we want to make.

If you want to add a limit to retry, you can return a null in retryStrategy:

    retryStrategy(times) {
      if (times > 20) return null; // return null to stop retrying
      return Math.min(times * 500, 2000);
    },

Please refer to https://github.com/luin/ioredis#auto-reconnect for details. I'm closing now, but feel free to open a new open if there are other issues. Thanks for that!

luin avatar Feb 21 '23 10:02 luin