Redis connection lost and command aborted. code: 'UNCERTAIN_STATE'

Open pallavi2209 opened this issue 7 years ago • 11 comments

We have a Node.js app deployed on Heroku, and we use kue to run some background jobs. It usually works fine, but a few days ago we saw the following errors on our production server:

Jul 10 11:19:23 refocus app/web.6: Error removing 4628051 { AbortError: Redis connection lost and command aborted. It might have been processed. 
Jul 10 11:19:23 refocus app/web.6:     at RedisClient.flush_and_error (/app/node_modules/kue/node_modules/redis/index.js:357:23) 
Jul 10 11:19:23 refocus app/web.6:     at RedisClient.connection_gone (/app/node_modules/kue/node_modules/redis/index.js:659:14) 
Jul 10 11:19:23 refocus app/web.6:     at Socket.<anonymous> (/app/node_modules/kue/node_modules/redis/index.js:293:14) 
Jul 10 11:19:23 refocus app/web.6:     at Socket.g (events.js:286:16) 
Jul 10 11:19:23 refocus app/web.6:     at emitNone (events.js:91:20) 
Jul 10 11:19:23 refocus app/web.6:     at Socket.emit (events.js:185:7) 
Jul 10 11:19:23 refocus app/web.6:     at endReadableNT (_stream_readable.js:975:12) 
Jul 10 11:19:23 refocus app/web.6:     at _combinedTickCallback (internal/process/next_tick.js:74:11) 
Jul 10 11:19:23 refocus app/web.6:     at process._tickDomainCallback [as _tickCallback] (internal/process/next_tick.js:122:9) 
Jul 10 11:19:23 refocus app/web.6:   code: 'UNCERTAIN_STATE', 
Jul 10 11:19:23 refocus app/web.6:   command: 'EXEC', 
Jul 10 11:19:23 refocus app/web.6:   errors:  
Jul 10 11:19:23 refocus app/web.6:    [ { AbortError: Redis connection lost and command aborted. It might have been processed. 
Jul 10 11:19:23 refocus app/web.6:          at RedisClient.flush_and_error (/app/node_modules/kue/node_modules/redis/index.js:357:23) 
Jul 10 11:19:23 refocus app/web.6:          at RedisClient.connection_gone (/app/node_modules/kue/node_modules/redis/index.js:659:14) 
Jul 10 11:19:23 refocus app/web.6:          at Socket.<anonymous> (/app/node_modules/kue/node_modules/redis/index.js:293:14) 
Jul 10 11:19:23 refocus app/web.6:          at Socket.g (events.js:286:16) 
Jul 10 11:19:23 refocus app/web.6:          at emitNone (events.js:91:20) 
Jul 10 11:19:23 refocus app/web.6:          at Socket.emit (events.js:185:7) 
Jul 10 11:19:23 refocus app/web.6:          at endReadableNT (_stream_readable.js:975:12) 
Jul 10 11:19:23 refocus app/web.6:          at _combinedTickCallback (internal/process/next_tick.js:74:11) 
Jul 10 11:19:23 refocus app/web.6:          at process._tickDomainCallback [as _tickCallback] (internal/process/next_tick.js:122:9) 
Jul 10 11:19:23 refocus app/web.6:        code: 'UNCERTAIN_STATE', 
Jul 10 11:19:23 refocus app/web.6:        command: 'ZREM', 
Jul 10 11:19:23 refocus app/web.6:        args: [Object], 
Jul 10 11:19:23 refocus app/web.6:        position: 0 },  .............

We have seen these 'UNCERTAIN_STATE' errors before as well. I went through other related issues and it seems to be a network problem, but since the error appears to come from kue, I wanted to ask whether there are any options for preventing it in the future.

pallavi2209 avatar Jul 20 '17 21:07 pallavi2209
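As a rough illustration of one option for the question above: kue passes `redis.options` through to node_redis, and node_redis 2.x accepts a `retry_strategy` function that decides how to reconnect. The sketch below only defines such a policy and a helper that flags `UNCERTAIN_STATE` errors. The thresholds, the helper names `retryStrategy` and `isUncertain`, and the commented wiring are assumptions for illustration, not something confirmed in this thread. A reconnect policy does not prevent the error itself; it only controls what happens after the connection drops.

```javascript
// Sketch only: a reconnect policy in the shape node_redis 2.x expects for
// its `retry_strategy` option. The thresholds are illustrative assumptions.
function retryStrategy(options) {
  // Stop reconnecting after one hour of cumulative retry time.
  if (options.total_retry_time > 60 * 60 * 1000) {
    return new Error('Retry time exhausted');
  }
  // Otherwise wait a little longer each attempt, capped at 3 seconds.
  return Math.min(options.attempt * 100, 3000);
}

// True when a command was aborted mid-flight and may or may not have run.
function isUncertain(err) {
  return Boolean(err) && err.code === 'UNCERTAIN_STATE';
}

// Hypothetical wiring (not executed here; requires the kue package):
// const kue = require('kue');
// const queue = kue.createQueue({
//   redis: {
//     host: '127.0.0.1',
//     port: 6379,
//     options: { retry_strategy: retryStrategy },
//   },
// });
// queue.on('error', (err) => {
//   if (isUncertain(err)) {
//     // The command may have been applied; re-check job state before retrying.
//   }
// });
```

Because an `UNCERTAIN_STATE` command may already have been processed, any retry logic built on `isUncertain` should be idempotent or verify the job's state first.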

Got exactly the same issue.

sylvainlap avatar Aug 08 '17 15:08 sylvainlap

Same here

shortcircuit3 avatar Mar 25 '18 19:03 shortcircuit3

Same here:

AbortError: Redis connection lost and command aborted. It might have been processed.
    at RedisClient.flush_and_error (/opt/icbc_eresumen/daemons/node_modules/kue/node_modules/redis/index.js:357:23)
    at RedisClient.connection_gone (/opt/icbc_eresumen/daemons/node_modules/kue/node_modules/redis/index.js:659:14)
    at Socket.<anonymous> (/opt/icbc_eresumen/daemons/node_modules/kue/node_modules/redis/index.js:293:14)
    at Object.onceWrapper (events.js:313:30)
    at emitNone (events.js:111:20)
    at Socket.emit (events.js:208:7)
    at endReadableNT (_stream_readable.js:1055:12)
    at _combinedTickCallback (internal/process/next_tick.js:138:11)
    at process._tickDomainCallback (internal/process/next_tick.js:218:9)
  message:
   { AbortError: Redis connection lost and command aborted. It might have been processed.
    at RedisClient.flush_and_error (/opt/icbc_eresumen/daemons/node_modules/kue/node_modules/redis/index.js:357:23)
    at RedisClient.connection_gone (/opt/icbc_eresumen/daemons/node_modules/kue/node_modules/redis/index.js:659:14)
    at Socket.<anonymous> (/opt/icbc_eresumen/daemons/node_modules/kue/node_modules/redis/index.js:293:14)
    at Object.onceWrapper (events.js:313:30)
    at emitNone (events.js:111:20)
    at Socket.emit (events.js:208:7)
    at endReadableNT (_stream_readable.js:1055:12)
    at _combinedTickCallback (internal/process/next_tick.js:138:11)
    at process._tickDomainCallback (internal/process/next_tick.js:218:9)
     code: 'UNCERTAIN_STATE',
     command: 'BLPOP',
     args: [ 'q:pdf:jobs', 0 ] }

mzalazar avatar Jun 12 '18 19:06 mzalazar

I found a limit on my server (max connections):

root@ar:/var/log# sysctl net.core.somaxconn
net.core.somaxconn = 128

I changed the setting with this command (you can tune the value):

sysctl -w net.core.somaxconn=1024

Then I ran sysctl -p to apply the new settings to the kernel, and I had to restart Redis, of course.

mzalazar avatar Jun 12 '18 20:06 mzalazar

Also getting this issue. Is it a connections count issue?

danielmhanover avatar Mar 18 '19 01:03 danielmhanover

We are having the same issue. Does anybody know whether this is related to the connection count?

elierrgm avatar Jun 24 '20 19:06 elierrgm

Just in case someone runs into the same issue, we were getting these errors

error: Error name: Error. Error message: Redis connection to localhost:6379 failed - read ECONNRESET
error: AbortError: Ready check failed: Redis connection lost and command aborted. It might have been processed.

And in our case, they happened because we had reached the max number of clients allowed at a time.

elierrgm avatar Jun 24 '20 20:06 elierrgm
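One way to check whether a deployment is approaching the client limit described above is to compare `connected_clients` from the Redis `INFO clients` output against the server's `maxclients` setting (10000 by default in stock Redis; hosted plans often set it lower). The sketch below parses `INFO` text, which is a series of `key:value` lines with `#` section headers. The function names `parseInfo` and `clientUsage` and the sample text are made up for illustration.

```javascript
// Parse the text returned by the Redis INFO command into a plain object.
function parseInfo(infoText) {
  const result = {};
  for (const line of infoText.split(/\r?\n/)) {
    // Skip blank lines and section headers such as "# Clients".
    if (line === '' || line.startsWith('#')) continue;
    const idx = line.indexOf(':');
    if (idx === -1) continue;
    result[line.slice(0, idx)] = line.slice(idx + 1);
  }
  return result;
}

// Fraction of the configured client limit that is currently in use.
function clientUsage(infoText, maxClients) {
  const connected = Number(parseInfo(infoText).connected_clients);
  return connected / maxClients;
}

// Illustrative sample; real output contains more fields.
const sampleInfo = '# Clients\r\nconnected_clients:9500\r\nblocked_clients:3\r\n';
const usage = clientUsage(sampleInfo, 10000);
if (usage > 0.9) {
  console.log('Warning: close to the Redis client limit');
}
```

With node_redis, the input text would come from something like `client.info('clients', callback)`. Some managed Redis services disable `CONFIG GET maxclients`, in which case the limit has to be read from the plan's documentation.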

Hi, I have a project on GC with Redis/NodeJS and we are having the same issue. On some days we have around 10,000 clients connected and see errors like the ones @elierrgm described. Did you speak with GC support? What did they tell you? Is there a fix for this?

Thank you very much for any info you can give us.

ehelgueroredk avatar Oct 13 '20 14:10 ehelgueroredk

Hi ehelgueroredk, what solved the issue for us was reusing connections instead of creating a new one every time a user makes a request.

elierrgm avatar Oct 15 '20 15:10 elierrgm
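A minimal sketch of the connection-reuse idea described above, assuming a single Node.js process: wrap the client factory so it runs once, and have every request handler call the wrapper instead of `createClient`. The helper name `shareClient` and the commented wiring are hypothetical and do not come from the commenter's code.

```javascript
// Wrap any client factory so that it runs at most once per process.
function shareClient(createClient) {
  let client = null;
  return function getClient() {
    if (client === null) {
      client = createClient(); // the first caller opens the connection
    }
    return client; // later callers reuse the same connection
  };
}

// Hypothetical wiring (not executed here; requires the redis package):
// const redis = require('redis');
// const getRedis = shareClient(() => redis.createClient(process.env.REDIS_URL));
//
// app.get('/items', (req, res) => {
//   getRedis().get('items', (err, value) => res.send(value));
// });
```

Each process still holds its own connection, so the total client count scales with the number of processes or dynos rather than with the number of requests.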

Hmmm, so the problem was in the code. And how do you do that? I mean, I have this code in a file that is imported by all my routes:

import redis from 'redis';
import { promisify } from 'util';
import logger from '../../../config/logger';

const client = redis.createClient(process.env.REDIS_URL);

client.on('error', err => {
  logger.error(`Redis error ${err}`);
});

client.on('connect', () => {
  logger.info('Redis connected');
});

export default {
  ...client,
  getAsync: promisify(client.get).bind(client),
  setAsync: promisify(client.set).bind(client),
  keysAsync: promisify(client.keys).bind(client),
  expireAsync: promisify(client.expire).bind(client),
  ttlAsync: promisify(client.ttl).bind(client),
};

Thank you very much in advance.

ehelgueroredk avatar Oct 15 '20 15:10 ehelgueroredk
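A note on the module above, offered as a sketch rather than a diagnosis: Node evaluates a module once per process and caches it, so a file like this normally creates one client per process no matter how many routes import it. One thing that may be worth checking is the `...client` spread. Object spread copies only an object's own enumerable properties, so methods defined on a prototype are not carried over. The example uses a stand-in class, `FakeClient`, which is hypothetical, and assumes the real client's commands live on its prototype the way class methods do.

```javascript
// Stand-in for a client whose commands are prototype methods.
class FakeClient {
  constructor() {
    this.connected = true; // own property: survives a spread
  }
  get(key) {
    return `value-of-${key}`; // prototype method: lost by a spread
  }
}

const client = new FakeClient();

// Spread copies own enumerable properties only.
const spreadCopy = { ...client };
console.log(spreadCopy.connected);  // own property is copied
console.log(typeof spreadCopy.get); // the method is missing

// Exposing the client itself alongside bound helpers keeps everything reachable.
const wrapper = {
  client,
  get: client.get.bind(client),
};
console.log(wrapper.get('a'));
```

Exporting the original client object next to the promisified helpers avoids relying on what the spread happens to copy.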

I don't know if this will work for production, but for me I just had to go into my Redis folder (redis-6.0.10) and run the command that the TLS.md file suggests:

make BUILD_TLS=yes

then make test

To clean everything after you've made changes, just run:

make distclean

Here's the notes on running manually from the TLS.md file:

Running manually

To manually run a Redis server with TLS mode (assuming gen-test-certs.sh was invoked so sample certificates/keys are available):

./src/redis-server --tls-port 6379 --port 0 \
    --tls-cert-file ./tests/tls/redis.crt \
    --tls-key-file ./tests/tls/redis.key \
    --tls-ca-cert-file ./tests/tls/ca.crt

To connect to this Redis server with redis-cli:

./src/redis-cli --tls \
    --cert ./tests/tls/redis.crt \
    --key ./tests/tls/redis.key \
    --cacert ./tests/tls/ca.crt

This will disable TCP and enable TLS on port 6379. It's also possible to have both TCP and TLS available, but you'll need to assign different ports.

To make a Replica connect to the master using TLS, use --tls-replication yes, and to make Redis Cluster use TLS across nodes use --tls-cluster yes.

This is the section on building Redis from the README.md:

Building Redis

Redis can be compiled and used on Linux, OSX, OpenBSD, NetBSD, FreeBSD. We support big endian and little endian architectures, and both 32 bit and 64 bit systems.

It may compile on Solaris derived systems (for instance SmartOS) but our support for this platform is best effort and Redis is not guaranteed to work as well as in Linux, OSX, and *BSD.

It is as simple as:

% make

To build with TLS support, you'll need OpenSSL development libraries (e.g. libssl-dev on Debian/Ubuntu) and run:

% make BUILD_TLS=yes

To build with systemd support, you'll need systemd development libraries (such as libsystemd-dev on Debian/Ubuntu or systemd-devel on CentOS) and run:

% make USE_SYSTEMD=yes

To append a suffix to Redis program names, use:

% make PROG_SUFFIX="-alt"

You can run a 32 bit Redis binary using:

% make 32bit

After building Redis, it is a good idea to test it using:

% make test

If TLS is built, running the tests with TLS enabled (you will need tcl-tls installed):

% ./utils/gen-test-certs.sh
% ./runtest --tls

If you are not using TLS here's how to run Redis with a default configuration:

Running Redis

To run Redis with the default configuration, just type:

% cd src
% ./redis-server

If you want to provide your redis.conf, you have to run it using an additional parameter (the path of the configuration file):

% cd src
% ./redis-server /path/to/redis.conf

It is possible to alter the Redis configuration by passing parameters directly as options using the command line. Examples:

% ./redis-server --port 9999 --replicaof 127.0.0.1 6379
% ./redis-server /etc/redis/6379.conf --loglevel debug

All the options in redis.conf are also supported as options using the command line, with exactly the same name.

Hope that helps !

cpalmer-ios avatar Jan 29 '21 13:01 cpalmer-ios