
Better tolerate the redis server down

Open hikalkan opened this issue 6 years ago • 5 comments

We already handle exceptions when we fail to connect to a distributed cache server (see #762). In that case the cache does not work at all and the application always uses the actual data source (for example, by querying the database). This works well if redis-server starts after our service.

However, on every attempt to use the cache, the redis client spends more than 1 second finding out that the server does not respond. If we use the cache heavily, this adds up to too much time. A solution could be to wait ~30 seconds before the next attempt to use the real cache (that is, to disable the cache for ~30 seconds, or for a configurable timeframe).
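A minimal sketch of the cooldown idea above, assuming a small helper that wraps each cache call. `CacheCooldownGate` and `GetOrFallback` are hypothetical names and not part of ABP or StackExchange.Redis. The clock is injected only so the behavior can be exercised without waiting.

```csharp
using System;
using System.Threading;

// Hypothetical helper, not part of ABP or StackExchange.Redis.
// After a cache failure it skips the cache for a cooldown period, so callers
// go straight to the real data source instead of paying for a connection
// attempt on every call.
public sealed class CacheCooldownGate
{
    private readonly TimeSpan _cooldown;
    private readonly Func<DateTime> _utcNow;
    private long _disabledUntilTicks; // 0 means the cache is enabled

    public CacheCooldownGate(TimeSpan cooldown)
        : this(cooldown, () => DateTime.UtcNow)
    {
    }

    public CacheCooldownGate(TimeSpan cooldown, Func<DateTime> utcNow)
    {
        _cooldown = cooldown;
        _utcNow = utcNow;
    }

    // True while the cooldown started by the last failure is still running.
    public bool IsCacheDisabled =>
        _utcNow().Ticks < Interlocked.Read(ref _disabledUntilTicks);

    // Runs cacheCall unless the cache is disabled. On failure, disables the
    // cache for the cooldown period and returns the fallback result instead.
    public T GetOrFallback<T>(Func<T> cacheCall, Func<T> fallback)
    {
        if (IsCacheDisabled)
        {
            return fallback();
        }

        try
        {
            return cacheCall();
        }
        catch (Exception)
        {
            // Assumption: any exception counts as "cache unavailable".
            // Real code would likely catch only connection errors.
            Interlocked.Exchange(
                ref _disabledUntilTicks,
                (_utcNow() + _cooldown).Ticks);
            return fallback();
        }
    }
}
```

With a gate created as `new CacheCooldownGate(TimeSpan.FromSeconds(30))`, the first failed cache call pays the connection cost once, and calls during the following 30 seconds go directly to the fallback. The first call after the cooldown tries the cache again, so the cache comes back on its own once the server is reachable.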

The exception thrown is:

It was not possible to connect to the redis server(s); to create a disconnected multiplexer, disable AbortOnConnectFail. SocketFailure on PING
StackExchange.Redis.RedisConnectionException: It was not possible to connect to the redis server(s); to create a disconnected multiplexer, disable AbortOnConnectFail. SocketFailure on PING
   at StackExchange.Redis.ConnectionMultiplexer.ConnectAsync(String configuration, TextWriter log) in c:\code\StackExchange.Redis\StackExchange.Redis\StackExchange\Redis\ConnectionMultiplexer.cs:line 799
   at Microsoft.Extensions.Caching.Redis.RedisCache.ConnectAsync(CancellationToken token)
   at Microsoft.Extensions.Caching.Redis.RedisCache.GetAndRefreshAsync(String key, Boolean getData, CancellationToken token)
   at Microsoft.Extensions.Caching.Redis.RedisCache.GetAsync(String key, CancellationToken token)
   at Volo.Abp.Caching.DistributedCache`1.GetAsync(String key, Nullable`1 hideErrors, CancellationToken token) in D:\Github\abp\framework\src\Volo.Abp.Caching\Volo\Abp\Caching\DistributedCache.cs:line 97

I don't know (yet) what a "disconnected multiplexer" is or what effect the AbortOnConnectFail option has.

hikalkan avatar Feb 13 '19 07:02 hikalkan

@hikalkan Hi, maybe #797 can resolve this?

hitaspdotnet avatar Feb 13 '19 08:02 hitaspdotnet

It can solve temporary Redis outages, and we will use Polly in ABP. However, the problem here is that "trying to connect to redis" takes too much time and slows down the application.

hikalkan avatar Feb 13 '19 08:02 hikalkan

Similar problems that were resolved:

Stack Overflow: it-was-not-possible-to-connect
GitHub: InternalFailure on PING

hitaspdotnet avatar Feb 13 '19 09:02 hitaspdotnet

How do I get Redis running (ABP 5.1)? This is what I'm getting in the IdentityServer module:

StackExchange.Redis.RedisConnectionException: UnableToConnect on 127.0.0.1:6379/Interactive, Initializing/NotStarted, last: NONE, origin: BeginConnectAsync, outstanding: 0, last-read: 2s ago, last-write: 2s ago, keep-alive: 60s, state: Connecting, mgr: 10 of 10 available, last-heartbeat: never, global: 2s ago, v: 2.2.4.27433

The ABP documentation does not say anything about Redis configuration besides this: [image]

alexandis avatar Mar 29 '22 06:03 alexandis

@alexandis That references the StackExchange.Redis configuration options: https://stackexchange.github.io/StackExchange.Redis/Configuration.html

wub avatar Aug 10 '22 05:08 wub

I'm getting quite a few of these errors in my Azure setup and I'm unsure what to do about them.

No connection is active/available to service this operation: SET AbpInbox_Default; It was not possible to connect to the redis server(s). Error connecting right now. To allow this multiplexer to continue retrying until it's able to connect, use abortConnect=false in your connection string or AbortOnConnectFail=false; in your code. ConnectTimeout, mc: 1/1/0, mgr: 10 of 10 available, clientName: dw0sdwk000K6U, IOCP: (Busy=0,Free=1000,Min=1,Max=1000), WORKER: (Busy=2,Free=1021,Min=1,Max=1023), v: 2.2.4.27433

I don't know (yet) what a "disconnected multiplexer" is or what effect the AbortOnConnectFail option has.

Did you figure this out, @hikalkan? If you are not sure about it, I don't want to add it blindly (but it sounds like it should make the client keep trying to connect until it succeeds).
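For reference, the error message above names the connection-string form of the option. A hedged sketch of how that might look in `appsettings.json`, assuming the `Redis:Configuration` key that the ABP Redis cache integration reads; the endpoint and timeout value are placeholders, not recommendations:

```json
{
  "Redis": {
    "Configuration": "127.0.0.1:6379,abortConnect=false,connectTimeout=1000"
  }
}
```

According to the StackExchange.Redis configuration documentation, `abortConnect=false` lets the multiplexer be created even when no server is reachable and keep retrying in the background, and `connectTimeout` is given in milliseconds. Commands issued while the server is down still fail, so the exception handling from #762 is still needed.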

sturlath avatar Feb 03 '23 11:02 sturlath

Any recommendations, @hikalkan? Abp.io must have some best practice and some (retry?) tolerance mechanism in place (internally)?

sturlath avatar Feb 09 '23 07:02 sturlath