superset icon indicating copy to clipboard operation
superset copied to clipboard

Global Async Queries with Redis Sentinel

Open himanshugarg574 opened this issue 3 years ago • 16 comments

Is your feature request related to a problem? Please describe. We want to use Redis sentinel with Global async query feature. Currently this feature only supports standalone redis instance which then becomes single point of failure.

Describe the solution you'd like We can either pass RedisSentinelCache/RedisCache instance to config or can have additional parameters for supporting sentinel.

Describe alternatives you've considered Standalone Redis works but it is not fault tolerant.

Additional context

himanshugarg574 avatar May 13 '21 12:05 himanshugarg574

any update on this?

radhakvnr avatar Oct 19 '21 06:10 radhakvnr

So, Is Redis sentinel the only or the recommended choice for superset? Can we go with Redis Shard cluster and not sentinel? other than HA is there anything that we miss, for not having Sentinel?

ranjitreddyb avatar Dec 20 '21 14:12 ranjitreddyb

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. For admin, please label this issue .pinned to prevent stale bot from closing the issue.

stale[bot] avatar Apr 17 '22 20:04 stale[bot]

I'm guessing we can close this issue as (a) stale, and (b) a feature request more than a bug. @himanshugarg574 are you still interested in this feature? If so, we can migrate this issue to a feature request Discussion.

CC'ing/assigning @villebro in case he'd like to make a decision on closing/handling this issue, since he's taken an interest in this feature area.

rusackas avatar Jan 18 '23 17:01 rusackas

Sentinel support here would be a great feature, currently unable to use Async queries due to this limitation.

xenago avatar Feb 01 '24 16:02 xenago

Anyone on the thread here willing to contribute this change? This is kind of on the fence between being a bug and a feature request. We're trying to move feature requests to Github discussions so that we have a more actionable backlog of actual bugs here. I can: a) Move this to a Github Discussion if it's a feature request b) leave it open as a bug if it is one ¯\_(ツ)_/¯ c) close it as "not planned" if we don't have any volunteers after it being open for 2.5 years

rusackas avatar Feb 01 '24 16:02 rusackas

IMO it seems like a bug, since Sentinel is otherwise supported with Superset. It took a bit of digging for me to realize that plain redis is hard-coded in the config for async queries

xenago avatar Feb 01 '24 16:02 xenago

Cool... we'll just leave this open for now then, but if you want to open a PR, that would be awesome — I don't suspect it'll get fixed anytime soon otherwise.

rusackas avatar Feb 01 '24 19:02 rusackas

IMO it seems like a bug, since Sentinel is otherwise supported with Superset. It took a bit of digging for me to realize that plain redis is hard-coded in the config for async queries

@xenago oh yeah, I see what you mean. Fixing this should be straight forward. would you mind taking a stab at this? I assume we could just replace GLOBAL_ASYNC_QUERIES_REDIS_CONFIG with an instance of whichever backend the deployment wishes to use (redis.Redis for most cases). @rusackas @michael-s-molina I think this is one of those opportunistic breaking changes I think we should consider including in the scope of 4.0.

villebro avatar Feb 01 '24 19:02 villebro

We're trying to avoid opportunistic breaking changes, but it's now on the board to be considered for 5.0 :) The PR can most certainly be opened now, we'll just put a hold tag on it and have to rebase it until the 5.0 window opens.

rusackas avatar Feb 01 '24 19:02 rusackas

I assume we could just replace GLOBAL_ASYNC_QUERIES_REDIS_CONFIG with an instance of whichever backend the deployment wishes to use (redis.Redis for most cases).

I don't know if my understanding is 100 percent correct, but in redis-py it seems like Sentinel is used to discover the current master and acquire a redis.Redis session for actual use, e.g. master_for. So in a basic scenario, a redis.Redis instance acquired that way could be passed as a direct instance in GLOBAL_ASYNC_QUERIES_REDIS_CONFIG just like a plain redis connection would, but when a failover occurs that wouldn't leave a mechanism for a new client session to be established in its place.

Maybe either an instance of Sentinel or redis.Redis could be passed through as GLOBAL_ASYNC_QUERIES_REDIS_CONFIG, with AsyncQueryManager behaving differently based on the type? If a Sentinel instance was directly provided as GLOBAL_ASYNC_QUERIES_REDIS_CONFIG, then it would subsequently need to acquire a usable redis.Redis, but if a redis.Redis instance was provided as GLOBAL_ASYNC_QUERIES_REDIS_CONFIG then it could be used directly without any extra steps. Later if a communication failure occurs and AsyncQueryManager was set up with Sentinel, then an attempt could be made to establish a connection with a new master, unlike with bare redis.Redis.

xenago avatar Feb 01 '24 21:02 xenago

I may also be hazy on the details, but IIRC, Sentinel is a wrapper that maintains an instance to a Redis instance, which then gets swapped out during failover. For this reason, I think the correct solution would be to make the class configurable, so that GAQ could use any caching backend. Having said that, I seem to recall that the GAQ implementation uses some Redis specific logic, so it may well be that we can't use any arbitrary caching backend, like Memcached. But I would need to check this.

villebro avatar Feb 01 '24 21:02 villebro

Ah ok, thanks. I will test later this week to see if it works with a small patch to pass through a directly instantiated Sentinel

xenago avatar Feb 01 '24 21:02 xenago

A simple change would be adding something like this:

GLOBAL_ASYNC_QUERIES_REDIS_CLASS = redis.Redis

and then later use that, rather than the hardcoded class.

However, IMO the more elegant solution would be make the backend configurable, similar to RESULTS_BACKEND. So something like this:

GLOBAL_ASYNC_QUERIES_BACKEND: BaseCache | None = None

that could then be overridden locally:

GLOBAL_ASYNC_QUERIES_BACKEND = Sentinel(...)

Both should be backwards compatible (in the second option we could just deprecate GLOBAL_ASYNC_QUERIES_REDIS_CONFIG and use it to instantiate Redis if the config is defined).

villebro avatar Feb 01 '24 21:02 villebro

As a quick check, I edited async_query_manager.py to set self._redis directly as whatever was passed as GLOBAL_ASYNC_QUERIES_REDIS_CONFIG. With that change, setting a redis.Sentinel instance equal to that variable in superset_config.py I don't see any errors on startup; however, I also don't see anything special in the logs when e.g. dashboards are loaded. Is there a straightforward way to confirm if global async queries are working?

xenago avatar Feb 07 '24 17:02 xenago

Ah, I spoke too soon. The Sentinel instance doesn't seem to work as a naive drop-in replacement for redis.

'Sentinel' object has no attribute 'xrange'

image

xenago avatar Feb 08 '24 15:02 xenago

@xenago @villebro do we want to keep this open as a Superset bug, or is this a bit of a "feature request" edge case? I'm quite tempted to close this as stale, but wanted to check in with y'all first.

rusackas avatar Jun 03 '24 18:06 rusackas

Feature request sounds appropriate I think

xenago avatar Jun 03 '24 18:06 xenago

Ok, cool... I'll move it to an "Ideas" discussion, which is where these tend to go :D

rusackas avatar Jun 03 '24 20:06 rusackas