renku-python icon indicating copy to clipboard operation
renku-python copied to clipboard

properly use Redis sentinel in core service

Open Panaetius opened this issue 2 years ago • 4 comments

We currently only ask Sentinel once for a master address here and then use that to create a normal Redis connection here. This connection is not Sentinel-aware and if Redis master changes, core-service doesn't work anymore until it is restarted.

We should change the code so if REDIS_IS_SENTINEL==true, we do something like

sentinel = redis.SentinelManager((REDIS_HOST, REDIS_PORT), **config)
cache = sentinel.master_for("renku") # change to appropriate app name
model_db = Database(connection_pool=cache.connection_pool) # maybe also pass **config?

in BaseCache

Also check about changing the chart so we can pass more than one sentinel address for the REDIS_HOST/REDIS_PORT above.

Panaetius avatar Nov 14 '22 13:11 Panaetius

I think it might make sense to add REDIS_SENTINEL_HOSTS in the helm chart which could be in a format such as that shown here:

https://github.com/exponea/redis-sentinel-url#basic-usage

seanrmurphy avatar Nov 14 '22 13:11 seanrmurphy

Sentry issue: RENKU-PYTHON-KX

sentry-dev-app[bot] avatar Nov 14 '22 14:11 sentry-dev-app[bot]

Check with YAT on whether we have an agreement on where/how the list of hosts is defined in the helm charts.

Perhaps it is already at .global.redis.senitinelHosts or something like that. Until something like this is available this is blocked because we probably do not have enough information to make the list of sentinel hosts.

olevski avatar Nov 23 '22 14:11 olevski

This has not been defined as yet. My steps are as follows:

  • figure out how to remove redis from gitlab subchart (currently wip)
  • add new helm chart and values to renku helm chart after step above has been completed - for me that is a better basis for proceeding as I think the sentinel stuff is more straightforward there (pods are called node-0|1|2 and there is no notion of a master)
  • figure out how to deal with modifying the sentinel config

Regarding where this lives in the main renku chart, I suggested global.redis.sentinel.sentinelList as the place where sentinel hosts should be defined. I also suggested a format here.

I did not think about how this maps to the specific input to renku python.

I would also suggest that the behaviour should be a small delta: if sentinel is enabled and a list of sentinels is defined, then they should be picked up; otherwise do as before.

One final point: interacting with sentinel behind a Kubernetes service does work and giving a sentinel client the name of the service (and not an explicit list of sentinel pods) does work; however, I feel somewhat uncomfortable with this and would prefer to use a more explicit mechanism in which sentinel pods are listed. My reasoning there is that I can easily understand what is going on - adding the service abstraction with some round robin mechanism looks to me like a place where there could be strange failure scenarios.

The main reason I note this is that we can deploy the new redis helm chart with no master, renku services will continue to point to the renku-redis Kubernetes service and things will basically work (I've already tested this). Adding new sentinel configuration can be done subsequently and should result in greater reliability.

seanrmurphy avatar Nov 23 '22 15:11 seanrmurphy

Coordinate with @seanrmurphy

Panaetius avatar Jan 30 '23 13:01 Panaetius