Proposal: introduce read-only mode for ingesters

Open pracucci opened this issue 4 years ago • 8 comments

Recently we had to deal with a Cortex outage during which ingesters (running the blocks storage) were failing to compact the head due to an in-memory corruption in TSDB (which has already been fixed in https://github.com/prometheus/prometheus/pull/7560). During the outage we needed to stop ingesting samples on some ingesters while keeping them running for the queriers, so as not to lose series when querying, but unfortunately we couldn't find any way to do it.

As a follow-up action from this outage, I would like to propose introducing in the ring the ability to mark an ingester as read-only. When manually marked as read-only, the ingester is ignored by distributors on the write path, while queriers continue to query it.
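
To make the intended behaviour concrete, here is a minimal sketch in Go. It is not the actual Cortex ring code: `InstanceDesc`, its `ReadOnly` field, the `Operation` type, and `eligible` are all hypothetical names used only for illustration. The only idea it shows is that a read-only ingester is skipped on the write path and still returned on the read path.

```go
package main

import "fmt"

// InstanceDesc is a hypothetical, simplified stand-in for a ring entry.
// The real ring descriptor carries more state (tokens, heartbeat, ...).
type InstanceDesc struct {
	ID       string
	ReadOnly bool // hypothetical flag, set manually by an operator
}

// Operation distinguishes the write path from the read path.
type Operation int

const (
	Write Operation = iota
	Read
)

// eligible returns the IDs of the instances that may serve op.
// Read-only instances are skipped for writes and kept for reads.
func eligible(instances []InstanceDesc, op Operation) []string {
	var ids []string
	for _, inst := range instances {
		if op == Write && inst.ReadOnly {
			continue
		}
		ids = append(ids, inst.ID)
	}
	return ids
}

func main() {
	ring := []InstanceDesc{
		{ID: "ingester-1"},
		{ID: "ingester-2", ReadOnly: true},
		{ID: "ingester-3"},
	}
	fmt.Println("write:", eligible(ring, Write))
	fmt.Println("read: ", eligible(ring, Read))
}
```

In this sketch, distributors would call `eligible(ring, Write)` and queriers `eligible(ring, Read)`. How replication should behave when a replica is skipped on the write path is deliberately left out.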

Thoughts?

pracucci avatar Jul 27 '20 07:07 pracucci

I think this can be useful for operations, and storing a read-only flag in the ring seems logical.

Implementation-wise, as the ring is used by different components (store-gateway, compactor, HA), I'm wondering if we should use more generic "tags" instead, so that each instance can be marked with a list of strings (tags) understood by the components. For ingesters, one such tag could be "read-only", which ingesters and distributors would understand, and the admin UI would simply show it as a string.
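
A rough sketch of the tags variant, again in Go and again with hypothetical names (`TaggedInstance`, `HasTag`, `TagReadOnly`, `writeTargets`), none of which are taken from the Cortex codebase. The ring itself only stores opaque strings; each component decides which tags it understands.

```go
package main

import (
	"fmt"
	"strings"
)

// TagReadOnly is a hypothetical tag that ingesters and distributors
// would agree on; other components would simply ignore it.
const TagReadOnly = "read-only"

// TaggedInstance is a hypothetical ring entry carrying free-form tags.
type TaggedInstance struct {
	ID   string
	Tags []string
}

// HasTag reports whether the instance is marked with the given tag.
func (i TaggedInstance) HasTag(tag string) bool {
	for _, t := range i.Tags {
		if t == tag {
			return true
		}
	}
	return false
}

// writeTargets returns the IDs of the instances a distributor may
// write to, skipping any instance tagged as read-only.
func writeTargets(instances []TaggedInstance) []string {
	var ids []string
	for _, inst := range instances {
		if inst.HasTag(TagReadOnly) {
			continue
		}
		ids = append(ids, inst.ID)
	}
	return ids
}

func main() {
	ring := []TaggedInstance{
		{ID: "ingester-1"},
		{ID: "ingester-2", Tags: []string{TagReadOnly}},
	}
	// An admin UI could render the tags as plain strings.
	for _, inst := range ring {
		fmt.Printf("%s [%s]\n", inst.ID, strings.Join(inst.Tags, ","))
	}
	fmt.Println("write targets:", writeTargets(ring))
}
```

Compared with a dedicated boolean, the tag list keeps the ring schema generic, at the cost of components having to agree on tag strings by convention.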

pstibrany avatar Jul 27 '20 08:07 pstibrany

This issue has been automatically marked as stale because it has not had any activity in the past 60 days. It will be closed in 15 days if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Sep 25 '20 09:09 stale[bot]

Still valid

pracucci avatar Sep 25 '20 10:09 pracucci

Today we had another incident where this feature would have been very useful.

pracucci avatar Oct 01 '20 11:10 pracucci

This issue has been automatically marked as stale because it has not had any activity in the past 60 days. It will be closed in 15 days if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Nov 30 '20 11:11 stale[bot]

still valid

jtlisi avatar Nov 30 '20 14:11 jtlisi

This issue has been automatically marked as stale because it has not had any activity in the past 60 days. It will be closed in 15 days if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Feb 28 '21 16:02 stale[bot]

Still valid

pracucci avatar Mar 01 '21 08:03 pracucci