Support time partitioned store gateway

Open yeya24 opened this issue 1 year ago • 5 comments

Is your feature request related to a problem? Please describe. Thanos supports time-based partitioning (https://thanos.io/tip/components/store.md/#time-based-partitioning) on Store Gateway. This is a good way to do hot/cold data tiering, because relatively recent data (within 1 month) in the store gateway is accessed much more often than older data (older than 1 month).

Time-partitioned store gateways can be configured with different replication factors and other settings, which helps reduce store gateway (SG) memory consumption. For example, hot data can use a larger replication factor (RF) and cold data a smaller one.

Describe the solution you'd like. There are at least two ways to do this, and both require non-trivial code changes.

  1. Querier supports multiple Store Gateway rings and routes requests accordingly.
  2. Querier still uses one SG ring, but X% of the SGs in the ring are hot SGs and the rest are cold SGs. The tiers are effectively two subrings, and blocks are assigned to a subring based on their time range and block ID hash.
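As a rough illustration of option 2, the sketch below picks a subring from the block's age and then picks owners from the block ID hash. All names here (`TieredRing`, `Block`, `Owners`, the instance names) are hypothetical and not part of Cortex. It also uses a plain modulo over the instance list, whereas a real ring would use token-based consistent hashing, so treat it as a model of the idea and not as a proposed implementation.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// Block is a simplified, hypothetical stand-in for a block's metadata.
type Block struct {
	ID   string // block ID (a ULID in practice)
	MaxT int64  // timestamp of the newest sample, in milliseconds
}

// TieredRing models one ring split into a hot and a cold subring.
// The field names and the modulo-based placement are assumptions made for
// this sketch; they do not mirror the Cortex ring implementation.
type TieredRing struct {
	Hot, Cold   []string // instance names in each subring
	HotWindowMs int64    // blocks newer than this are considered hot
	HotRF       int      // replication factor for hot blocks
	ColdRF      int      // replication factor for cold blocks
}

// Owners returns the instances that should load the block.
// The subring is chosen by block age, the position inside it by ID hash.
func (r TieredRing) Owners(b Block, nowMs int64) []string {
	instances, rf := r.Cold, r.ColdRF
	if nowMs-b.MaxT <= r.HotWindowMs {
		instances, rf = r.Hot, r.HotRF
	}
	if len(instances) == 0 || rf <= 0 {
		return nil
	}
	if rf > len(instances) {
		rf = len(instances)
	}

	h := fnv.New32a()
	h.Write([]byte(b.ID))
	start := int(h.Sum32() % uint32(len(instances)))

	// Take rf consecutive instances, wrapping around the subring.
	owners := make([]string, 0, rf)
	for i := 0; i < rf; i++ {
		owners = append(owners, instances[(start+i)%len(instances)])
	}
	return owners
}

func main() {
	const dayMs = int64(24 * 60 * 60 * 1000)
	now := 400 * dayMs

	ring := TieredRing{
		Hot:         []string{"sg-hot-1", "sg-hot-2", "sg-hot-3"},
		Cold:        []string{"sg-cold-1", "sg-cold-2"},
		HotWindowMs: 30 * dayMs,
		HotRF:       3,
		ColdRF:      1,
	}

	recent := Block{ID: "block-recent", MaxT: now - 2*dayMs}
	old := Block{ID: "block-old", MaxT: now - 90*dayMs}

	fmt.Println("recent block owners:", ring.Owners(recent, now))
	fmt.Println("old block owners:", ring.Owners(old, now))
}
```

One consequence this model makes visible: a block's owners change when it ages out of the hot window, so blocks would have to move from hot to cold instances over time.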

yeya24 avatar Oct 03 '23 06:10 yeya24

@yeya24 are you working on this?

Akhil-2001 avatar Oct 12 '23 17:10 Akhil-2001

@Akhil-2001 I am not. If you are interested feel free to work on it!

yeya24 avatar Oct 12 '23 17:10 yeya24

@yeya24 Thanks! If you could point me to how this is done in Thanos, that would give me a base for taking this up. Please also let me know of any design considerations we should look at.

Akhil-2001 avatar Oct 13 '23 11:10 Akhil-2001

Given a query, the querier analyzes the start and end time range to compute a list of all known blocks containing at least 1 sample within this time range. Given the list of blocks, the querier then computes a list of store-gateway instances holding these blocks and sends a request to each matching store-gateway instance asking to fetch all the samples for the series matching the query within the start and end time range.

Hi! If we shard store gateways with time-based partitioning, won't the start and end times of the query automatically route the querier to the store gateways that cover those time ranges? @yeya24
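The two steps in the quoted passage can be sketched roughly as follows: first select the blocks that overlap the query range, then group them by the store gateways that hold them. The types and function names (`BlockMeta`, `blocksForQuery`, `gatewaysForBlocks`) and the `owners` map are hypothetical and only model the description. In this model the routing follows block ownership, so the query's time range narrows the block list but does not by itself choose a gateway.

```go
package main

import "fmt"

// BlockMeta is a hypothetical, simplified view of a block: it covers the
// half-open time range [MinT, MaxT), in milliseconds.
type BlockMeta struct {
	ID         string
	MinT, MaxT int64
}

// blocksForQuery returns the blocks that may contain at least one sample
// within the inclusive query range [start, end].
func blocksForQuery(blocks []BlockMeta, start, end int64) []BlockMeta {
	var out []BlockMeta
	for _, b := range blocks {
		if b.MinT <= end && start < b.MaxT {
			out = append(out, b)
		}
	}
	return out
}

// gatewaysForBlocks groups the selected block IDs by the store-gateway
// instances holding them. The owners map (block ID -> instances) is an
// assumption of this sketch; it stands in for a ring lookup.
func gatewaysForBlocks(selected []BlockMeta, owners map[string][]string) map[string][]string {
	perGateway := make(map[string][]string)
	for _, b := range selected {
		for _, gw := range owners[b.ID] {
			perGateway[gw] = append(perGateway[gw], b.ID)
		}
	}
	return perGateway
}

func main() {
	blocks := []BlockMeta{
		{ID: "A", MinT: 0, MaxT: 100},
		{ID: "B", MinT: 100, MaxT: 200},
		{ID: "C", MinT: 200, MaxT: 300},
	}
	owners := map[string][]string{
		"A": {"sg-1"},
		"B": {"sg-1", "sg-2"},
		"C": {"sg-3"},
	}

	selected := blocksForQuery(blocks, 150, 250)
	// Each entry is one request: the gateway and the blocks it should serve.
	for gw, ids := range gatewaysForBlocks(selected, owners) {
		fmt.Println(gw, "->", ids)
	}
}
```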

pawarpranav83 avatar Jan 04 '24 16:01 pawarpranav83

@pawarpranav83 It is a Thanos feature that is not supported in Cortex yet. It also requires you to maintain two separate fleets of Store Gateways, which might not fit Cortex's architecture.

yeya24 avatar Jan 07 '24 02:01 yeya24