thanos icon indicating copy to clipboard operation
thanos copied to clipboard

proxy: parallel merge and dedup

Open fpetkovski opened this issue 2 years ago • 2 comments

Is your proposal related to a problem?

Sorting series in the querier seems to be done using a single heap-like structure, which means we can only utilize one core for deduplication.

Describe the solution you'd like

It would be nice to be able to use multiple cores when fetching and deduplicating data in order to speed up query execution. Since the engine does not require series to be sorted, we could dedup using multiple heaps in parallel and iterate through all heaps when expanding series.

The only thing to be careful about is that series which are replicas of each other must end up on the same heap. This can be done by:

  • first removing replica labels from received series
  • hashmoding the rest of the labels
  • putting the series on the heap that corresponds to the value from the hashmod operation

Describe alternatives you've considered

All Thanos stores can now shard series dynamically by or without certain labels. We could therefore make multiple requests to stores, each for one specific shard, where a shard is calculated by hashmoding series without replica labels used in the query.

However, this will put more load on stores, and is not a good strategy when querying Prometheus sidecars.

@GiedriusS let me know if I am missing something obvious here that could be a blocker :)

fpetkovski avatar Sep 13 '22 08:09 fpetkovski

Maybe you could add a small diagram showing how all of this could work? Hard to understand now just from text :smile:

GiedriusS avatar Sep 14 '22 09:09 GiedriusS

Hm hard to explain with a diagram, but let me try with code links.

From what I see, we start a gorouting for each store that we query, and that goroutine returns a single respSet. Could we instead return N respSets, where each respSet contains disjoin series. The way we decide which series ends up in which respSet is by hashmoding series without their replicalabels, so that replicated series always end up in the same respSet. Now we can take these N respSets and build N heaps in parallel.

Hashmoding could happen here when we push a received series into the buffer.

When expanding series in the engine, they don't need to be sorted. So we can simply iterate through each series in each heap in order through some compound iterator. Does that help?

fpetkovski avatar Sep 14 '22 09:09 fpetkovski

Hello 👋 Looks like there was no activity on this issue for the last two months. Do you mind updating us on the status? Is this still reproducible or needed? If yes, just comment on this PR or push a commit. Thanks! 🤗 If there will be no activity in the next two weeks, this issue will be closed (we can always reopen an issue if we need!). Alternatively, use remind command if you wish to be reminded at some point in future.

stale[bot] avatar Nov 13 '22 15:11 stale[bot]

I'll give this a shot in a few weeks.

fpetkovski avatar Nov 14 '22 08:11 fpetkovski