
Add target search concurrency to TieredMergePolicy

Open carlosdelest opened this issue 1 year ago • 8 comments

Closes https://github.com/apache/lucene/issues/12877

Adds a minimum number of segments to TieredMergePolicy. This makes it possible to ensure that a minimum level of search concurrency is achievable using inter-segment parallelism.
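
A minimal sketch of how this would be configured, assuming the setter ends up named setTargetSearchConcurrency as discussed below:

```java
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.TieredMergePolicy;

public class ConcurrencyAwareMergePolicyExample {
  public static IndexWriterConfig newConfig() {
    TieredMergePolicy tmp = new TieredMergePolicy();
    tmp.setSegmentsPerTier(10);          // existing knob: segments per tier
    tmp.setMaxMergedSegmentMB(5 * 1024); // existing knob: 5GB merged-segment cap
    tmp.setTargetSearchConcurrency(8);   // the knob this PR proposes
    return new IndexWriterConfig().setMergePolicy(tmp);
  }
}
```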

carlosdelest avatar May 27 '24 18:05 carlosdelest

Hi @jpountz , thanks for your thoughts!

Let me check whether I got your suggestion right:

  • There are at most 2 segment tiers. The first has at most minNumSegments segments; the second has at most minNumSegments-1, to ensure we maintain at most minNumSegments total segments.
  • The maximum number of docs allowed in a segment is (maxDocs - numDeletes) / minNumSegments. That avoids creating segments larger than necessary, so we ensure there are at least minNumSegments segments (see the arithmetic sketch after this list).
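
As a concrete (hypothetical) instance of that formula:

```java
public class DocCapSketch {
  // Illustrates the proposed (maxDocs - numDeletes) / minNumSegments cap;
  // all names and numbers here are invented for the example.
  static long maxDocsPerSegment(long maxDocs, long numDeletes, int minNumSegments) {
    return (maxDocs - numDeletes) / minNumSegments;
  }

  public static void main(String[] args) {
    // 12M docs, 2M deletes, minNumSegments = 10 -> at most 1M docs per segment
    System.out.println(maxDocsPerSegment(12_000_000, 2_000_000, 10)); // 1000000
  }
}
```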

Does it align with your thoughts?

Some questions I have:

  • segsPerTier is not taken into account if minNumSegments is specified, correct?
  • In case segments reach maxMergedSegmentBytes, should we fall back to the previous segsPerTier behaviour?

I see that minNumSegments is not a good name for this approach - I think targetSearchConcurrency is more explicit.

Thanks!

carlosdelest avatar May 29 '24 10:05 carlosdelest

There are at most 2 segment tiers.

Well, there can be more tiers, but since tiers have exponential sizes (e.g. if your merge factor is 10, each tier has segments that are 10x bigger than those of the previous tier), it's almost certainly fine to ignore segments on tiers beyond the 2nd highest tier: they account for very few documents compared with the first two tiers.
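
A quick back-of-the-envelope check of that claim (merge factor and tier count invented):

```java
public class TierFractionSketch {
  public static void main(String[] args) {
    int mergeFactor = 10, tiers = 6;
    double total = 0, beyondTopTwo = 0;
    double size = 1.0; // relative size of a top-tier segment
    for (int t = 0; t < tiers; t++) {
      total += size;
      if (t >= 2) beyondTopTwo += size; // tiers below the top two
      size /= mergeFactor;              // each lower tier is 10x smaller
    }
    // ~1%: tiers beyond the top two barely matter for search concurrency
    System.out.printf("fraction beyond top two tiers: %.2f%%%n", 100 * beyondTopTwo / total);
  }
}
```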

segsPerTier is not taken into account if minNumSegments is specified, correct?

If minNumSegments is bigger than segsPerTier, then the highest tier should indeed allow for up to minNumSegments segments (or maybe minNumSegments-1 if there are multiple tiers, though that may introduce complexity for little value; maybe it's better/easier to keep things simple and always allow for minNumSegments on the highest tier). But lower tiers should still aim for segsPerTier segments.

In case segments reach maxMergedSegmentBytes, should we fall back to the previous segsPerTier behaviour?

I can understand this question in different ways, so I'll try to clarify how I think these two parameters should interact:

  • Even if minNumSegments is configured, we should still honor maxMergedSegmentBytes and not create segments that are bigger than that.
  • If your merge policy has segsPerTier = 10, maxMergedSegmentMB = 5GB, floorSegmentBytes = 100MB, minNumSegments = 12, and the total index size is 15GB, it should aim for allowedSegCount=23 segments (ten 100MB segments ≈ 1GB, plus fourteen 1GB segments). So even though we could have created a segment of the maximum merged size by merging ten 1GB segments, we did not, because we would then have ended up with a segment holding more than maxDoc/12 documents (assuming all docs contribute equally to the index size). A rough restatement in code follows this list.
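
The promised restatement of that arithmetic (a simplification, not TieredMergePolicy's actual allowedSegCount computation):

```java
public class AllowedSegCountSketch {
  public static void main(String[] args) {
    double totalGB = 15.0;
    double floorGB = 0.1; // floorSegmentBytes = 100MB
    int segsPerTier = 10;
    int targetSearchConcurrency = 12;
    // Cap implied by the concurrency target: 15/12 = 1.25GB, which is below
    // the 5GB maxMergedSegmentMB, so it is the binding limit here.
    double capGB = totalGB / targetSearchConcurrency;
    System.out.printf("per-segment cap: %.2fGB%n", capGB);
    // Merging ten 1GB segments would produce a 10GB segment, far above the
    // cap, so top-tier segments stay around 1GB.
    double floorTierGB = segsPerTier * floorGB;              // ~1GB spread over 100MB segments
    long topTier = (long) Math.ceil(totalGB - floorTierGB);  // ~14 segments of ~1GB
    System.out.println(segsPerTier + topTier);               // ~24, in line with the ~23 above
  }
}
```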

jpountz avatar May 29 '24 12:05 jpountz

Thank you for tackling this @carlosdelest! What a hairy challenge ... TMP really is its own little baby chess engine, with many things it is trying to optimize towards (what the ML world now calls "multi-objective"), playing against a worthy opponent (the indexing application that keeps indexing different sized docs, doing some deletions/updates, refresh/commit, etc., producing all kinds of exotic segments). And of course in addition to the app exotically birthing new segments, TMP itself is causing new big segments to appear when each merge completes! This really is a game theory problem, heh.

+1 for the name targetSearchConcurrency -- it makes it clear this is about optimizing the "inter-segment concurrent search latency" in the resulting index, and, I think it also makes it clear that maxMergedSegmentBytes still wins. I think simply aiming for the largest tier to allow targetSearchConcurrency segment count when that exceeds segsPerTier is a good approach. After all it is those biggest segments that you really need for concurrency since searching them takes the most CPU / latency -- the lower tiers (1/10th, 1/100th, ... the size of the top tier) will be super fast by comparison and not the long pole in each query's latency.

mikemccand avatar May 29 '24 13:05 mikemccand

Thanks @mikemccand and @jpountz for your explanations! This is much clearer now.

I've updated the PR, LMKWYT about the approach!

I'm looking into effective ways of testing that the last tier contains the target number of segments for search concurrency. Any suggestions on how I can check that the last tier eventually contains the expected number of segments?
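
One low-tech way to eyeball this (names and the doc threshold are invented) would be to open a reader and count the leaves that are large enough to count as top-tier:

```java
import java.io.IOException;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.LeafReaderContext;
import org.apache.lucene.store.Directory;

public class TopTierCountSketch {
  // Counts segments holding at least minDocs live docs -- a crude proxy for
  // the top-tier segments that drive inter-segment search concurrency.
  static int countLargeSegments(Directory dir, int minDocs) throws IOException {
    try (DirectoryReader reader = DirectoryReader.open(dir)) {
      int count = 0;
      for (LeafReaderContext leaf : reader.leaves()) {
        if (leaf.reader().numDocs() >= minDocs) {
          count++;
        }
      }
      return count;
    }
  }
}
```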

carlosdelest avatar May 29 '24 19:05 carlosdelest

I spent some time digging into the new behavior introduced by this PR, as I'd like to get it over the finish line. :) I am getting disappointing results for the write amplification. This seems to be due to the fact that we compute the segment count as if segments could magically be close to the doc count limit, but in practice this is not the case, and it forces TieredMergePolicy to perform sub-optimal merging.

I tried to play with other approaches, and one that seems to perform closer to what I would expect consists of removing the targetConcurrency biggest segments from the list of eligible segments, similar to how we already remove too-big segments, and then letting the merge policy merge the long tail of smaller segments.
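
A sketch of that idea in plain Java, over an invented stand-in type rather than the real SegmentCommitInfo plumbing:

```java
import java.util.Comparator;
import java.util.List;

public class EligibleSegmentsSketch {
  record Segment(String name, long sizeInBytes) {} // stand-in for SegmentCommitInfo

  // Remove the targetSearchConcurrency biggest segments from the eligible set,
  // the same way "too big" segments are already excluded, and let the policy
  // merge only the long tail of smaller segments.
  static List<Segment> eligibleForMerging(List<Segment> segments, int targetSearchConcurrency) {
    List<Segment> bySizeDesc = segments.stream()
        .sorted(Comparator.comparingLong(Segment::sizeInBytes).reversed())
        .toList();
    int skip = Math.min(targetSearchConcurrency, bySizeDesc.size());
    return bySizeDesc.subList(skip, bySizeDesc.size());
  }
}
```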

jpountz avatar Jun 24 '24 14:06 jpountz

@carlosdelest I took the freedom of pushing some changes to your fork, I hope you don't mind!

jpountz avatar Jun 24 '24 16:06 jpountz

FWIW I have been playing with BaseMergePolicyTestCase's doTestSimulateAppendOnly and doTestSimulateUpdates. With this change, I'm observing a number of segments in the order of targetSearchConcurrency on the highest tier, while increasing the total segment count by a bit less than targetSearchConcurrency on average and without affecting the write amplification significantly.

jpountz avatar Jun 24 '24 16:06 jpountz

Thanks for taking a look into this @jpountz !

Your approach makes sense to me, and it addresses some of the problems I was having with bigger segments not being merged because of their document counts, and smaller segments being discarded.

The test checks make sense as well. I've added setting the target search concurrency in testSimulateUpdates().

Is there any other work / checks we should be doing for this?

carlosdelest avatar Jul 02 '24 14:07 carlosdelest

I would suggest improving LuceneTestCase#newTieredMergePolicy(Random) so that it randomly sets the target concurrency on the merge policy. And removing the call to the setter from the BaseMergePolicyTestCase#testSimulate* test cases?

Done in 2d8031581ad1df603c59f5f02d9c0c9afed8ca28
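
For reference, that randomization presumably looks roughly like the following sketch (the actual commit may differ):

```java
import java.util.Random;
import org.apache.lucene.index.TieredMergePolicy;

public class RandomizedPolicySketch {
  // In the spirit of LuceneTestCase#newTieredMergePolicy: sometimes set a
  // random target concurrency so tests exercise both code paths. Bounds are
  // invented here.
  static TieredMergePolicy newTieredMergePolicy(Random r) {
    TieredMergePolicy tmp = new TieredMergePolicy();
    if (r.nextBoolean()) {
      tmp.setTargetSearchConcurrency(1 + r.nextInt(16)); // 1..16
    }
    return tmp;
  }
}
```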

carlosdelest avatar Jul 16 '24 16:07 carlosdelest