
Provide the LeafSlice to CollectorManager.newCollector to save memory on small index slices [LUCENE-8542]


I have an index consisting of 44 million documents spread across 60 segments. When I run a query against this index with a huge number of requested results (e.g. 5 million), the query uses more than 5 GB of heap if the IndexSearcher is configured to use an ExecutorService.

(I know this kind of query is fairly unusual and it would be better to use paging and searchAfter, but our architecture does not allow this at the moment.)

The reason for the huge memory requirement is that the search will create a TopScoreDocCollector for each segment, each one with numHits = 5 million. This is fine for the large segments, but many of those segments are fairly small and only contain several thousand documents. This wastes a huge amount of memory for queries with large values of numHits on indices with many segments.
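Put differently, that is 60 collectors × 5,000,000 queue slots = 300,000,000 pre-allocated priority-queue entries for a single query.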

Therefore, I propose changing the CollectorManager interface in the following way:

  • Change the newCollector method to accept a LeafSlice parameter that can be used to determine the total number of documents in the slice.
  • Maybe, in order to remain backwards compatible, this could be introduced as a new method with a default implementation that calls the old one - otherwise, it probably has to wait for Lucene 8?
  • The slice size can then be used to cap numHits for each TopScoreDocCollector (see the sketch after this list).
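
Roughly, the idea is something like the following sketch (class and method names here are made up for illustration, and the TopScoreDocCollector factory signature varies between Lucene versions):

```java
import java.io.IOException;

import org.apache.lucene.index.LeafReaderContext;
import org.apache.lucene.search.Collector;
import org.apache.lucene.search.IndexSearcher.LeafSlice;
import org.apache.lucene.search.TopScoreDocCollector;

// Sketch of a slice-aware newCollector: cap each per-slice priority queue at
// the number of documents the slice can possibly contribute, instead of
// always pre-allocating numHits slots.
class SliceAwareTopDocsManager {

  private final int numHits;

  SliceAwareTopDocsManager(int numHits) {
    this.numHits = numHits;
  }

  // Proposed new method: IndexSearcher would pass in the slice that this
  // collector is about to collect.
  public Collector newCollector(LeafSlice slice) throws IOException {
    int sliceDocCount = 0;
    for (LeafReaderContext leaf : slice.leaves) {
      sliceDocCount += leaf.reader().maxDoc();
    }
    // The only change: cap numHits at the slice size.
    return TopScoreDocCollector.create(Math.min(numHits, sliceDocCount));
  }
}
```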

If this is something that would make sense for you, I can try to provide a patch.


Migrated from LUCENE-8542 by Christoph Kaser, updated Mar 13 2019. Attachments: LUCENE-8542.patch

asfimport · Oct 24 '18 08:10

Christoph Kaser (migrated from JIRA)

I attached a patch - hopefully this shows more clearly what I meant.

Since CollectorManager is marked as experimental, I think it might be possible to port this patch to Lucene 7 as well, without keeping the old method or providing a default implementation for the new one.

asfimport · Nov 29 '18 14:11

Adrien Grand (@jpountz) (migrated from JIRA)

This doesn't look like a great solution to your problem? If I were you, I'd probably rather fork TopScoreDocCollector to dynamically grow the heap.

asfimport · Nov 29 '18 21:11

Lucene/Solr QA (migrated from JIRA)

+1 overall
Vote Subsystem Runtime Comment
Prechecks
+1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
master Compile Tests
+1 compile 1m 38s master passed
Patch Compile Tests
+1 compile 0m 51s the patch passed
+1 javac 0m 51s the patch passed
+1 Release audit (RAT) 0m 30s the patch passed
+1 Check forbidden APIs 0m 20s the patch passed
+1 Validate source patterns 0m 20s the patch passed
Other Tests
+1 unit 30m 12s core in the patch passed.
+1 unit 3m 10s facet in the patch passed.
Total: 40m 30s
Subsystem Report/Notes
JIRA Issue LUCENE-8542
JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12950019/LUCENE-8542.patch
Optional Tests compile javac unit ratsources checkforbiddenapis validatesourcepatterns
uname Linux lucene2-us-west.apache.org 4.4.0-112-generic #135-Ubuntu SMP Fri Jan 19 11:48:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Build tool ant
Personality /home/jenkins/jenkins-slave/workspace/PreCommit-LUCENE-Build/sourcedir/dev-tools/test-patch/lucene-solr-yetus-personality.sh
git revision master / 75b1831
ant version: Apache Ant(TM) version 1.9.6 compiled on July 20 2018
Default Java 1.8.0_191
Test Results https://builds.apache.org/job/PreCommit-LUCENE-Build/127/testReport/
modules C: lucene lucene/core lucene/facet U: lucene
Console output https://builds.apache.org/job/PreCommit-LUCENE-Build/127/console
Powered by Apache Yetus 0.7.0 http://yetus.apache.org

This message was automatically generated.

asfimport · Nov 30 '18 05:11

Christoph Kaser (migrated from JIRA)

I think it would be nice to have the option to grow the heap dynamically. However, the way TopScoreDocCollector and TopDocsCollector are currently built, for a Lucene user that would mean copying the complete source code of those classes and adapting them to use a java.util.PriorityQueue (probably with worse performance than org.apache.lucene.util.PriorityQueue).

This is certainly possible, but it would mean a lot of code duplication (from the perspective of a Lucene user, because the priority queue that is used cannot be changed easily).

I think this patch makes sense anyway: segment sizes vary over a very wide range in a typical index, and there are usually a lot more small segments than large ones. Given that the default implementation of IndexSearcher.slices() returns one slice per segment, that means a lot of wasted memory for all queries whose numHits is greater than the typical size of a small segment. I don't think it has any negative impact on queries with a small value of numHits, because it only adds one Math.min per segment.

It also helps with my problem: for an index with 28 segments and 13,360,068 documents and a search with numHits = 5,000,000, it makes the difference between creating priority queues with a combined size of 140,000,000 vs 13,360,068. As you can see in the following table, there are benefits for searches with more reasonable numHits values as well (all measured against my index):

numHits       Combined size w/o patch   Combined size with patch
10,000,000    280,000,000               13,360,068
5,000,000     140,000,000               13,360,068
1,000,000     28,000,000                6,870,854
100,000       2,800,000                 1,632,997
50,000        1,400,000                 1,015,274
10,000        280,000                   252,528

asfimport · Nov 30 '18 08:11

Christoph Kaser (migrated from JIRA)

Is there anything I can change or add to get this committed? Or do you think it makes no sense for the general use case of Lucene?

asfimport · Mar 12 '19 12:03

Adrien Grand (@jpountz) (migrated from JIRA)

I think this indeed goes against the use case that Lucene is designed to solve, which is to compute the top-k matches of a query for some reasonable value of k. I'd be OK with e.g. having a collector in the sandbox that works better when collecting thousands of hits, but I'd like to avoid making changes to the core only to support this use case.

asfimport · Mar 12 '19 12:03

Christoph Kaser (migrated from JIRA)

That's too bad, given that this is only a minor change to an experimental API (and does not cause extra work in the reasonable use case). But I understand your reasons.

I may try to build such a collector when I find the time (though I suspect this would involve quite a lot of code duplication if no changes to the core can be made). For now we simply limit the number of concurrent queries with huge values of numHits so that they fit into the heap.

asfimport · Mar 12 '19 13:03

Michael McCandless (@mikemccand) (migrated from JIRA)

I think the core API change is quite minor and reasonable – letting CollectorManager.newCollector know which segments (slice) it will collect? E.g. we already pass the LeafReaderContext to Collector.getLeafCollector, so it's informed about the details of which segment it's about to collect.

I agree the motivating use case here is somewhat abusive, and a custom Collector is probably needed anyway, but I think this API change could help non-abusive cases too.

Alternatively we could explore fixing our default top hits collectors to not pre-allocate the full topN for every slice ... that is really unexpected behavior, and users have tripped up on this multiple times in the past, causing us to make some partial fixes for it.

asfimport · Mar 12 '19 14:03

Adrien Grand (@jpountz) (migrated from JIRA)

I don't think this change really helps, as the number of documents in a slice is a pretty bad upper bound on the total number of documents that will be collected. If we want to better support this use case, I'd rather we look into making PriorityQueue growable.

asfimport · Mar 12 '19 14:03

Christoph Kaser (migrated from JIRA)

While it's true that the slice size is a bad upper bound, the change does help: as you can see in the table in my earlier comment, it reduces the heap requirement by 90% in my use case, due to the large number of small slices.

Making PriorityQueue growable would certainly be a better solution; however, it is much harder to do that without affecting performance in the "sane" use case.

asfimport · Mar 12 '19 14:03

Michael McCandless (@mikemccand) (migrated from JIRA)

Maybe we should try swapping in the JDK's PriorityQueue and measure if this really hurts search throughput?

asfimport · Mar 12 '19 15:03

Adrien Grand (@jpountz) (migrated from JIRA)

Right, I get how it can help with small slices, but at the same time I see small slices as something that should be avoided in order to limit context switching, so I don't think we should design for small slices?

I think the JDK's PriorityQueue would be fine most of the time, except in the worst case where every new hit is competitive, which happens when more recently indexed documents are more likely to be competitive (e.g. when sorting by decreasing timestamp, which I see often): a single reordering via updateTop would need to be replaced with one pop and one push. Can we maybe make our own PriorityQueue growable instead? I'm not too concerned about performance there, given that only add() would be affected, not top() and updateTop(), which are the ones that are important for performance.
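
For illustration only (a sketch, not our actual org.apache.lucene.util.PriorityQueue): in a growable array-backed heap, add() is the only operation that ever touches the size of the backing array, e.g.:

```java
import java.util.Arrays;

// Minimal sketch of a growable min-heap of scores (1-based array indexing).
// Only add() can trigger a resize; top() and updateTop() do exactly what a
// fixed-size heap would do, so the hot path is unchanged.
class GrowableScoreHeap {
  private float[] heap = new float[16]; // start small instead of numHits slots
  private int size = 0;

  void add(float value) {
    if (size + 1 >= heap.length) {
      heap = Arrays.copyOf(heap, heap.length * 2); // amortized O(1) growth
    }
    heap[++size] = value;
    upHeap(size);
  }

  float top() {
    return heap[1]; // least competitive score; caller must ensure size > 0
  }

  float updateTop(float newValue) {
    heap[1] = newValue; // replace the least competitive entry in place
    downHeap(1);        // single sift-down, no pop + push
    return heap[1];
  }

  int size() {
    return size;
  }

  private void upHeap(int i) {
    float value = heap[i];
    while (i > 1 && value < heap[i >>> 1]) {
      heap[i] = heap[i >>> 1]; // move parent down
      i >>>= 1;
    }
    heap[i] = value;
  }

  private void downHeap(int i) {
    float value = heap[i];
    while (true) {
      int child = i << 1;
      if (child > size) break;
      if (child + 1 <= size && heap[child + 1] < heap[child]) child++; // smaller child
      if (heap[child] >= value) break;
      heap[i] = heap[child];
      i = child;
    }
    heap[i] = value;
  }
}
```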

asfimport · Mar 12 '19 18:03

Lucene/Solr QA (migrated from JIRA)

+1 overall
Vote Subsystem Runtime Comment
Prechecks
+1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
master Compile Tests
+1 compile 0m 38s master passed
Patch Compile Tests
+1 compile 0m 39s the patch passed
+1 javac 0m 39s the patch passed
+1 Release audit (RAT) 0m 32s the patch passed
+1 Check forbidden APIs 0m 27s the patch passed
+1 Validate source patterns 0m 27s the patch passed
Other Tests
+1 unit 15m 30s core in the patch passed.
+1 unit 1m 2s facet in the patch passed.
Total: 19m 49s
Subsystem Report/Notes
JIRA Issue LUCENE-8542
JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12950019/LUCENE-8542.patch
Optional Tests compile javac unit ratsources checkforbiddenapis validatesourcepatterns
uname Linux lucene1-us-west 4.4.0-137-generic #163~14.04.1-Ubuntu SMP Mon Sep 24 17:14:57 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Build tool ant
Personality /home/jenkins/jenkins-slave/workspace/PreCommit-LUCENE-Build/sourcedir/dev-tools/test-patch/lucene-solr-yetus-personality.sh
git revision master / d8f2a02
ant version: Apache Ant(TM) version 1.9.3 compiled on July 24 2018
Default Java 1.8.0_191
Test Results https://builds.apache.org/job/PreCommit-LUCENE-Build/174/testReport/
modules C: lucene lucene/core lucene/facet U: lucene
Console output https://builds.apache.org/job/PreCommit-LUCENE-Build/174/console
Powered by Apache Yetus 0.7.0 http://yetus.apache.org

This message was automatically generated.

asfimport · Mar 13 '19 06:03

Christoph Kaser (migrated from JIRA)

> Right, I get how it can help with small slices, but at the same time I see small slices as something that should be avoided in order to limit context switching, so I don't think we should design for small slices?

Small slices are the default: the default implementation of IndexSearcher.slices() returns one slice per segment. Since the search runs in an Executor, this may not cause a lot of context switching, depending on the thread pool parameters. But you are right, the default implementation of slices() may not be optimal.

asfimport · Mar 13 '19 07:03

Adrien Grand (@jpountz) (migrated from JIRA)

> the default implementation of slices() may not be optimal.

Let's change it?

asfimport · Mar 13 '19 09:03

Michael McCandless (@mikemccand) (migrated from JIRA)

+1 to improve slices() to aggregate small slices together by default; that's what we are doing in our production service – we combine up to 5 segments, up to 250K docs in aggregate.
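
Roughly along these lines (just a sketch, not the code we actually run; the 5-segment / 250K-doc thresholds are the ones mentioned above):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

import org.apache.lucene.index.LeafReaderContext;

// Sketch: group segments into slices of at most maxSegmentsPerSlice segments
// and roughly maxDocsPerSlice documents each, instead of one slice per segment.
// Returns groups of leaves; turning them into IndexSearcher.LeafSlice instances
// is version-specific and omitted here.
class SliceAggregation {

  static List<List<LeafReaderContext>> groupLeaves(
      List<LeafReaderContext> leaves, int maxDocsPerSlice, int maxSegmentsPerSlice) {
    List<LeafReaderContext> sorted = new ArrayList<>(leaves);
    // Largest segments first, so big segments tend to get slices of their own.
    sorted.sort(Comparator.comparingInt((LeafReaderContext c) -> c.reader().maxDoc()).reversed());

    List<List<LeafReaderContext>> groups = new ArrayList<>();
    List<LeafReaderContext> current = new ArrayList<>();
    int docsInCurrent = 0;
    for (LeafReaderContext leaf : sorted) {
      int docs = leaf.reader().maxDoc();
      if (!current.isEmpty()
          && (docsInCurrent + docs > maxDocsPerSlice || current.size() >= maxSegmentsPerSlice)) {
        groups.add(current); // close the current slice and start a new one
        current = new ArrayList<>();
        docsInCurrent = 0;
      }
      current.add(leaf);
      docsInCurrent += docs;
    }
    if (!current.isEmpty()) {
      groups.add(current);
    }
    return groups;
  }
}
```

Called with maxDocsPerSlice = 250_000 and maxSegmentsPerSlice = 5, this keeps the tiny segments together while still spreading work across threads.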

asfimport · Mar 13 '19 13:03

I have once again run across this issue, with numHits == 4,000,000, even though the slices are now aggregated as Michael suggested. I would still advocate passing the slice size to avoid allocating unnecessarily large buffers per slice when numHits is large. Any chance at all to get this merged?

ChristophKaser · Oct 21 '24 09:10