lucene Multireader Support in Searcher Manager

Description

Copied from https://github.com/apache/lucene/issues/13975

I'd like to use MultiReader inside my searcher manager, but currently there is only support for DirectoryReader. Not sure about the context for. why this was the case initially, seems specific to some desired use-case https://issues.apache.org/jira/browse/LUCENE-6087.

Anyway should be a 1-line change here: https://github.com/apache/lucene/blob/9e90fb5bf1807f1d9ccf2cabf74b3235f2431ed4/lucene/core/src/java/org/apache/lucene/search/SearcherManager.java#L125

I found a relevant stack overflow question here: https://stackoverflow.com/questions/49817453/searchermanager-and-multireader-in-lucene

Nov 04 '24 16:11 Shibi-bala

Thanks @Shibi-bala -- I agree it's odd it was scoped to just DirectoryReader -- any IndexReader should work as long as it can openIfChanged on itself.

I think English.java (from Lucene's test-framework) was maybe deleted long ago? Maybe simplify the test to not bother with English words... just Integer.toString(i) or "" + i should be fine?

Also, please revert the wildcard import (import org.apache.lucene.index.*) -- I think our style checker (jtidy/spotless) will be unhappy with that.

Nov 04 '24 19:11 mikemccand

any IndexReader should work as long as it can openIfChanged on itself.

Does MultiReader implement openIfChanged() ? I see a check in SearcherManager#refreshIfNeeded() that asserts for the reader to be a DirectoryReader instance. This is used by the ReferenceManager base class whenever maybeRefresh() is called.

Nov 08 '24 07:11 vigyasharma

To add to @vigyasharma, I have been wondering if we should remove SearcherManager and encourage users to use IndexReaderManager. IndexSearcher is cheap to create and there are som reasons why it may be interesting to create a different IndexSearcher per query already, e.g. to tune parallelism based on current load, or to configure a timeout on the IndexSearcher.

Maybe MultiReader falls in the same bucket. Presumably, the MultiReader is made from multiple DirectoryReaders, so maybe the application code should not create a SearcherManager that works with a MultiReader but instead manage two (or more, one per Directory) ReaderManagers and dynamically create a MultiReader and then an IndexSearcher on top of this MultiReader on every search?

Nov 13 '24 08:11 jpountz

This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the [email protected] list. Thank you for your contribution!

Nov 28 '24 00:11 github-actions[bot]

I have been wondering if we should remove SearcherManager and encourage users to use IndexReaderManager

We could start encouraging people to use the existing ReaderManager. Both SearcherManager and SearcherTaxonomyManager create searchers on the new readers anyway, so it have the same overhead if any. But I suspect a lot of users consume the SearcherManagers and removing it would be a breaking change?

re: MultiReaders, I think Lucene could provide a convenient way to refresh on the multiple sub-readers. Like a reference managedMultiReaderManager that refreshes all sub-readers whenever refreshIfNeeded is invoked.

Apr 12 '25 20:04 vigyasharma

I hacked together a prototype impl. for a reference managed MultiReader. I can add some tests and clean it up if this meets our requirements. – https://github.com/apache/lucene/pull/14486

Apr 14 '25 20:04 vigyasharma

This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the [email protected] list. Thank you for your contribution!

Apr 29 '25 00:04 github-actions[bot]

Bumping up since the PR went stale. Should I finalize the changes in #14486 ?

May 11 '25 00:05 vigyasharma

This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the [email protected] list. Thank you for your contribution!

May 25 '25 00:05 github-actions[bot]

This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the [email protected] list. Thank you for your contribution!

Oct 16 '25 00:10 github-actions[bot]

lucene lucene copied to clipboard

Multireader Support in Searcher Manager

Description

lucene
lucene copied to clipboard