lucene
lucene copied to clipboard
Multireader Support in Searcher Manager
Description
Copied from https://github.com/apache/lucene/issues/13975
I'd like to use MultiReader inside my searcher manager, but currently there is only support for DirectoryReader. Not sure about the context for. why this was the case initially, seems specific to some desired use-case https://issues.apache.org/jira/browse/LUCENE-6087.
Anyway should be a 1-line change here: https://github.com/apache/lucene/blob/9e90fb5bf1807f1d9ccf2cabf74b3235f2431ed4/lucene/core/src/java/org/apache/lucene/search/SearcherManager.java#L125
I found a relevant stack overflow question here: https://stackoverflow.com/questions/49817453/searchermanager-and-multireader-in-lucene
Thanks @Shibi-bala -- I agree it's odd it was scoped to just DirectoryReader -- any IndexReader should work as long as it can openIfChanged on itself.
I think English.java (from Lucene's test-framework) was maybe deleted long ago? Maybe simplify the test to not bother with English words... just Integer.toString(i) or "" + i should be fine?
Also, please revert the wildcard import (import org.apache.lucene.index.*) -- I think our style checker (jtidy/spotless) will be unhappy with that.
any
IndexReadershould work as long as it canopenIfChangedon itself.
Does MultiReader implement openIfChanged() ? I see a check in SearcherManager#refreshIfNeeded() that asserts for the reader to be a DirectoryReader instance. This is used by the ReferenceManager base class whenever maybeRefresh() is called.
To add to @vigyasharma, I have been wondering if we should remove SearcherManager and encourage users to use IndexReaderManager. IndexSearcher is cheap to create and there are som reasons why it may be interesting to create a different IndexSearcher per query already, e.g. to tune parallelism based on current load, or to configure a timeout on the IndexSearcher.
Maybe MultiReader falls in the same bucket. Presumably, the MultiReader is made from multiple DirectoryReaders, so maybe the application code should not create a SearcherManager that works with a MultiReader but instead manage two (or more, one per Directory) ReaderManagers and dynamically create a MultiReader and then an IndexSearcher on top of this MultiReader on every search?
This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the [email protected] list. Thank you for your contribution!
I have been wondering if we should remove
SearcherManagerand encourage users to useIndexReaderManager
We could start encouraging people to use the existing ReaderManager. Both SearcherManager and SearcherTaxonomyManager create searchers on the new readers anyway, so it have the same overhead if any. But I suspect a lot of users consume the SearcherManagers and removing it would be a breaking change?
re: MultiReaders, I think Lucene could provide a convenient way to refresh on the multiple sub-readers. Like a reference managedMultiReaderManager that refreshes all sub-readers whenever refreshIfNeeded is invoked.
I hacked together a prototype impl. for a reference managed MultiReader. I can add some tests and clean it up if this meets our requirements. – https://github.com/apache/lucene/pull/14486
This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the [email protected] list. Thank you for your contribution!
Bumping up since the PR went stale. Should I finalize the changes in #14486 ?
This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the [email protected] list. Thank you for your contribution!
This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the [email protected] list. Thank you for your contribution!