lucene
                                
                                 lucene copied to clipboard
                                
                                    lucene copied to clipboard
                            
                            
                            
                        LUCENE-10641: IndexSearcher#setTimeout should also abort query rewrites, point ranges and vector searches
Description
IndexSearcher only checks the query timeout in the collection phase for now. Need to add timeout support in case of other operations that may take time such as query rewrite, point ranges and vector searches. In this PR, we are covering the case of query re-write
Solution
Added timeout support in case of query rewrite operation in IndexSearcher.
Tests
Added UT for testing timeout during query re-write operation
Checklist
Please review the following and check all that apply:
- [x] I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
- [x] I have given Lucene maintainers access to contribute to my PR branch. (optional but recommended)
- [x] I have developed this patch against the mainbranch.
- [x] I have run ./gradlew check.
- [x] I have added tests for my changes.
@Deepika0510 You don't only need to wrap the IndexReader, you also need to wrap all its leaves.
@jpountz To wrap all the leaves, we would need to wrap ReaderContext classes along with LeafReader classes as well right? Since, we generally access the leaves through the ReaderContext class. Something like this:
reader().getContext().leaves().reader()
I don't think you need to wrap ReaderContext classes -- you can create your new TimeoutLeafReader class, subclassing FilterLeafReader, and overriding the methods (likely with additional wrapping on their returned objects?) that are needing added timeouts.
Then a call lke reader().getContext().leaves().reader() will return your TimeoutLeafReader but it will implement all necessary functionality of a LeafReader.
However, to get that TimeoutLeafReader in use, we would need to go through the ReaderContext class route(?). In a way we would need some mechanism in ReaderClass to know if timeout is applied and then return TimeoutLeafReader object.
Hmm I'm confused: why would you need to get to the TimeoutLeafReader?  Don't you create this timeout reader, passing the timeout to it (which will apply to all queries) and then you don't need to get to it anymore?  Just catch the timeout exception when running searching?
What I meant to ask is that after creating the TimeoutLeafReader class, how would we make sure that this wrapped class's object is used instead of any normal LeafReader instance?
E.g. when we created wrapped IndexReader then we modified the getIndexReader() method such that if timeout is applied then we make sure that we return the wrapped IndexReader object.
Similarly, we have to ensure the same for the TimeoutLeafReader as well right? That is where my doubt lies that since LeafReader is accessed through indexReader.getContext().leaves().reader() (like here) then shouldn’t we need to intercept in between this to return the object of TimeoutLeafReader?
Came across SoftDeletesDirectoryReaderWrapper where we have wrap method for wrapping underlying LeafReader.
However, I believe we will still have the problem when the leaves are directly accessed through ctx object like here. Is there any other way around? @mikemccand @jpountz
Hi @Deepika0510 -- what is the problem when callers access the leaves?  Since you would subclass FilterLeafReader (which subclasses LeafReader) it should be fine to existing code?  Like that line you linked to above will still be able to call .maxDoc().
This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the [email protected] list. Thank you for your contribution!
This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the [email protected] list. Thank you for your contribution!