lucene icon indicating copy to clipboard operation
lucene copied to clipboard

LUCENE-10641: IndexSearcher#setTimeout should also abort query rewrites, point ranges and vector searches

Open Deepika0510 opened this issue 2 years ago • 10 comments

Description

IndexSearcher only checks the query timeout in the collection phase for now. Need to add timeout support in case of other operations that may take time such as query rewrite, point ranges and vector searches. In this PR, we are covering the case of query re-write

Solution

Added timeout support in case of query rewrite operation in IndexSearcher.

Tests

Added UT for testing timeout during query re-write operation

Checklist

Please review the following and check all that apply:

  • [x] I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
  • [x] I have given Lucene maintainers access to contribute to my PR branch. (optional but recommended)
  • [x] I have developed this patch against the main branch.
  • [x] I have run ./gradlew check.
  • [x] I have added tests for my changes.

Deepika0510 avatar Jun 01 '23 19:06 Deepika0510

@Deepika0510 You don't only need to wrap the IndexReader, you also need to wrap all its leaves.

jpountz avatar Jul 20 '23 16:07 jpountz

@jpountz To wrap all the leaves, we would need to wrap ReaderContext classes along with LeafReader classes as well right? Since, we generally access the leaves through the ReaderContext class. Something like this:

reader().getContext().leaves().reader()

Deepika0510 avatar Sep 17 '23 10:09 Deepika0510

I don't think you need to wrap ReaderContext classes -- you can create your new TimeoutLeafReader class, subclassing FilterLeafReader, and overriding the methods (likely with additional wrapping on their returned objects?) that are needing added timeouts.

Then a call lke reader().getContext().leaves().reader() will return your TimeoutLeafReader but it will implement all necessary functionality of a LeafReader.

mikemccand avatar Oct 19 '23 14:10 mikemccand

However, to get that TimeoutLeafReader in use, we would need to go through the ReaderContext class route(?). In a way we would need some mechanism in ReaderClass to know if timeout is applied and then return TimeoutLeafReader object.

Deepika0510 avatar Oct 29 '23 11:10 Deepika0510

Hmm I'm confused: why would you need to get to the TimeoutLeafReader? Don't you create this timeout reader, passing the timeout to it (which will apply to all queries) and then you don't need to get to it anymore? Just catch the timeout exception when running searching?

mikemccand avatar Oct 29 '23 15:10 mikemccand

What I meant to ask is that after creating the TimeoutLeafReader class, how would we make sure that this wrapped class's object is used instead of any normal LeafReader instance?

E.g. when we created wrapped IndexReader then we modified the getIndexReader() method such that if timeout is applied then we make sure that we return the wrapped IndexReader object.

Similarly, we have to ensure the same for the TimeoutLeafReader as well right? That is where my doubt lies that since LeafReader is accessed through indexReader.getContext().leaves().reader() (like here) then shouldn’t we need to intercept in between this to return the object of TimeoutLeafReader?

Deepika0510 avatar Nov 02 '23 15:11 Deepika0510

Came across SoftDeletesDirectoryReaderWrapper where we have wrap method for wrapping underlying LeafReader. However, I believe we will still have the problem when the leaves are directly accessed through ctx object like here. Is there any other way around? @mikemccand @jpountz

Deepika0510 avatar Nov 02 '23 15:11 Deepika0510

Hi @Deepika0510 -- what is the problem when callers access the leaves? Since you would subclass FilterLeafReader (which subclasses LeafReader) it should be fine to existing code? Like that line you linked to above will still be able to call .maxDoc().

mikemccand avatar Nov 02 '23 17:11 mikemccand

This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the [email protected] list. Thank you for your contribution!

github-actions[bot] avatar Jan 17 '24 00:01 github-actions[bot]

This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the [email protected] list. Thank you for your contribution!

github-actions[bot] avatar Feb 20 '24 00:02 github-actions[bot]