Adds search implementation of context aware segments
The implementation prunes the search space based on context awareness when a context-aware-grouping mapper is present. It makes a best-effort attempt to extract grouping criteria from the query; if no grouping criteria are found, it searches all segments.
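To illustrate the pruning behaviour described above, here is a minimal sketch rather than the code in this PR. The class `SegmentPruningSketch`, the method `segmentsToSearch`, and the segment-name-to-criteria map are hypothetical stand-ins. The sketch also assumes that a segment with no recorded criteria can never be pruned.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class SegmentPruningSketch {
    /**
     * segmentCriteria: hypothetical map of segment name to the grouping criteria
     * recorded for that segment (null if none was recorded).
     * queryCriteria: criteria extracted from the query; empty means none were found.
     */
    static List<String> segmentsToSearch(Map<String, String> segmentCriteria, Set<String> queryCriteria) {
        // Fallback: no criteria extracted, so every segment is searched.
        if (queryCriteria == null || queryCriteria.isEmpty()) {
            return new ArrayList<>(segmentCriteria.keySet());
        }
        List<String> selected = new ArrayList<>();
        for (Map.Entry<String, String> e : segmentCriteria.entrySet()) {
            String criteria = e.getValue();
            // Assumption: a segment without criteria is always kept.
            if (criteria == null || queryCriteria.contains(criteria)) {
                selected.add(e.getKey());
            }
        }
        return selected;
    }
}
```

The null check runs before `contains` because immutable sets created with `Set.of` reject null lookups.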
Dependent on Indexing PR (CIs will fail)
- https://github.com/opensearch-project/OpenSearch/pulls/RS146BIJAY
Related Issues
Resolves #[19093]
Check List
- [ ] Functionality includes testing.
- [ ] API changes companion pull request created, if applicable.
- [ ] Public documentation issue/PR created, if applicable.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. For more information on following Developer Certificate of Origin and signing off your commits, please check here.
Summary by CodeRabbit
Release Notes
- New Features
  - Introduced context-aware search filtering capabilities that enable segment-level query optimizations based on context criteria.
  - Added automatic criteria extraction from queries to support context-aware filtering without manual configuration.
  - Enhanced script validation to support stored scripts in context-aware grouping configurations.
- Tests
  - Expanded test coverage for criteria-based filtering and context-aware query extraction functionality.
:x: Gradle check result for 0ece83d16d4c826e962e8120847aa63226f92246: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
:x: Gradle check result for e5db2d910a2881b358ff4ddbe91646c5a102c9b0: FAILURE
:x: Gradle check result for 9b887d8414f51ce4181aa15fac34eddab3441d10: FAILURE
:x: Gradle check result for b6eb84e59cac8db2af1efa619762a90c1d587998: FAILURE
:x: Gradle check result for 74ac3dcd919260317c6377fbe5f31d75aa1e217e: FAILURE
:x: Gradle check result for ab40f3115a1476300d7e1c82a500263817b9b864: FAILURE
:x: Gradle check result for 71b42b41156122c184138510313416747f96d08b: FAILURE
:x: Gradle check result for 2b4cb4c3b9d842c1bc76217ebf85e74a52bf31f6: FAILURE
Walkthrough
This pull request introduces context-aware criteria-based segment filtering infrastructure for OpenSearch. It adds APIs for extracting grouping criteria from queries, filtering directory readers by criteria, propagating criteria through the search stack (Engine, IndexShard, SearchService), and new query builder interfaces to support segment-level optimization.
Changes
| Cohort / File(s) | Summary |
|---|---|
| **Core Directory Reader Context Support**<br/>`server/src/main/java/org/opensearch/common/lucene/index/OpenSearchDirectoryReader.java` | Adds `contextAwareReadersLeafReaderMap` and methods to build/retrieve criteria-based filtered readers; introduces `ChildDirectoryReader` inner class with cache key isolation; extends `DelegatingCacheHelper` and `DelegatingCacheKey` constructors for per-criteria cache management |
| **Searcher Acquisition API Extensions**<br/>`server/src/main/java/org/opensearch/index/engine/Engine.java`, `server/src/main/java/org/opensearch/index/shard/IndexShard.java` | Extends `acquireSearcherSupplier` and `acquireSearcher` methods with optional context-aware grouping criteria parameters; propagates criteria through searcher acquisition calls to enable segment-level filtering |
| **Context-Aware Criteria Extraction**<br/>`server/src/main/java/org/opensearch/search/contextaware/ContextAwareCriteriaQueryExtraction.java`, `server/src/main/java/org/opensearch/search/contextaware/package-info.java` | New class to analyze query builders and extract context-aware criteria via recursive query traversal; handles TermQuery, TermsQuery, `WithFilterQueryBuilder`, and BoolQuery with precedence logic; supports script-based transformations |
| **Query Builder and Field Mapper Enhancements**<br/>`server/src/main/java/org/opensearch/index/query/WithFilterQueryBuilder.java`, `server/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.java` | Adds `WithFilterQueryBuilder` interface for filter component access; relaxes `ContextAwareGroupingFieldMapper` script validation to allow STORED scripts regardless of language |
| **Search Service Integration**<br/>`server/src/main/java/org/opensearch/search/SearchService.java` | Imports and integrates `ContextAwareCriteriaQueryExtraction`; adds `getContextAwareGroupingCriteria` helper to conditionally extract criteria when enabled; passes criteria to searcher acquisition throughout search execution |
| **Test Infrastructure**<br/>`test/framework/src/main/java/org/opensearch/index/MapperTestUtils.java`, `test/framework/src/main/java/org/opensearch/script/MockScriptEngine.java` | Extends `MapperTestUtils` with `ScriptService` parameter propagation; updates `MockScriptEngine` to handle `ContextAwareGroupingScript` return value conversion via `String.valueOf()` |
| **Comprehensive Test Coverage**<br/>`server/src/test/java/org/opensearch/common/lucene/index/OpenSearchDirectoryReaderTests.java`, `server/src/test/java/org/opensearch/search/contextaware/ContextAwareCriteriaQueryExtractionTests.java` | Adds `testCriteriaBasedReaders` covering segment filtering and caching; introduces `ContextAwareCriteriaQueryExtractionTests` with extensive coverage of query analysis, script execution, and boolean query precedence |
Sequence Diagram
sequenceDiagram
participant Client
participant SearchService
participant ContextAwareExtraction
participant Engine
participant IndexShard
participant DirectoryReader
Client->>SearchService: execute search with query
activate SearchService
SearchService->>ContextAwareExtraction: extractCriteria(query)
activate ContextAwareExtraction
ContextAwareExtraction->>ContextAwareExtraction: traverse query tree<br/>(TermQuery, TermsQuery,<br/>BoolQuery, WithFilterQueryBuilder)
ContextAwareExtraction-->>SearchService: return Set<String> criteria
deactivate ContextAwareExtraction
SearchService->>IndexShard: acquireSearcherSupplier(..., criteria)
activate IndexShard
IndexShard->>Engine: acquireSearcherSupplier(..., criteria)
activate Engine
alt criteria present and non-empty
Engine->>DirectoryReader: getCriteriaBasedReader(criteria)
activate DirectoryReader
DirectoryReader->>DirectoryReader: warmUpCriteriaBasedReader()<br/>scan segments for bucket metadata
DirectoryReader->>DirectoryReader: createChildDirectoryReader()<br/>filter segments matching criteria
DirectoryReader-->>Engine: filtered DirectoryReader
deactivate DirectoryReader
else no criteria
Engine->>DirectoryReader: standard acquire()
DirectoryReader-->>Engine: standard DirectoryReader
end
Engine-->>IndexShard: SearcherSupplier
deactivate Engine
IndexShard-->>SearchService: SearcherSupplier
deactivate IndexShard
SearchService->>SearchService: execute search with filtered reader
deactivate SearchService
SearchService-->>Client: search results
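The reader-side branch of the diagram (warm up once, then build a filtered view per criteria set) can be sketched roughly as follows. This is an illustration under stated assumptions, not the `OpenSearchDirectoryReader` implementation. `CriteriaReaderSketch`, `leavesFor`, and the leaf-name-to-criteria map are hypothetical, and leaves are modelled as plain strings rather than Lucene leaf readers.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class CriteriaReaderSketch {
    // Hypothetical input: leaf (segment) name mapped to its grouping criteria.
    private final Map<String, String> leafCriteria;
    // Built lazily on first use: criteria mapped to the leaves that carry it.
    private Map<String, List<String>> leavesByCriteria;
    // Counts how often the segment scan ran, for demonstration only.
    int warmUps = 0;

    CriteriaReaderSketch(Map<String, String> leafCriteria) {
        this.leafCriteria = leafCriteria;
    }

    // Scans all leaves once and groups them by criteria.
    private void warmUp() {
        if (leavesByCriteria != null) {
            return;
        }
        warmUps++;
        leavesByCriteria = new HashMap<>();
        for (Map.Entry<String, String> e : leafCriteria.entrySet()) {
            leavesByCriteria.computeIfAbsent(e.getValue(), k -> new ArrayList<>()).add(e.getKey());
        }
    }

    // Returns the leaves matching any of the given criteria, in sorted criteria order.
    List<String> leavesFor(Set<String> criteria) {
        warmUp();
        List<String> out = new ArrayList<>();
        for (String c : new TreeSet<>(criteria)) {
            out.addAll(leavesByCriteria.getOrDefault(c, List.of()));
        }
        return out;
    }
}
```

The sketch is not thread-safe. A real reader shared across search threads would need synchronization or a concurrent map around the lazy warm-up.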
Estimated code review effort
🎯 4 (Complex) | ⏱️ ~60 minutes
- ContextAwareCriteriaQueryExtraction: Recursive query analysis with multiple edge cases, precedence rules, and script execution logic across query types (TermQuery, TermsQuery, BoolQuery, WithFilterQueryBuilder, custom queries)
- OpenSearchDirectoryReader cache key management: New per-criteria cache isolation via DelegatingCacheHelper and DelegatingCacheKey constructors; ChildDirectoryReader lifecycle and cache helper override semantics
- Cross-layer API propagation: Verify criteria parameter flows correctly through Engine → IndexShard → SearchService and that null-handling is consistent
- Boolean query clause precedence: Ensure filter > must > should precedence is correctly implemented and tested in recursive boolean scenarios
- Script execution and field resolution: ContextAwareGroupingScript invocation, NumberFieldMapper handling, and script-to-string conversion in MockScriptEngine
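For reviewers checking the filter > must > should precedence, the following sketch shows one way such a rule can be expressed. It does not reproduce `ContextAwareCriteriaQueryExtraction`. The `Node`, `Terms`, and `Bool` types are hypothetical stand-ins for query builders. Two further assumptions are made here: should clauses are consulted only when there are no filter or must clauses, and they yield criteria only if every should clause does.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class CriteriaPrecedenceSketch {
    /** Hypothetical stand-in for a query node; not an OpenSearch type. */
    interface Node {}

    /** Models a term or terms query: one field, one or more values. */
    static final class Terms implements Node {
        final String field;
        final List<String> values;

        Terms(String field, String... values) {
            this.field = field;
            this.values = List.of(values);
        }
    }

    /** Models a bool query with filter, must, and should clause lists. */
    static final class Bool implements Node {
        final List<Node> filter = new ArrayList<>();
        final List<Node> must = new ArrayList<>();
        final List<Node> should = new ArrayList<>();
    }

    /** Returns criteria for the grouping field; an empty set means "search all segments". */
    static Set<String> extract(Node node, String groupingField) {
        if (node instanceof Terms) {
            Terms t = (Terms) node;
            return t.field.equals(groupingField) ? new LinkedHashSet<>(t.values) : Set.of();
        }
        if (node instanceof Bool) {
            Bool b = (Bool) node;
            Set<String> fromFilter = fromRequired(b.filter, groupingField);
            if (!fromFilter.isEmpty()) return fromFilter;
            Set<String> fromMust = fromRequired(b.must, groupingField);
            if (!fromMust.isEmpty()) return fromMust;
            // Assumption: should clauses are optional when filter/must clauses exist.
            if (b.filter.isEmpty() && b.must.isEmpty()) {
                return fromOptional(b.should, groupingField);
            }
        }
        return Set.of();
    }

    // Required clauses are ANDed, so the first clause yielding criteria is sufficient.
    private static Set<String> fromRequired(List<Node> clauses, String field) {
        for (Node c : clauses) {
            Set<String> s = extract(c, field);
            if (!s.isEmpty()) return s;
        }
        return Set.of();
    }

    // should clauses are ORed: take the union, but only if every clause yields criteria.
    private static Set<String> fromOptional(List<Node> clauses, String field) {
        Set<String> out = new LinkedHashSet<>();
        for (Node c : clauses) {
            Set<String> s = extract(c, field);
            if (s.isEmpty()) return Set.of();
            out.addAll(s);
        }
        return out;
    }
}
```

Returning an empty set whenever the rule is unsure keeps the sketch conservative: the caller falls back to searching all segments instead of risking missed hits.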
Suggested labels
Indexing
Suggested reviewers
- msfroh
- cwperks
- reta
Poem
🐰 A tale of segments sorted right,
By context criteria shining bright,
Filters extract what queries yearn,
And readers learn where buckets turn,
Scripts transform each field's delight! ✨
Pre-merge checks and finishing touches
❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 25.26% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title 'Adds search implementation of context aware segments' clearly describes the main objective of the PR: implementing search-time functionality for context-aware segment pruning. |
| Description check | ✅ Passed | The PR description explains the implementation's purpose and notes a dependency on an indexing PR. However, the description lacks completion of required checklist items and provides minimal technical detail about changes. |