OpenSearch icon indicating copy to clipboard operation
OpenSearch copied to clipboard

[Workload Management][Rule Based Autotagging] Scroll API support in autotagging

Open laminelam opened this issue 1 month ago • 3 comments

Resolves #8362

Description

[Describe what this change achieves]

Check List

  • [x] Functionality includes testing.
  • [x] API changes companion pull request created, if applicable.
  • [ ] Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Summary by CodeRabbit

Release Notes

  • New Features

    • Scroll requests are now properly tagged with workload management rules for improved tracking and monitoring.
  • Improvements

    • Enhanced scroll request handling with better preservation of index metadata information throughout the request lifecycle.

✏️ Tip: You can customize this high-level summary in your review settings.

laminelam avatar Dec 02 '25 13:12 laminelam

Walkthrough

The changes enable scroll requests to be tagged by the workload management system by capturing and propagating original request indices through scroll IDs. Index resolution support is added to TransportSearchScrollAction, allowing it to extract indices from parsed scroll IDs for auto-tagging workflows.

Changes

Cohort / File(s) Summary
Scroll ID Index Persistence
server/src/main/java/org/opensearch/action/search/ParsedScrollId.java, server/src/main/java/org/opensearch/action/search/TransportSearchHelper.java, server/src/main/java/org/opensearch/action/search/AbstractSearchAsyncAction.java
Extends scroll ID to capture and preserve original request indices; adds serialization/deserialization with version gate (V_3_3_2), new field originalIndices, and updated buildScrollId logic to include indices when appropriate.
Scroll Request Caching & Resolution
server/src/main/java/org/opensearch/action/search/SearchScrollRequest.java, server/src/main/java/org/opensearch/action/search/TransportSearchScrollAction.java
Implements lazy caching of parsed scroll ID in SearchScrollRequest; makes TransportSearchScrollAction implement TransportIndicesResolvingAction to resolve indices from parsed scroll ID for workload management integration.
Workload Management Tagging
plugins/workload-management/src/main/java/org/opensearch/plugin/wlm/AutoTaggingActionFilter.java, plugins/workload-management/src/test/java/org/opensearch/plugin/wlm/AutoTaggingActionFilterTests.java
Extends AutoTaggingActionFilter to recognize SearchScrollRequest and extract index patterns from resolved indices; adds test coverage for scroll request tagging flow.
Workload Management Integration Tests
plugins/workload-management/src/internalClusterTest/java/org/opensearch/plugin/wlm/WlmAutoTaggingIT.java
Adds test method testScrollRequestsAreAlsoTagged() validating that both initial and subsequent scroll requests trigger workload tagging (note: method duplicated in patch).
Test Updates
server/src/test/java/org/opensearch/action/search/ParsedScrollIdTests.java, server/src/test/java/org/opensearch/action/search/SearchScrollAsyncActionTests.java
Updates ParsedScrollId constructor calls to pass new String[] originalIndices parameter.

Sequence Diagram

sequenceDiagram
    participant Client
    participant TSA as TransportSearchScrollAction
    participant PSID as ParsedScrollId
    participant ATAF as AutoTaggingActionFilter
    participant WLS as RuleProcessingService

    Client->>TSA: SearchScrollRequest (scrollId)
    activate TSA
    
    TSA->>PSID: parseScrollId()
    activate PSID
    PSID-->>TSA: ParsedScrollId (with originalIndices)
    deactivate PSID
    
    TSA->>TSA: resolveIndices(SearchScrollRequest)
    Note over TSA: Extract originalIndices from ParsedScrollId<br/>Build OptionallyResolvedIndices
    TSA->>TSA: Store in request metadata
    
    TSA->>ATAF: Filter request
    activate ATAF
    
    ATAF->>ATAF: Recognize SearchScrollRequest
    ATAF->>ATAF: Extract ResolvedIndices from metadata
    ATAF->>ATAF: Create AttributeExtractor with INDEX_PATTERN
    
    ATAF->>WLS: evaluateLabel(RuleAttribute...)
    activate WLS
    WLS-->>ATAF: Workload group label
    deactivate WLS
    
    ATAF-->>Client: Tagged response
    deactivate ATAF
    deactivate TSA

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Serialization/deserialization changes: TransportSearchHelper now serializes/deserializes originalIndices with version gating; review backward compatibility and edge cases where originalIndices may be null or empty.
  • Interface implementation: TransportSearchScrollAction now implements TransportIndicesResolvingAction; verify resolveIndices() logic correctly extracts and wraps indices from parsed scroll ID with proper error handling and fallback to unknown().
  • Lazy caching pattern: SearchScrollRequest caches ParsedScrollId; confirm thread safety and nullability handling are appropriate.
  • Duplicate test method: WlmAutoTaggingIT contains an identical duplicate of testScrollRequestsAreAlsoTagged(); determine if intentional or should be removed.

Poem

🐰 Scroll requests now dance with tags in tow,
Indices preserved through the ID's flow,
From search to scroll, the rabbit's delight,
Workload groups tagged with perfect sight! ✨

Pre-merge checks and finishing touches

❌ Failed checks (3 warnings)
Check name Status Explanation Resolution
Description check ⚠️ Warning The PR description is incomplete. It includes only a broken issue link and basic checklist items but lacks the required 'Description' section explaining what the change achieves. Add a detailed description explaining the Scroll API support implementation, changes made to autotagging logic, and how scroll requests are now handled alongside regular search requests.
Linked Issues check ⚠️ Warning The linked issue #8362 concerns HDFS fixture BouncyCastle exclusion, but the PR implements Scroll API support for workload management autotagging—completely unrelated objectives. Link the correct issue (likely related to scroll API autotagging support) that reflects the actual changes in this PR, not an unrelated HDFS fixture backport.
Docstring Coverage ⚠️ Warning Docstring coverage is 3.70% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The PR title accurately and concisely describes the main change: adding Scroll API support to workload management autotagging, which aligns with the changeset modifications.
Out of Scope Changes check ✅ Passed All changes are directly related to implementing Scroll API support in autotagging across workload management, filter logic, and search action components, with no out-of-scope modifications detected.
✨ Finishing touches
  • [ ] 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • [ ] Create PR with unit tests
  • [ ] Post copyable unit tests in a comment

Comment @coderabbitai help to get the list of available commands and usage tips.

coderabbitai[bot] avatar Dec 02 '25 13:12 coderabbitai[bot]

:x: Gradle check result for 2cc9bb70c5aea4a37b3cd52b34cf87370714c483: null

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

github-actions[bot] avatar Dec 02 '25 16:12 github-actions[bot]

@kaushalmahi12 When you have time, would you please take a look into this? Thx

laminelam avatar Dec 05 '25 17:12 laminelam