[DRAFT] POC for customized and rule-based labeling for search queries

Open ansjcy opened this issue 1 year ago • 1 comments

Description

A simple POC for customized and rule-based labeling for search queries

  • Added the ability to send customized labels in a search query.
  • Added a bare-minimum rule-based labeling service that attaches default user-related information from the security plugin.
  • Used the top N queries service in the query insights plugin to read those labels, as a proof of concept.
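
To picture how the two labeling paths fit together (labels supplied by the client and labels derived by rules), here is a minimal Python sketch. It is not the PR's Java implementation: `default_user_rule`, `apply_label_rules`, and the dictionary-shaped `thread_context` are hypothetical names chosen for illustration, and the label keys simply mirror those that appear in the demo output further down. The sketch also assumes that a client-supplied label wins when a key conflicts with a rule-derived one, which the PR description does not specify.

```python
from typing import Any, Callable, Dict, Iterable

Labels = Dict[str, Any]
Rule = Callable[[Dict[str, Any]], Labels]

# Hypothetical list of user-related keys a default rule might copy.
USER_LABEL_KEYS = (
    "user_name",
    "user_roles",
    "user_backend_roles",
    "user_tenant",
    "remote_address",
)


def default_user_rule(thread_context: Dict[str, Any]) -> Labels:
    """Hypothetical default rule: copy user info from the context into labels."""
    return {k: thread_context[k] for k in USER_LABEL_KEYS if k in thread_context}


def apply_label_rules(
    client_labels: Labels,
    thread_context: Dict[str, Any],
    rules: Iterable[Rule],
) -> Labels:
    """Merge rule-derived labels with the labels sent by the client."""
    merged: Labels = {}
    for rule in rules:
        merged.update(rule(thread_context))
    # Assumption: client-supplied labels take precedence on key conflicts.
    merged.update(client_labels)
    return merged


if __name__ == "__main__":
    context = {"user_name": "chenyang", "user_roles": ["r1", "r2"]}
    client = {"my_tenant_tag": "customized-tenant-id"}
    print(apply_label_rules(client, context, [default_user_rule]))
```

Keeping each rule as a plain function of the request context is one way to leave room for the customized rules mentioned as future work; other designs are equally plausible.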

A simple demo test of the changes:

Steps

  • Preparation: spin up a test cluster, enable the query insights plugin, and index one document.
curl -X PUT 'localhost:9200/_cluster/settings' -H 'Content-Type: application/json' -d'
{
    "persistent" : {
        "search.insights.top_queries.latency.enabled" : "true",
        "search.insights.top_queries.latency.window_size" : "1m",
        "search.insights.top_queries.latency.top_n_size" : 20
    }
}'

curl -X POST "localhost:9200/my-index-0/_doc/?pretty" -H 'Content-Type: application/json' -d'
{
  "@timestamp": "2099-11-15T13:12:00",
  "message": "this is document 1",
  "user": {
    "id": "cyji"
  }
}'
  • Run a search query against the index with a customized label
curl -X GET "localhost:9200/my-index-*/_search?size=1000&pretty" -H 'Content-Type: application/json' -d '{
  "query" : {
    "term": {
      "user.id": "cyji"
    }
  },
  "labels": {
    "my_tenant_tag": "customized-tenant-id"
  }
}'
  • Then query the top N queries API and verify that the customized label my_tenant_tag and the default labels injected by the default rules appear in the results
curl -X GET "localhost:9200/_insights/top_queries?pretty"
{
  "top_queries" : [
    {
      "timestamp" : 1714010081376,
      "user_name" : "chenyang",  // <--------- default label 
      "phase_latency_map" : {
        "expand" : 0,
        "query" : 26,
        "fetch" : 1
      },
      "search_type" : "query_then_fetch",
      "source" : "{\"size\":1000,\"query\":{\"term\":{\"user.id\":{\"value\":\"cyji\",\"boost\":1.0}}},\"labels\":{\"my_tenant_tag\":\"customized-tenant-id\",\"user_tenant\":\"t1\",\"user_backend_roles\":[\"br1\",\"b2\"],\"user_name\":\"chenyang\",\"remote_address\":\"1.2.3.4\",\"user_roles\":[\"r1\",\"r2\"]}}",
      "customized_tag" : "customized-tenant-id",  // <--------- User customized label
      "total_shards" : 1,
      "indices" : [
        "my-index-*"
      ],
      "node_id" : "ZeIqCaZtQ3GPDdRwLuXorg",
      "latency" : 38
    }
  ]
}
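
The demo steps above can also be scripted. The following Python sketch builds the labelled search body and pulls the labels back out of a top-queries record. `build_labeled_search` and `extract_labels` are hypothetical helper names, and `sample_record` is a trimmed copy of the response above without the `//` annotations, which are not valid JSON. It relies only on the `labels` object being embedded in the escaped `source` string, as the output shows for this POC; a later version of the feature may expose labels differently.

```python
import json
from typing import Any, Dict


def build_labeled_search(user_id: str, labels: Dict[str, Any]) -> Dict[str, Any]:
    """Build a term query on user.id with a top-level `labels` object."""
    return {
        "query": {"term": {"user.id": user_id}},
        "labels": dict(labels),
    }


def extract_labels(record: Dict[str, Any]) -> Dict[str, Any]:
    """Decode the escaped `source` string of one record and return its labels."""
    source = json.loads(record["source"])
    return source.get("labels", {})


# Trimmed copy of the record shown in the demo output.
sample_record = {
    "timestamp": 1714010081376,
    "search_type": "query_then_fetch",
    "source": json.dumps(
        {
            "size": 1000,
            "query": {"term": {"user.id": {"value": "cyji", "boost": 1.0}}},
            "labels": {
                "my_tenant_tag": "customized-tenant-id",
                "user_tenant": "t1",
                "user_backend_roles": ["br1", "b2"],
                "user_name": "chenyang",
                "remote_address": "1.2.3.4",
                "user_roles": ["r1", "r2"],
            },
        }
    ),
    "latency": 38,
}

if __name__ == "__main__":
    body = build_labeled_search("cyji", {"my_tenant_tag": "customized-tenant-id"})
    print(json.dumps(body, indent=2))
    print(extract_labels(sample_record))
```

The body returned by `build_labeled_search` matches the payload of the search request in the second step, so it can be sent with any HTTP client in place of the curl command.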

Detailed Results

[Image: res]

Related Issues

https://github.com/opensearch-project/OpenSearch/issues/13341

Check List

  • [ ] New functionality includes testing.
    • [ ] All tests pass
  • [ ] New functionality has been documented.
    • [ ] New functionality has javadoc added
  • [ ] Failing checks are inspected and point to the corresponding known issue(s) (See: Troubleshooting Failing Builds)
  • [ ] Commits are signed per the DCO using --signoff
  • [ ] Commit changes are listed out in CHANGELOG.md file (See: Changelog)
  • [ ] Public documentation issue/PR created

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. For more information on following Developer Certificate of Origin and signing off your commits, please check here.

ansjcy avatar Apr 25 '24 02:04 ansjcy

:x: Gradle check result for 62b17bfc6c520656fda082fccce2835de0b5de77: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

github-actions[bot] avatar Apr 25 '24 02:04 github-actions[bot]

:x: Gradle check result for c24fa4d41c061ee1be406dfb45f36764067790d0: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

github-actions[bot] avatar May 21 '24 00:05 github-actions[bot]

:x: Gradle check result for 521ed875ad8c1a341832f48ed7b663c36c1c61c3: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

github-actions[bot] avatar May 21 '24 00:05 github-actions[bot]

:x: Gradle check result for efde97662c920638fc79f0de3226d657fd47f577: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

github-actions[bot] avatar May 21 '24 00:05 github-actions[bot]

@ansjcy - Thank you for describing the detailed solution as part of this PR. As per my understanding, the customer/client should not even be supplying the label. Instead, based on some rules, the labels are automatically appended on top of the original request.

jainankitk avatar May 29 '24 00:05 jainankitk

Hey @jainankitk !

As per my understanding, the customer/client should not even be supplying the label

Actually, this PR implements the 2 solutions mentioned in @msfroh's RFC: https://github.com/opensearch-project/OpenSearch/issues/13341

We want to provide default rules, and potentially customized rules in the future, that add labels from the thread context, the search query itself, and so on (rule-based labeling). We also want to add the ability to attach customized labels (let the client do it).

ansjcy avatar May 29 '24 23:05 ansjcy

:x: Gradle check result for ff0eb398ec7b68b8d150c5283a92f80b8cef33c4: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

github-actions[bot] avatar Jun 04 '24 21:06 github-actions[bot]

:x: Gradle check result for 1646f4294b7667c8ce15e9b60d3722ac27e99f85: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

github-actions[bot] avatar Jun 05 '24 00:06 github-actions[bot]

:white_check_mark: Gradle check result for a59885b970a31a006aec9bd1e6dbc2ec54791dba: SUCCESS

github-actions[bot] avatar Jun 05 '24 22:06 github-actions[bot]

Codecov Report

Attention: Patch coverage is 54.54545% with 5 lines in your changes missing coverage. Please review.

Project coverage is 71.79%. Comparing base (b15cb0c) to head (9e5b621). Report is 365 commits behind head on main.

Files                                                    Patch %   Lines
...action/search/SearchRequestOperationsListener.java    0.00%     3 Missing :warning:
.../insights/core/listener/QueryInsightsListener.java    80.00%    0 Missing and 1 partial :warning:
...opensearch/action/search/SearchRequestContext.java    0.00%     1 Missing :warning:
Additional details and impacted files
@@             Coverage Diff              @@
##               main   #13374      +/-   ##
============================================
+ Coverage     71.42%   71.79%   +0.37%     
- Complexity    59978    61650    +1672     
============================================
  Files          4985     5081      +96     
  Lines        282275   289069    +6794     
  Branches      40946    41836     +890     
============================================
+ Hits         201603   207525    +5922     
- Misses        63999    64452     +453     
- Partials      16673    17092     +419     


codecov[bot] avatar Jun 05 '24 22:06 codecov[bot]

:x: Gradle check result for 1753795b237472f55718130bf9eab6f9c5ced5d8: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

github-actions[bot] avatar Jun 05 '24 22:06 github-actions[bot]

:x: Gradle check result for 3321187cf204542464e77c87648632159ef41049: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

github-actions[bot] avatar Jun 06 '24 02:06 github-actions[bot]

:x: Gradle check result for 8f22ac3242b9af1d760a3f60629956ef0db0a43b: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

github-actions[bot] avatar Jun 06 '24 05:06 github-actions[bot]

:x: Gradle check result for c135c5cac2c1652d4a19874c31180ff2ecfc8696: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

github-actions[bot] avatar Jun 06 '24 09:06 github-actions[bot]

:x: Gradle check result for 3a580b8a55b63a4fabd9c62fb6c268caeeb557d4: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

github-actions[bot] avatar Jun 06 '24 19:06 github-actions[bot]

:x: Gradle check result for 7907e7cf391c493eb9d5f9a44e4078b5d600645c: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

github-actions[bot] avatar Jun 06 '24 20:06 github-actions[bot]

:x: Gradle check result for 46ff1e165692077ae1776639fd2fa35b1dbd877d: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

github-actions[bot] avatar Jun 06 '24 20:06 github-actions[bot]

:x: Gradle check result for 705729697a65834cf4ed14d503e867e3db49c178: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

github-actions[bot] avatar Jun 06 '24 21:06 github-actions[bot]

:x: Gradle check result for 29cc614e72fba5dd5a81f8eb2966f91930d07f90: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

github-actions[bot] avatar Jun 06 '24 22:06 github-actions[bot]

:grey_exclamation: Gradle check result for 3d982e8dceda3f07a074d8b442caef51d2c67e3d: UNSTABLE

  • TEST FAILURES:
      1 org.opensearch.cluster.MinimumClusterManagerNodesIT.testThreeNodesNoClusterManagerBlock

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

github-actions[bot] avatar Jun 06 '24 23:06 github-actions[bot]

:grey_exclamation: Gradle check result for 9e5b6214fbc480749c6fb85f5483169e6cc2e8ec: UNSTABLE

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

github-actions[bot] avatar Jun 07 '24 00:06 github-actions[bot]

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/OpenSearch/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/OpenSearch/backport-2.x
# Create a new branch
git switch --create backport/backport-13374-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 9d3cf43e7f06d2011c931cf829439e9fba97f18d
# Push it to GitHub
git push --set-upstream origin backport/backport-13374-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/OpenSearch/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-13374-to-2.x.