IP field via MultiRangeQuery fix #16200

Open mkhludnev opened this issue 1 year ago • 3 comments

Description

Combines many concrete IPs and CIDR masks into a single set when querying an IP field.
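
For context, a CIDR mask such as 192.168.1.0/24 denotes one contiguous address range, so a list of masks can be handled as a list of [lower, upper] ranges alongside the concrete IPs. Below is a minimal standalone sketch of that conversion using only the JDK. The class and method names are hypothetical, and this is not the code in IpFieldMapper.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper for illustration only; not part of OpenSearch or Lucene.
public class CidrRange {

    /** Returns {lower, upper} raw address bytes covered by a mask such as "192.168.1.0/24". */
    public static byte[][] toRange(String cidr) {
        int slash = cidr.indexOf('/');
        if (slash < 0) {
            throw new IllegalArgumentException("Expected address/prefixLength but got: " + cidr);
        }
        byte[] lower;
        try {
            // For a literal IP address this parses the text and performs no DNS lookup.
            lower = InetAddress.getByName(cidr.substring(0, slash)).getAddress();
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("Not an IP address: " + cidr, e);
        }
        int totalBits = lower.length * 8; // 32 for IPv4, 128 for IPv6
        int prefixLength = Integer.parseInt(cidr.substring(slash + 1));
        if (prefixLength < 0 || prefixLength > totalBits) {
            throw new IllegalArgumentException("Bad prefix length in: " + cidr);
        }
        byte[] upper = lower.clone();
        // Every bit after the prefix is cleared in the lower bound and set in the upper bound.
        for (int bit = prefixLength; bit < totalBits; bit++) {
            int mask = 1 << (7 - (bit & 7));
            lower[bit >> 3] &= ~mask;
            upper[bit >> 3] |= mask;
        }
        return new byte[][] { lower, upper };
    }

    public static void main(String[] args) throws Exception {
        byte[][] range = toRange("192.168.1.77/24");
        System.out.println(InetAddress.getByAddress(range[0]).getHostAddress()
            + " - " + InetAddress.getByAddress(range[1]).getHostAddress());
        // prints "192.168.1.0 - 192.168.1.255"
    }
}
```

A concrete IP is the degenerate case where the prefix length equals the address width, so lower and upper coincide.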

Related Issues

Resolves #16200

Check List

  • [ ] Functionality includes testing.
  • [ ] API changes companion pull request created, if applicable.
  • [ ] Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. For more information on following Developer Certificate of Origin and signing off your commits, please check here.

mkhludnev avatar Oct 19 '24 10:10 mkhludnev

:x: Gradle check result for c9e2bd1df81107f050617576e3f5c429c9aa285c: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

github-actions[bot] avatar Oct 19 '24 10:10 github-actions[bot]

:x: Gradle check result for c9e2bd1df81107f050617576e3f5c429c9aa285c: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

github-actions[bot] avatar Oct 19 '24 17:10 github-actions[bot]

:x: Gradle check result for b6c3410ae73af7fa6d2f22c7306c3f9b62fdaf59: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

github-actions[bot] avatar Oct 19 '24 19:10 github-actions[bot]

This is an alternative to #16202.

mkhludnev avatar Oct 20 '24 07:10 mkhludnev

:x: Gradle check result for 6a11b54405b4e8b79bef0c6877bbab0c4a95faa2: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

github-actions[bot] avatar Nov 08 '24 15:11 github-actions[bot]

:x: Gradle check result for 26ff736eb9c401c768d385d776c7e40045728c9e: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

github-actions[bot] avatar Nov 11 '24 23:11 github-actions[bot]

:x: Gradle check result for 01db87572537eb23944de26eb6b6d4dde12f3337: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

github-actions[bot] avatar Nov 13 '24 08:11 github-actions[bot]

:x: Gradle check result for 5849b968fca0d21efda6aed578ad905abba5aa94: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

github-actions[bot] avatar Nov 19 '24 21:11 github-actions[bot]

:x: Gradle check result for d03b61803e440a9256172bc51d1467361d556eec: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

github-actions[bot] avatar Nov 20 '24 09:11 github-actions[bot]

:x: Gradle check result for 7e1f8b43a92203b3f64ad939c9021da7a0c7e090: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

github-actions[bot] avatar Nov 20 '24 10:11 github-actions[bot]

:x: Gradle check result for c429cf0f41cc1d814a8763170ebe89f76628b0dd: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

github-actions[bot] avatar Nov 20 '24 15:11 github-actions[bot]

Let's discuss edge cases, which are not obvious:

doc_values only field with mask/ values

In this case we can just create BooleanQuery { ssDvRangeQuery, ...}. It may later fail with too many clauses, which is reasonable. Note: this will be much better with ssDvMultiRange.

index & doc_values field with mask/ values

  • [x] a strawman approach: forget about dv and IndexOrDvQuery, and just create MultiRangePointQuery

  • ~~attempt to create IndexOrDvQuery(MultiRangePointQuery, BooleanQuery { ssDvRangeQuery, ...}) when number of BQ is less than MaxClauses limit.~~

The problem is that this threshold can't be decided at query-parsing time, because sibling filter clauses may push the total past the MaxClauses limit.
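
The two cases above boil down to a small decision table for mask values. It can be sketched as follows; the enum, the method, and the NOT_SEARCHABLE fallback are hypothetical names introduced here for illustration and are not taken from the patch.

```java
// Hypothetical illustration of the strategy described above; not the patch itself.
public class IpMaskQueryPlan {

    public enum Plan {
        MULTI_RANGE_POINT_QUERY,      // all masks go into one multi-range query over the points index
        BOOLEAN_OF_DV_RANGE_QUERIES,  // one doc-values range query per mask, wrapped in a BooleanQuery
        NOT_SEARCHABLE                // assumption: neither indexed nor doc_values
    }

    public static Plan choose(boolean indexed, boolean hasDocValues) {
        if (indexed) {
            // Strawman approach: ignore doc values entirely, even when they exist.
            return Plan.MULTI_RANGE_POINT_QUERY;
        }
        if (hasDocValues) {
            // May still fail later with too many clauses, which is accepted as reasonable.
            return Plan.BOOLEAN_OF_DV_RANGE_QUERIES;
        }
        return Plan.NOT_SEARCHABLE;
    }

    public static void main(String[] args) {
        System.out.println(choose(true, true));   // MULTI_RANGE_POINT_QUERY
        System.out.println(choose(false, true));  // BOOLEAN_OF_DV_RANGE_QUERIES
    }
}
```

The choice depends only on the field mapping, never on the number of masks, which sidesteps the problem that the clause budget is unknown while a single clause is being parsed.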

mkhludnev avatar Nov 20 '24 15:11 mkhludnev

:x: Gradle check result for cfa3904e6379ac2826047534145bfe1f6217d41a: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

github-actions[bot] avatar Nov 20 '24 17:11 github-actions[bot]

:white_check_mark: Gradle check result for a4d65db99207a2d9a6afaaabd2523e2bbcda6a0a: SUCCESS

github-actions[bot] avatar Nov 20 '24 21:11 github-actions[bot]

Codecov Report

Attention: Patch coverage is 86.15385% with 9 lines in your changes missing coverage. Please review.

Project coverage is 72.15%. Comparing base (3b4fa0e) to head (f6e8303). Report is 16 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...ava/org/opensearch/index/mapper/IpFieldMapper.java | 86.15% | 4 Missing and 5 partials :warning: |

Additional details and impacted files
@@             Coverage Diff              @@
##               main   #16391      +/-   ##
============================================
+ Coverage     72.11%   72.15%   +0.03%     
- Complexity    65192    65207      +15     
============================================
  Files          5318     5318              
  Lines        303903   303949      +46     
  Branches      43970    43985      +15     
============================================
+ Hits         219166   219304     +138     
+ Misses        66786    66667     -119     
- Partials      17951    17978      +27     

codecov[bot] avatar Nov 20 '24 21:11 codecov[bot]

:x: Gradle check result for 75b271987636ba87ad75045ed3d7890240523c61: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

github-actions[bot] avatar Nov 21 '24 08:11 github-actions[bot]

Hi there! Kindly asking for review. And tagging as backport. Thanks!

mkhludnev avatar Nov 21 '24 19:11 mkhludnev

In this case we can just create BooleanQuery { ssDvRangeQuery, ...}. It may later fail with too many clauses, which is reasonable. Note: this will be much better with ssDvMultiRange.

Just a question, since with this change we still don't solve the problem completely (by and large, the limit might be hit anyway), is it worth the effort?

reta avatar Nov 22 '24 01:11 reta

is it worth the effort?

I think it is: it unblocks large terms_queries for indexed fields right now. As for the doc_values solution (Lucene#13974), I'm not sure whether I'll be able to complete it or how long it will take.

PS. Currently, a terms_query with 2K IPs works fine even on a doc_values-only field, but adding a single mask flips it to a too-many-clauses failure. This fix makes 2K IPs plus fewer than 1K masks work, even on a doc_values-only field.
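
One way to read those numbers is sketched below. It assumes the default limit of 1024 clauses, that concrete IPs collapse into a single set clause, and that each mask costs one range clause. The "before" model, in which a single mask forces every value into its own clause, is an interpretation of the behaviour described here and is not verified against the code. All names are hypothetical.

```java
// Hypothetical clause-counting model for illustration; the real accounting lives in Lucene.
public class ClauseBudget {

    // Default maximum clause count in Lucene and OpenSearch.
    public static final int MAX_CLAUSES = 1024;

    /** Assumed behaviour before the fix: any mask makes every value a separate clause. */
    public static int clausesBeforeFix(int concreteIps, int masks) {
        if (masks == 0) {
            return concreteIps > 0 ? 1 : 0; // all IPs in one set query
        }
        return concreteIps + masks;
    }

    /** Assumed behaviour with the fix: one set clause for all IPs plus one clause per mask. */
    public static int clausesWithFix(int concreteIps, int masks) {
        return (concreteIps > 0 ? 1 : 0) + masks;
    }

    public static boolean fits(int clauses) {
        return clauses <= MAX_CLAUSES;
    }

    public static void main(String[] args) {
        System.out.println(fits(clausesBeforeFix(2000, 0)));   // true
        System.out.println(fits(clausesBeforeFix(2000, 1)));   // false
        System.out.println(fits(clausesWithFix(2000, 1000)));  // true
    }
}
```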

mkhludnev avatar Nov 22 '24 06:11 mkhludnev

:white_check_mark: Gradle check result for e8e769fc7e7da7615c3cebc59f9f9196392b29d8: SUCCESS

github-actions[bot] avatar Nov 22 '24 07:11 github-actions[bot]

LGTM, thanks @mkhludnev ! @msfroh anything from your side?

reta avatar Nov 22 '24 13:11 reta

:white_check_mark: Gradle check result for 341e2566acbb9b0f43b975cc108af60e5de2b431: SUCCESS

github-actions[bot] avatar Nov 26 '24 14:11 github-actions[bot]

:x: Gradle check result for 7aa15dafdbae60b786380f63348b79a697e070bf: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

github-actions[bot] avatar Nov 26 '24 21:11 github-actions[bot]

:x: Gradle check result for 7aa15dafdbae60b786380f63348b79a697e070bf: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

github-actions[bot] avatar Nov 26 '24 22:11 github-actions[bot]

:white_check_mark: Gradle check result for f6e8303208e1b86fe913b8cf751255e6c2058b94: SUCCESS

github-actions[bot] avatar Nov 27 '24 07:11 github-actions[bot]

@msfroh LGTY? thanks!

reta avatar Nov 27 '24 08:11 reta

Thank you so much, @reta @msfroh!

mkhludnev avatar Nov 28 '24 06:11 mkhludnev

Thank you so much, @reta @msfroh!

Thanks to you @mkhludnev !

reta avatar Nov 28 '24 06:11 reta