Adding access to noSubMatches and noOverlappingMatches in Hyphenation…

Open hasnain2808 opened this issue 1 year ago • 9 comments

Description

This change adds support for / exposes two new settings (noSubMatches and noOverlappingMatches) that were added to Lucene's HyphenationCompoundWordTokenFilter class.
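As a rough illustration of how the two settings might be specified, the sketch below assembles a custom token filter definition of type `hyphenation_decompounder` as a JSON string. The snake_case keys `no_sub_matches` and `no_overlapping_matches` are an assumption about how the Lucene options are exposed and are not confirmed by this description; the patterns path and word list are hypothetical placeholders.

```java
final class HyphenationSettingsExample {

    // Builds the body of a custom token filter definition as a JSON object string.
    // ASSUMPTION: the setting keys "no_sub_matches" and "no_overlapping_matches"
    // mirror Lucene's noSubMatches / noOverlappingMatches; check the factory class
    // in modules/analysis-common for the actual names.
    static String buildFilterSettings(boolean noSubMatches, boolean noOverlappingMatches) {
        return String.join("\n",
            "{",
            "  \"type\": \"hyphenation_decompounder\",",
            // Hypothetical path and word list, for illustration only.
            "  \"hyphenation_patterns_path\": \"analysis/hyphenation_patterns.xml\",",
            "  \"word_list\": [\"kaffee\", \"tasse\"],",
            "  \"no_sub_matches\": " + noSubMatches + ",",
            "  \"no_overlapping_matches\": " + noOverlappingMatches,
            "}");
    }

    public static void main(String[] args) {
        // Print a definition with sub-matches suppressed and overlapping matches allowed.
        System.out.println(buildFilterSettings(true, false));
    }
}
```

The printed object would sit under `settings.analysis.filter.<filter_name>` in an index creation request, where `<filter_name>` is whatever name the index author picks.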

Related Issues

Resolves https://github.com/opensearch-project/OpenSearch/issues/8796

Copy of https://github.com/opensearch-project/OpenSearch/pull/10765

Check List

  • [x] New functionality includes testing.
    • [x] All tests pass
  • [x] New functionality has been documented.
    • [x] New functionality has javadoc added
  • [x] API changes companion pull request created.
  • [ ] Failing checks are inspected and point to the corresponding known issue(s) (See: Troubleshooting Failing Builds)
  • [x] Commits are signed per the DCO using --signoff
  • [x] Commit changes are listed out in CHANGELOG.md file (See: Changelog)
  • [x] Public documentation issue/PR created

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. For more information on following Developer Certificate of Origin and signing off your commits, please check here.

hasnain2808 avatar May 30 '24 13:05 hasnain2808

:x: Gradle check result for 5abc8ecde55bf35491caf586021b4a8657710650: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

github-actions[bot] avatar May 30 '24 14:05 github-actions[bot]

:x: Gradle check result for fe6fba7c02df5446c4e20dec8b7de1cc5485bf81: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

github-actions[bot] avatar May 30 '24 14:05 github-actions[bot]

:x: Gradle check result for 3d5ffdc54ed9f5984c99ccfa446dd721f208cd5b: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

github-actions[bot] avatar May 30 '24 14:05 github-actions[bot]

:white_check_mark: Gradle check result for 6ebc3ecb6976b3e0de5ca672d60cefb71b16900c: SUCCESS

github-actions[bot] avatar May 30 '24 15:05 github-actions[bot]

Looks like the CI failure is unrelated to the change. cc: @msfroh @jainankitk @getsaurabh02

hasnain2808 avatar May 30 '24 15:05 hasnain2808

When this happens, force push an update to re-trigger it. For flaky tests, examine the failure and link to existing bugs or open a new one. Please see https://github.com/opensearch-project/OpenSearch/blob/main/DEVELOPER_GUIDE.md#flaky-tests.

I re-triggered some of the CI in the meantime.

dblock avatar Jun 03 '24 12:06 dblock

Thanks @dblock! I was off the grid, so I couldn't force push. Looks like it is all green now 😄

hasnain2808 avatar Jun 21 '24 10:06 hasnain2808

cc: @msfroh @jainankitk @getsaurabh02

hasnain2808 avatar Jun 21 '24 10:06 hasnain2808

@hasnain2808 - Thank you for creating this PR. Can you add an example of how to specify this setting (I guess it can only be done using yml)? Also, should we have a test to validate that the values are read correctly?

Hi @jainankitk, I will get to this soon.

hasnain2808 avatar Jul 12 '24 16:07 hasnain2808

Thanks @hasnain2808 for the update

jainankitk avatar Jul 16 '24 21:07 jainankitk

@hasnain2808 - Just checking whether you are still working on this.

jainankitk avatar Aug 01 '24 17:08 jainankitk

Hi @jainankitk,

I am picking this up again.

I got back from my extended leave a few weeks ago but had to catch up on work. That's done now!

Thanks for waiting!

hasnain2808 avatar Aug 04 '24 07:08 hasnain2808

:grey_exclamation: Gradle check result for f986985b585def8737e2bf789aefdd6dde84ea96: UNSTABLE

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

github-actions[bot] avatar Aug 04 '24 08:08 github-actions[bot]

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 71.91%. Comparing base (3db2525) to head (8752b76). Report is 35 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main   #13895      +/-   ##
============================================
+ Coverage     71.84%   71.91%   +0.06%     
- Complexity    63005    63030      +25     
============================================
  Files          5185     5185              
  Lines        295185   295187       +2     
  Branches      42664    42664              
============================================
+ Hits         212086   212283     +197     
+ Misses        65660    65460     -200     
- Partials      17439    17444       +5     


codecov[bot] avatar Aug 04 '24 08:08 codecov[bot]

:x: Gradle check result for ea02134050cd10093dec14de3664cd2d27911be6: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

github-actions[bot] avatar Aug 07 '24 06:08 github-actions[bot]

:x: Gradle check result for 9d9c9a6c6373e8ddfaf5a314db4459048f91518f: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

github-actions[bot] avatar Aug 10 '24 21:08 github-actions[bot]

:x: Gradle check result for 622928e091edfdf95ff05c6859c02019d113c373: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

github-actions[bot] avatar Aug 10 '24 21:08 github-actions[bot]

:x: Gradle check result for 452be84beeb059f97d4f9247c045410fe4ddc472: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

github-actions[bot] avatar Aug 10 '24 21:08 github-actions[bot]

@jainankitk I removed the 3.x changelog entry.

hasnain2808 avatar Aug 10 '24 22:08 hasnain2808

:white_check_mark: Gradle check result for 0ef9e3f98fb29f721b4f21502a8435b6553c7671: SUCCESS

github-actions[bot] avatar Aug 10 '24 22:08 github-actions[bot]

:white_check_mark: Gradle check result for 37912e600e802e753652ed42d2119ced110572cd: SUCCESS

github-actions[bot] avatar Aug 10 '24 22:08 github-actions[bot]

:grey_exclamation: Gradle check result for a1ff3c58dabda803803199b9c18f36c0b5263d1e: UNSTABLE

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

github-actions[bot] avatar Aug 10 '24 23:08 github-actions[bot]

@jainankitk friendly reminder for a review 😄

hasnain2808 avatar Aug 13 '24 04:08 hasnain2808

:x: Gradle check result for 7b2142efe9ad0f5f51324614e677af6ab9fc4caf: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

github-actions[bot] avatar Aug 13 '24 21:08 github-actions[bot]

@hasnain2808 - It seems the spotless check is failing. Can you fix those?

Execution failed for task ':modules:analysis-common:spotlessJavaCheck'.
> The following files had format violations:
      src/test/java/org/opensearch/analysis/common/CompoundAnalysisTests.java
          @@ -35,7 +35,6 @@
           import·org.apache.lucene.analysis.Analyzer;
           import·org.apache.lucene.analysis.TokenStream;
           import·org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
          -import·org.junit.Before;
           import·org.opensearch.Version;
           import·org.opensearch.cluster.metadata.IndexMetadata;
           import·org.opensearch.common.settings.Settings;
          @@ -51,6 +50,7 @@
           import·org.opensearch.test.IndexSettingsModule;
           import·org.opensearch.test.OpenSearchTestCase;
           import·org.hamcrest.MatcherAssert;
          +import·org.junit.Before;
           
           import·java.io.IOException;
           import·java.io.InputStream;
  Run './gradlew :modules:analysis-common:spotlessApply' to fix these violations.

jainankitk avatar Aug 13 '24 21:08 jainankitk

Done. Weird that this error was missed.

hasnain2808 avatar Aug 13 '24 21:08 hasnain2808

:x: Gradle check result for 6a88bb0bf473861acc9ef57a47889f5590a51eb2: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

github-actions[bot] avatar Aug 13 '24 22:08 github-actions[bot]

:x: Gradle check result for 8752b766cfbe9ac1e4588f4daa190a4a91e122cc: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

github-actions[bot] avatar Aug 13 '24 22:08 github-actions[bot]

@msfroh @mch2 - Can one of you help merge this change?

@msfroh @mch2 could you please have a look at this mini PR 🙂

hasnain2808 avatar Aug 19 '24 18:08 hasnain2808