OpenSearch
Tika tests
Description
Enhance the Tika document parsing tests by validating parsed output against the output of the current version.
Related Issues
Resolves "Improve the validation on TikaDocTests #12887"
Check List
- [x] New functionality includes testing.
- [x] All tests pass
- [ ] New functionality has been documented.
- [ ] New functionality has javadoc added
- [ ] Failing checks are inspected and point to the corresponding known issue(s) (See: Troubleshooting Failing Builds)
- [ ] Commits are signed per the DCO using --signoff
- [ ] Commit changes are listed out in CHANGELOG.md file (See: Changelog)
- [ ] Public documentation issue/PR created
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. For more information on following Developer Certificate of Origin and signing off your commits, please check here.
:x: Gradle check result for 7ee60451f19403c5b80b706b86dcdb38c7bfec31: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
:x: Gradle check result for 810f3a90f53f83fa0eadeebfe053fb083116e90a: FAILURE
Gradle check failing due to unrelated flaky test: #11979
:x: Gradle check result for 3bd9469706a733fb51fbec64ee3912c7d5af4884: FAILURE
:x: Gradle check result for 2dc3fcf87fd089a6e893f5ff4e413ca8488707fe: FAILURE
:x: Gradle check result for ddd4b5659f868aa4cf5852af47cb8d1061a217bc: FAILURE
Known flaky test: #10006
Known flaky test: #13476
- Is there any advantage of having those zipped in the repo other than having to unzip them?
- Keeping a checksum map in code feels a little odd. I don't feel strongly about it, but maybe a .checksum file would be a bit cleaner.
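A minimal sketch of the `.checksum`-file idea raised above, not the implementation in this PR. It assumes a hypothetical `fileName=hexDigest` line format, SHA-256 as the digest, and hypothetical class and method names; the real test may use a different format, algorithm, or loading mechanism.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Properties;

// Hypothetical helper: keeps expected digests in a text file instead of a map in code.
final class ChecksumFileSketch {

    // Parses "fileName=hexDigest" lines. In a test, this text would be read from a resource file.
    // Note: keys containing spaces, ':' or '=' would need escaping in the Properties format.
    static Properties loadChecksums(String content) {
        Properties checksums = new Properties();
        try {
            checksums.load(new StringReader(content));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return checksums;
    }

    // Lowercase hex SHA-256 of the parsed text, encoded as UTF-8 (algorithm choice is an assumption).
    static String sha256Hex(String parsedText) {
        MessageDigest digest;
        try {
            digest = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
        byte[] hash = digest.digest(parsedText.getBytes(StandardCharsets.UTF_8));
        StringBuilder hex = new StringBuilder(hash.length * 2);
        for (byte b : hash) {
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }

    // True only if the file has a recorded checksum and the parsed text hashes to it.
    static boolean matches(Properties checksums, String fileName, String parsedText) {
        String expected = checksums.getProperty(fileName);
        return expected != null && expected.equals(sha256Hex(parsedText));
    }
}
```

Keeping the digests in a data file means updating expectations after a Tika upgrade touches only that file, not the test source.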
:x: Gradle check result for 7e3ec2d4d0594d11ff89253851db5c0b5dece99d: FAILURE
Hi @dblock and @reta, thanks for taking a look! I've moved the checksum map to a separate file.
> Is there any advantage of having those zipped in the repo other than having to unzip them?
I believe the intent is to hide them from the linter. For example, testEXCEL.xls is not UTF-8 and fails the precommit "forbiddenPatterns" task.
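To illustrate why a binary fixture would trip such a check, here is a minimal sketch that treats "is UTF-8" as "decodes under a strict UTF-8 decoder". This is an assumption about the kind of check involved, not a description of how the forbiddenPatterns task is implemented, and the class name is hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: strict UTF-8 validation of raw file bytes.
final class Utf8CheckSketch {

    // Returns true only if the bytes decode as UTF-8 with no malformed sequences.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }
}
```

Legacy .xls files are OLE2 compound documents whose header starts with the bytes D0 CF 11 E0, which is not a valid UTF-8 sequence, so a strict decoder rejects them at the first bytes.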
:x: Gradle check result for 3fcc4bcc3fe0a091d777d37dda557273742b6cb2: FAILURE
Known flaky test: #13600
:x: Gradle check result for 9ae651e74143c679dff57a610e9ef7ec802647cb: FAILURE
:x: Gradle check result for ef62853eb0021108b4ad043d3f704e487c26e0fd: FAILURE
Needs https://github.com/opensearch-project/OpenSearch/pull/13673
@finnegancarroll we sadly have a pretty flaky test suite now. For example, this combination fails for me:
./gradlew ':plugins:ingest-attachment:test' --tests "org.opensearch.ingest.attachment.TikaDocTests.testParseSamples" -Dtests.seed=98D53194946B5C85 -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=hi-IN -Dtests.timezone=Asia/Istanbul
Please let ./gradlew :plugins:ingest-attachment:check run for a couple of hours to make sure the test suite is stable, thank you.
:white_check_mark: Gradle check result for f0cc85434c55fd854a867177e2c6f957b6137ca5: SUCCESS
Codecov Report
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 71.56%. Comparing base (b15cb0c) to head (f0cc854). Report is 286 commits behind head on main.
Additional details and impacted files
@@ Coverage Diff @@
## main #13618 +/- ##
============================================
+ Coverage 71.42% 71.56% +0.14%
- Complexity 59978 61201 +1223
============================================
Files 4985 5059 +74
Lines 282275 287522 +5247
Branches 40946 41646 +700
============================================
+ Hits 201603 205759 +4156
- Misses 63999 64777 +778
- Partials 16673 16986 +313
Removed strict checksum validation for some additional files with locale-dependent parsing. Ran the suite for a couple of hours, and with every locale returned by Locale.getAvailableLocales(), to ensure no flaky cases remain.
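A locale sweep of this kind could be sketched as follows. This is an illustration under stated assumptions, not the code used for this PR: the class and method names are hypothetical, the `Supplier` stands in for "parse one sample document and return its text", and the sweep changes the JVM default locale, which the OpenSearch test framework normally controls through -Dtests.locale.

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Supplier;

// Hypothetical helper: detects output that varies with the default locale.
final class LocaleSweepSketch {

    // Runs the action once per available locale and returns each distinct output,
    // mapped to the language tag of the first locale that produced it.
    static Map<String, String> sweep(Supplier<String> action) {
        Locale original = Locale.getDefault();
        Map<String, String> outputs = new TreeMap<>();
        try {
            for (Locale locale : Locale.getAvailableLocales()) {
                Locale.setDefault(locale);
                outputs.putIfAbsent(action.get(), locale.toLanguageTag());
            }
        } finally {
            // Always restore the default so later code is unaffected.
            Locale.setDefault(original);
        }
        return outputs;
    }
}
```

A result with a single entry suggests the output is locale-independent and safe to checksum strictly. More than one entry, as a locale-sensitive formatter such as `String.format("%,.2f", 1234.5)` would typically produce, marks a candidate for relaxed validation.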
This looks better than what we have, @reta any objections?
It really does. No objections, @dblock, just double-checking that no flakiness is going to be introduced.
Thanks. All yours to merge.