[AUTOCUT] Gradle Check Flaky Test Report for MinDocCountIT

Open opensearch-ci-bot opened this issue 1 year ago • 4 comments

Flaky Test Report for MinDocCountIT

Noticed that MinDocCountIT has some flaky tests that failed during post-merge actions.

Details

| Git Reference | Merged Pull Request | Build Details | Test Name |
|---|---|---|---|
| 42d6af66a461900ffe48252e099738e7152f727c | 14123 | 40485 | org.opensearch.search.aggregations.bucket.MinDocCountIT.testDoubleCountDesc {p0={"search.concurrent_segment_search.enabled":"true"}} |
| c639e9a42e4bf20ee05d0511ad8bfc8af9ecc0f9 | 14090 | 40463 | org.opensearch.search.aggregations.bucket.MinDocCountIT.testHistogramKeyDesc {p0={"search.concurrent_segment_search.enabled":"true"}} |

The other pull requests, besides those involved in post-merge actions, that contain failing tests with the MinDocCountIT class are:

For more details on the failed tests refer to OpenSearch Gradle Check Metrics dashboard.

opensearch-ci-bot, Jun 13 '24 21:06

Looks like the MinDocCountIT.testDoubleCountDesc failure above is due to a concurrent thread context modification bug, for which a fix has been merged: https://github.com/opensearch-project/OpenSearch/pull/14084

jed326, Jun 19 '24 17:06

A quick scan through the other failures, those not related to post-merge actions, indicates that most of the other MinDocCountIT failures are due to the same issue fixed in https://github.com/opensearch-project/OpenSearch/pull/14084.

jed326, Jun 19 '24 17:06

The failure for MinDocCountIT.testHistogramKeyDesc looks like it's finding an incorrect doc count for a given bucket key when minDocCount is used.

It's possible that this is due to concurrent search and minDocCount not being satisfied on one of the shards, but it's also possible that it's due to one of the histogram rewrite changes. Unfortunately, I was not able to reproduce it locally with the test seed.

jed326, Jun 19 '24 18:06

My hunch was that this could be similar to the issue in https://github.com/opensearch-project/OpenSearch/pull/9085 for terms aggregations. However, based on https://github.com/opensearch-project/OpenSearch/blob/e3542011cd9584c354405ba6cbf9a31ced7bc5ce/server/src/main/java/org/opensearch/search/aggregations/bucket/histogram/InternalHistogram.java#L335-L337 it looks like minDocCount is only applied during the coordinator reduce. That makes it unlikely that this is due to concurrent search, as the shard search requests should be identical in both cases.
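To make that reasoning concrete, here is a minimal sketch of applying minDocCount only after merging the per-shard buckets. It is not the actual InternalHistogram code; the class name `MinDocCountReduceSketch`, the methods `mergeShardBuckets` and `reduce`, and the sample data are all hypothetical and exist only for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; these names do not exist in OpenSearch.
public class MinDocCountReduceSketch {

    /** Merges per-shard (bucket key -> doc count) maps by summing the counts for each key. */
    static Map<Double, Long> mergeShardBuckets(List<Map<Double, Long>> shardResults) {
        Map<Double, Long> merged = new TreeMap<>();
        for (Map<Double, Long> shard : shardResults) {
            for (Map.Entry<Double, Long> bucket : shard.entrySet()) {
                merged.merge(bucket.getKey(), bucket.getValue(), Long::sum);
            }
        }
        return merged;
    }

    /** Applies minDocCount once, on the merged counts, mimicking a coordinator-side reduce. */
    static Map<Double, Long> reduce(List<Map<Double, Long>> shardResults, long minDocCount) {
        Map<Double, Long> merged = mergeShardBuckets(shardResults);
        // Drop buckets whose total count across all shards is below the threshold.
        merged.values().removeIf(count -> count < minDocCount);
        return merged;
    }

    public static void main(String[] args) {
        // Made-up shard results: key 0.0 has only one doc on each shard.
        List<Map<Double, Long>> shardResults = List.of(
            Map.of(0.0, 1L, 10.0, 3L),
            Map.of(0.0, 1L, 20.0, 1L)
        );
        // Key 0.0 survives because its merged count is 2; key 20.0 is dropped.
        System.out.println(reduce(shardResults, 2L)); // prints {0.0=2, 10.0=3}
    }
}
```

In this sketch the shards return every bucket unfiltered, so how the documents are split across shards or segment slices does not change the merged counts. That is the property the argument above relies on.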

Will defer to @bowenlan-amzn for any insights related to the recent histogram optimizations.

jed326, Jun 19 '24 18:06

Closing this as this might have been fixed in 2.13

sandeshkr419, Dec 11 '24 17:12