org.apache.druid.indexing.common.task.batch.parallel.PartialCompactionTest is flaky

Open abhishekagarwal87 opened this issue 2 years ago • 1 comments

org.apache.druid.indexing.common.task.batch.parallel.PartialCompactionTest seems to be stalling the build. See this run, for example: https://app.travis-ci.com/github/apache/druid/jobs/575725134

abhishekagarwal87 avatar Jul 06 '22 10:07 abhishekagarwal87

+1, I have noticed this one too. It can also be seen in the logs of #12747

kfaraz avatar Jul 11 '22 05:07 kfaraz

Failed in this build, which is an ARM build:


[ERROR] Tests run: 5, Failures: 4, Errors: 0, Skipped: 0, Time elapsed: 201.022 s <<< FAILURE! - in org.apache.druid.indexing.common.task.batch.parallel.PartialCompactionTest
[ERROR] org.apache.druid.indexing.common.task.batch.parallel.PartialCompactionTest.testPartialCompactHashAndDynamicPartitionedSegments  Time elapsed: 89.626 s  <<< FAILURE!
java.lang.AssertionError: Actual task status: TaskStatus{id=compact_dataSource_fjadibcc_2022-11-09T05:26:19.500Z, status=FAILED, duration=-1, errorMsg=Ran [11] specs, [8] succeeded, [3] failed} expected:<SUCCESS> but was:<FAILED>
	at org.junit.Assert.fail(Assert.java:89)
	at org.junit.Assert.failNotEquals(Assert.java:835)
	at org.junit.Assert.assertEquals(Assert.java:120)
	at org.apache.druid.indexing.common.task.batch.parallel.AbstractMultiPhaseParallelIndexingTest.runTaskAndVerifyStatus(AbstractMultiPhaseParallelIndexingTest.java:179)
	at org.apache.druid.indexing.common.task.batch.parallel.AbstractMultiPhaseParallelIndexingTest.runTask(AbstractMultiPhaseParallelIndexingTest.java:184)
	at org.apache.druid.indexing.common.task.batch.parallel.PartialCompactionTest.testPartialCompactHashAndDynamicPartitionedSegments(PartialCompactionTest.java:141)

I wonder if the test is poorly written: there are many dozens of failures. Do later steps depend on earlier steps completing properly? That would be problematic, since test execution order should not be assumed to be deterministic, and an early failure would then cascade into later ones. If the tests are independent, then something has gone very wrong to produce dozens of failures.

paul-rogers avatar Nov 09 '22 20:11 paul-rogers

This issue has been marked as stale due to 280 days of inactivity. It will be closed in 4 weeks if no further activity occurs. If this issue is still relevant, please simply write any comment. Even if closed, you can still revive the issue at any time or discuss it on the dev@druid.apache.org list. Thank you for your contributions.

github-actions[bot] avatar Jan 07 '24 00:01 github-actions[bot]

This issue has been closed due to lack of activity. If you think that is incorrect, or the issue requires additional review, you can revive the issue at any time.

github-actions[bot] avatar Feb 05 '24 00:02 github-actions[bot]