DataflowTemplates
Find test flakiness when building the project.
Dear developers, when I built the project, I got failures from these tests:
1. com.google.cloud.teleport.splunk.SplunkIOTest#successfulSplunkIOMultiBatchParallelismTest
2. com.google.cloud.teleport.bigtable.CassandraKeyUtilsTest#testSimplePrimaryKeyOrder
3. com.google.cloud.teleport.splunk.SplunkEventWriterTest#successfulSplunkWriteSingleBatchTest
4. com.google.cloud.teleport.splunk.SplunkEventWriterTest#eventWriterInvalidURL
5. com.google.cloud.teleport.splunk.SplunkEventWriterTest#failedSplunkWriteSingleBatchTest
6. com.google.cloud.teleport.bigtable.CassandraRowMapperFnTest#testTimestampColumn
7. com.google.cloud.teleport.bigtable.CassandraRowMapperFnTest#testType4UUIDColumn
8. com.google.cloud.teleport.bigtable.CassandraRowMapperFnTest#testType1UUIDColumn
9. com.google.cloud.teleport.splunk.HttpEventPublisherTest#unrecognizedSelfSignedCertificateTest
However, when I rebuilt the project, these failures disappeared. Are these tests flaky, or is this kind of intermittent failure expected? The test reports are in the attached file: flaky test reports.txt. Could you please have a look at this problem? Thanks a lot.
Hi! Thanks for sharing the report file!
I haven't personally hit flakes, but it looks like some internal metric is marking these as 5-10% flaky. Based on the file you shared, the two things I'm noticing are:
- Errors related to something not happening in X amount of time.
- Attempts to bind to already-bound ports.
Depending on what's causing the second, both might be performance related. I'll see if someone can look into this.
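For the port-binding errors, one common mitigation is to let the operating system pick a free port instead of hard-coding one. The following is only a minimal sketch using the JDK; the `FreePorts` class and `findFreePort` method are hypothetical names for illustration and are not part of DataflowTemplates or MockServer.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;

/** Hypothetical test helper (an assumption for illustration, not project code). */
final class FreePorts {
  private FreePorts() {}

  /**
   * Binds a socket to port 0 so the OS assigns an unused ephemeral port,
   * then closes the socket and returns the port number.
   */
  static int findFreePort() {
    try (ServerSocket socket = new ServerSocket(0)) {
      return socket.getLocalPort();
    } catch (IOException e) {
      // Rethrow unchecked so callers in test setup do not need a throws clause.
      throw new UncheckedIOException(e);
    }
  }
}
```

Note that another process could still claim the port between the moment this helper closes its socket and the moment the test server binds to it, so this narrows the window for collisions rather than eliminating it.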
A couple of us have tried to reproduce it with almost no success. In my case, the longest-running test was about 5-6s, but the tests with a clear timeout are all at 20s.
For any test using org.mockserver.integration.ClientAndServer, we can probably increase the configured timeout, though I think this should be handled by someone who can reproduce the flakes. That would solve at least some of the issues, but we'll likely want a more permanent solution if possible.
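If the timeout does get raised, one option is to make it overridable so slower machines can extend it without a code change. This is a rough sketch only: the `TestTimeouts` class and the `test.timeout.seconds` property name are assumptions made up for illustration, and the 20-second default simply mirrors the value mentioned above. How the resulting `Duration` would be passed to a particular test or client is left open.

```java
import java.time.Duration;

/** Hypothetical helper (an assumption for illustration, not project code). */
final class TestTimeouts {
  /** Hypothetical system property name; not defined by the project. */
  static final String PROPERTY = "test.timeout.seconds";

  /** Default taken from the 20s timeout mentioned in this thread. */
  static final long DEFAULT_SECONDS = 20;

  private TestTimeouts() {}

  /** Returns the timeout from the system property, or the default if unset or invalid. */
  static Duration resolve() {
    String raw = System.getProperty(PROPERTY);
    if (raw == null || raw.trim().isEmpty()) {
      return Duration.ofSeconds(DEFAULT_SECONDS);
    }
    try {
      long seconds = Long.parseLong(raw.trim());
      // Ignore zero or negative values and fall back to the default.
      return seconds > 0 ? Duration.ofSeconds(seconds) : Duration.ofSeconds(DEFAULT_SECONDS);
    } catch (NumberFormatException e) {
      return Duration.ofSeconds(DEFAULT_SECONDS);
    }
  }
}
```

With something like this in place, a test JVM started with `-Dtest.timeout.seconds=60` would get a 60-second timeout, while everyone else keeps the current 20 seconds.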
As a temporary solution, if these are causing issues for development, they can be skipped from the command line using something like:
mvn test -Dtest=\!SplunkIOTest
One issue I was able to kind of reproduce was the HttpEventPublisherTest failure, though it was in a separate test. Someone is looking into a potential cause there.
Thanks for your response! It's really useful.
Thanks for reporting.
I tried running the mentioned test classes and didn't notice any failures.
I'll assume they are fixed or improved and close this issue for now.
Please reopen if there are any new failures or details.