nifi icon indicating copy to clipboard operation
nifi copied to clipboard

NIFI-10623 fix flaky tests in TestHttpClient

Open Kerr0220 opened this issue 2 years ago • 2 comments

Summary

NIFI-10623

These flaky tests were caused by the feature of Map since the order of iterating map by enhanced for loop is non-deterministic. Therefore, checksums calculated based on the stream were also non-deterministic and thus failed when using NonDex to test.

After changing the order to deterministic by first sorting keys, and adjusting the values of checksums in test cases, these tests are no longer flaky.

Tracking

Please complete the following tracking steps prior to pull request creation.

Issue Tracking

Pull Request Tracking

  • [x] Pull Request title starts with Apache NiFi Jira issue number, such as NIFI-00000
  • [x] Pull Request commit message starts with Apache NiFi Jira issue number, as such NIFI-00000

Pull Request Formatting

  • [x] Pull Request based on current revision of the main branch
  • [x] Pull Request refers to a feature branch with one commit containing changes

Verification

Please indicate the verification steps performed prior to pull request creation.

Build

  • [ ] Build completed using mvn clean install -P contrib-check
    • [x] JDK 8
    • [ ] JDK 11
    • [ ] JDK 17

Licensing

  • [ ] New dependencies are compatible with the Apache License 2.0 according to the License Policy
  • [ ] New dependencies are documented in applicable LICENSE and NOTICE files

Documentation

  • [ ] Documentation formatting appears as expected in rendered files

Kerr0220 avatar Oct 11 '22 16:10 Kerr0220

Sorry about that, but let me explain it. This pr is related to #6507 which would ensure that this one can pass all checks since they both work on calculating checksum in deterministic(one from sending side, the other from receiving side). Unfortunately, #6507 may not be approved now and thus this one may cause some tests to fail. I'm currently working on further fixing #6507 with other approaches and if #6507 can be accepted, this one should pass tests. And I will mention to you in the future if #6507 can be accepted. Thank you for your time again!

Kerr0220 avatar Oct 13 '22 00:10 Kerr0220

Thanks for the reply and evaluating the options @Kerr0220. If it makes sense to combine the changes in this pull request with the changes in the other pull request so that everything passes, feel free to merge the changes so they can be reviewed together. Alternatively, if you would like to keep them separate, we can come back to this pull request and rebase it once the other changes are finalized and merged.

exceptionfactory avatar Oct 13 '22 21:10 exceptionfactory

@Kerr0220 I am closing this PR for now, pending updates to the linked PR. Feel free to rebase this PR and reopen it once the changes in the other PR have been completed and merged.

exceptionfactory avatar Oct 27 '22 15:10 exceptionfactory

@exceptionfactory Hi, could you please reopen this PR? I've pushed a new commit to this PR.

Kerr0220 avatar Dec 09 '22 22:12 Kerr0220

@exceptionfactory Could you please help me figure out the reason for the cancellation of ci-workflow / Ubuntu Zulu JDK 17 EN (pull_request)? Other checks can pass now except this one. :)

Kerr0220 avatar Dec 10 '22 17:12 Kerr0220

Thanks for the contribution @Kerr0220.

The TestHttpClient class does not fail under normal execution, so any changes must ensure consistent test behavior. These changes break other tests that depend on the current behavior as shown in the automated build results:

Error:  org.apache.nifi.remote.protocol.http.TestHttpFlowFileServerProtocol.testTransferOneFile  Time elapsed: 0.002 s  <<< ERROR!
java.io.IOException: StandardHttpFlowFileServerProtocol[CommsID=testTransferOneFile] Sent data to peer Peer[url=http://peer-host:8080/] but calculated CRC32 Checksum as 194624838 while peer calculated CRC32 Checksum as 3229577812; canceling transaction and rolling back session
	at org.apache.nifi.remote.protocol.AbstractFlowFileServerProtocol.commitTransferTransaction(AbstractFlowFileServerProtocol.java:345)
	at org.apache.nifi.remote.protocol.http.StandardHttpFlowFileServerProtocol.commitTransferTransaction(StandardHttpFlowFileServerProtocol.java:203)
	at org.apache.nifi.remote.protocol.http.TestHttpFlowFileServerProtocol.testTransferOneFile(TestHttpFlowFileServerProtocol.java:312)

Please follow the pull request checklist and run a full build using mvn clean install to ensure that all tests pass.

Although there may be value in making adjustments to use LinkedHashMap and ordering in several cases, the impact is more broad than currently scoped.

This problem has been fixed now. And all 5 checks can pass. Please have a look.

Kerr0220 avatar Dec 10 '22 18:12 Kerr0220

@exceptionfactory That's a good idea. I will have a try!

Kerr0220 avatar Dec 14 '22 23:12 Kerr0220

@Kerr0220 if you are still working on this issue, please rebase against the current main branch. Changes in PR #6748 included updates to MockFlowFile that are applicable to this test class.

exceptionfactory avatar Jan 25 '23 22:01 exceptionfactory

@Kerr0220 As mentioned recently, changes from another PR should have improved and impacted behavior for this particular pull request. At this point, it would be best to open a new pull request when you are ready for further review.

exceptionfactory avatar Feb 06 '23 18:02 exceptionfactory