swift-nio-ssl
flaky test: NIOSSLIntegrationTest.testNewCallbackCanDelayHandshake
We were stuck for 17 minutes in NIOSSLIntegrationTest.testNewCallbackCanDelayHandshake:
21:56:36 Test Case 'NIOSSLIntegrationTest.testNewCallbackCanDelayHandshake' started at 2020-03-03 21:56:36.961
22:13:09 Build timed out (after 20 minutes). Marking the build as failed.
(https://ci.swiftserver.group/job/swift-nio-ssl2-swift50-prb/196/console)
Thoughts on using an XCTestExpectation here, so that a hang results in a test failure instead?
Something like:
let completionExpectation = XCTestExpectation(description: "CompletionPromiseExpectation")
completionPromiseFiredLock.withLock {
    XCTAssertFalse(completionPromiseFired)
    completionExpectation.fulfill()
}
// Ok, allow the handshake to run.
handshakeCompletePromise!.succeed(.certificateVerified)
let newBuffer = try completionPromise.futureResult.wait()
XCTAssertTrue(completionPromiseFired)
XCTAssertEqual(newBuffer, originalBuffer)
wait(for: [completionExpectation], timeout: 5.0)
That works, but doesn't really address the flakiness of the test. In practice I mostly don't mind it if a hang results in failure, because in either case we really do have to fix this issue.
Yeah, I think hangs are a good enough signal too. There's no way we could catch everything that could possibly hang, so if we started using XCTestExpectations, a hanging test would either hang or fail its expectation. I'd rather have them just hang, because it's easier to code and gives us one well-defined failure mode.
Sure, that all makes sense. I will sideline this and keep an eye out for anything I see that may be the root cause of this issue. Thanks.
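For reference, one caveat with the snippet above: the expectation is fulfilled before the blocking `completionPromise.futureResult.wait()` call, so if that call never returns, the later `wait(for:timeout:)` is never reached and the test still hangs. A hang only becomes a failure if the blocking call itself is bounded. Below is a minimal sketch of that idea using only Foundation and Dispatch. `ResultBox` and `waitWithTimeout` are hypothetical names invented for illustration; they are not part of swift-nio-ssl, SwiftNIO, or XCTest.

```swift
import Foundation

/// Hypothetical thread-safe holder for a value produced on another thread.
final class ResultBox<T>: @unchecked Sendable {
    private let lock = NSLock()
    private var value: T?

    func store(_ newValue: T) {
        lock.lock()
        value = newValue
        lock.unlock()
    }

    func load() -> T? {
        lock.lock()
        defer { lock.unlock() }
        return value
    }
}

/// Hypothetical helper: runs a blocking `body` on a background queue and
/// returns its result, or nil if it has not finished within `timeout`.
func waitWithTimeout<T: Sendable>(
    _ timeout: DispatchTimeInterval,
    _ body: @escaping @Sendable () -> T
) -> T? {
    let box = ResultBox<T>()
    let done = DispatchSemaphore(value: 0)

    DispatchQueue.global().async {
        box.store(body())
        done.signal()
    }

    // Give up once the deadline passes instead of blocking forever.
    guard done.wait(timeout: .now() + timeout) == .success else {
        return nil
    }
    return box.load()
}

// A call that completes quickly yields its value.
let fast = waitWithTimeout(.seconds(1)) { 42 }

// A call that outlives the deadline yields nil, which a test could
// turn into an assertion failure.
let slow = waitWithTimeout(.milliseconds(50)) { () -> Int in
    Thread.sleep(forTimeInterval: 1.0)
    return 0
}
```

In a test, the body would wrap the blocking call (for example with `try?`) and a nil result would be reported as a failure. The trade-off matches the one discussed above: a body that truly hangs keeps its background thread blocked, and the test now has two failure modes rather than one.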
Oof. And again. https://ci.swiftserver.group/job/swift-nio-ssl2-swift52-prb/61/console
08:57:08 Test Case 'NIOSSLIntegrationTest.testNewCallbackCanDelayHandshake' started at 2020-10-19 07:57:08.018
09:25:41 Build timed out (after 30 minutes). Marking the build as failed.
Hit again in CI in #293.
Hit again in CI in #295.
Hit again in CI on nightly in #299.
Actually hit repeatedly there, and a few times in 5.0 as well. It seems to manifest more under load. I've glanced at this before and was never able to diagnose anything, so I may need to do a more thorough investigation.
Hit again on CI in nightly on #300.
Hit again in https://github.com/apple/swift-nio-ssl/pull/333 on 5.3
And another one on 5.3 in #339: https://ci.swiftserver.group/job/swift-nio-ssl2-swift53-prb/237/console
Hit on 5.6 in https://github.com/apple/swift-nio-ssl/pull/365