vertx-mqtt icon indicating copy to clipboard operation
vertx-mqtt copied to clipboard

Fix flaky connackNotOk test by explicitly closing server

Open williamlin0825 opened this issue 1 year ago • 3 comments

Fix OD flakiness by closing Vert.x MQTT server after connackNotOk test in MqttClientConnectIT

This PR closes the MQTT server after the connackNotOk test to prevent it from remaining active. Previously, the server was still running after the test finished, causing other tests (e.g., testValidClientIdentifier in MqttServerClientIdentifierTest) to fail when they couldn't start a new server instance. This OD flakiness was detected by iDFlakies.

{
    "dts": [
        {
            "name": "io.vertx.mqtt.test.server.MqttServerClientIdentifierTest.testValidClientIdentifier",
            "intended": {
                "order": [
                    "io.vertx.mqtt.it.MqttClientConnectIT.connackNotOk"
                ],
                "result": "ERROR",
                "testRunId": "1730911774733-f1313efd-d311-41b9-b99e-e99db69d4f7c"
            },
            "revealed": {
                "order": [],
                "result": "PASS",
                "testRunId": "1730911782385-e91a008c-d3a6-49b5-bcdc-8136b7868a3f"
            },
            "type": "OD"
        }
    ]
}

williamlin0825 avatar Nov 06 '24 16:11 williamlin0825

Thanks for starting this PR @williamlin0825

I haven't seen a failing job in CI, can you tell us more about why this test in particular fails intermittently.

I'm not familiar with vertx-mqtt, but looking at the test class, it seems the Vertx and MqttClient objects should be stored in test class fields. Then, they could be properly closed in a tearDown method.

tsegismont avatar Nov 07 '24 14:11 tsegismont

@tsegismont iDFlakies runs a bunch of tests with arbitrary sequences, and I found that many tests that require the MQTT server to start before them show a connection failure after this test runs.

The other tests in the MqttClientConnectIT class don't create a server instance themselves but rely on the existing container setup. Also, there is no tearDown method in the MqttClientBaseIT class, which is the base class of MqttClientConnectIT. Therefore, the bug occurs when the connackNotOk test starts the MQTT server but doesn't close it explicitly.

williamlin0825 avatar Nov 07 '24 17:11 williamlin0825

I still don't understand why it doesn't fail in the CI environment.

Anyway, I think closing the server and Vert.x objects at the end of the test is not a good practice. Instead, it should be closed in a tearDown. Can you take care of this?

tsegismont avatar Nov 12 '24 14:11 tsegismont