activemq icon indicating copy to clipboard operation
activemq copied to clipboard

#AMQ-8525 Improve performance of mqtt module

Open jeanouii opened this issue 1 month ago • 13 comments

jeanouii avatar Nov 19 '25 09:11 jeanouii

Some stats:

activemq-mqtt build times on MBP M3 14:13 w/o parallel 05:56 w/ parallel

mattrpav avatar Nov 20 '25 18:11 mattrpav

Commit message is misleading as it only focuses on improving the test execution time not the MQTT code itself.

tabish121 avatar Nov 20 '25 18:11 tabish121

So one big problem here is that this is going to couple things to Junit 4.

Now that Junit 6 is out we should be looking to migrate towards that otherwise all of this may have to be redone or thrown away so I'm not sure it makes sense to do any of this work until we migrate to an upgraded Junit.

cshannon avatar Nov 21 '25 15:11 cshannon

I think we could tackle JUnit 6 migration one module at a time as well. That way we don't need one giant PR (esp for activemq-unit-tests module).

mattrpav avatar Nov 21 '25 16:11 mattrpav

We still have old style JUnit 3 tests. I fully agree we should modernize the tests, a lot need to be rewritten as well and some might be removed all together. But this is another work in my opinion. And having this one get in would make rework easier because we get faster builds. So I see this first step as an enabler

Locally, I can observe the same as @mattrpav

activemq-mqtt takes

  • 14:23 without
  • 6:34 with

jeanouii avatar Nov 21 '25 18:11 jeanouii

The issue here is all of the changes are going to make it even harder to migrate which is counter productive when we need to migrate exactly because we are still using Junit 3/4 which is very much out of date.

cshannon avatar Nov 21 '25 18:11 cshannon

So I think to follow up the real question is just what happens when we convert tests? is there a path forward or does all of this have to be rewritten again? Some stuff is easy to migrate in Junit 5/6 and other things are completely different

cshannon avatar Nov 21 '25 18:11 cshannon

Aside from a few tests where I had to make the data directory dynamic (so parallel executions each get their own directory), the rest of the changes are simply adding @Category on test classes that can safely run in parallel. I went through the modules manually to identify which tests were compatible, which took some time. Many of the remaining ones could be migrated as well, but they need deeper refactoring, so they aren’t part of the “quick win” I was aiming for here.

The rest is just Surefire configuration.

Regarding migrating to JUnit 5/6: I agree that @Category → @Tag and Surefire → <includeTags> is straightforward. If your suggestion is to go that route instead, I’m perfectly fine with it. Vintage is deprecated but still available in JUnit 6, so it gives us some runway.

I also fully agree that most tests ultimately need to be modernized. Many still inherit from TestCase, so we’ll need to add proper @Test annotations, use before/after hooks, and switch to assertions for timeouts and exceptions. With the current codebase, that’s a substantial amount of work. I know the community is interested in pushing towards that modernization—which is great—but in the meantime getting faster CI feedback (2–3 hours instead of ~7 hours) would already be a big improvement.

What do you think?

jeanouii avatar Nov 22 '25 10:11 jeanouii

If I understand the comment about JUnit 6, I think it's a separate and more global effort.

This PR is focusing on test performance improvements on the MQTT module. We need consistency on the contribution.

I would rather focus on improvements without other changes, and do the JUnit 6 upgrade effort in another PR.

jbonofre avatar Nov 22 '25 10:11 jbonofre

Regarding migrating to JUnit 5/6: I agree that @category → @tag and Surefire → is straightforward. If your suggestion is to go that route instead, I’m perfectly fine with it. Vintage is deprecated but still available in JUnit 6, so it gives us some runway.

Thanks for pointing out the @tag annotation, I read up a bit on it. I wasn't trying to say we should combine Junit 6 with the performance work, etc. just simply that I didn't want to make it harder to migrate. Many tests are easy to migrate from Junit 4 but some things are much harder (for example only in 5.13 did they finally add class level parameterized tests, etc). I wasn't sure if using the @Category tag was going to cause issues with the migration and further delay it, but because we can simply move to @Tag I think it's fine to go ahead with this PR. It looks pretty straight forward so it should not be an issue.

cshannon avatar Nov 22 '25 17:11 cshannon

I did some more review and I was able to add more tests to parallel execution. I also reworked the output for clarity (see email thread I sent).

From 14'23 locally to 3'34 locally

[INFO] --- surefire:3.5.3:test (parallel) @ activemq-mqtt ---
[INFO] Using configured provider org.apache.maven.surefire.junitcore.JUnitCoreProvider
[INFO]
[INFO] -------------------------------------------------------
[INFO]  T E S T S
[INFO] -------------------------------------------------------
[INFO] ---org.apache.activemq.transport.mqtt.auto.MQTTAutoSslAuthTest - 2.150 s
[INFO]    +-- [OK] testMQTT311Connection[scheme=auto+nio+ssl] - 2.058 s
[INFO]    '-- [OK] testMQTT311Connection[scheme=auto+ssl] - 0.086 s
[INFO] ---org.apache.activemq.transport.mqtt.MQTTMaxFrameSizeTest - 2.370 s
[INFO]    +-- [OK] testFrameSizeNotExceededWorks[mqtt] - 1.326 s
[INFO]    +-- [OK] testFrameSizeToLargeClosesConnection[mqtt] - 0.020 s
[INFO]    +-- [OK] testFrameSizeNotExceededWorks[mqtt+ssl] - 0.787 s
[INFO]    +-- [OK] testFrameSizeToLargeClosesConnection[mqtt+ssl] - 0.053 s
[INFO]    +-- [OK] testFrameSizeNotExceededWorks[mqtt+nio] - 0.034 s
[INFO]    +-- [OK] testFrameSizeToLargeClosesConnection[mqtt+nio] - 0.017 s
[INFO]    +-- [OK] testFrameSizeNotExceededWorks[mqtt+nio+ssl] - 0.052 s
[INFO]    '-- [OK] testFrameSizeToLargeClosesConnection[mqtt+nio+ssl] - 0.076 s
[INFO] ---org.apache.activemq.transport.mqtt.MQTTCodecTest - 2.710 s
[INFO]    +-- [OK] testMessageDecodingCorrupted - 1.158 s
[INFO]    +-- [OK] testMessageDecodingPerformance - 1.451 s
[INFO]    +-- [OK] testConnectWithCredentialsBackToBack - 0.023 s
[INFO]    +-- [OK] testProcessInBytes - 0 s
[INFO]    +-- [OK] testProcessInChunks - 0 s
[INFO]    +-- [OK] testEmptyConnectBytes - 0.012 s
[INFO]    +-- [OK] testMessageDecoding - 0.006 s
[INFO]    '-- [OK] testConnectThenSubscribe - 0 s
[INFO] ---org.apache.activemq.transport.mqtt.MQTTOverlapedSubscriptionsTest - 3.451 s
[INFO]    '-- [OK] testMqttResubscribe - 3.449 s
[INFO] ---org.apache.activemq.transport.mqtt.MQTTProtocolConverterTest - 4.042 s
[INFO]    +-- [OK] testConnectWithInvalidProtocolVersionToHigh - 4.023 s
[INFO]    +-- [OK] testConcurrentOnTransportError - 0.015 s
[INFO]    '-- [OK] testConnectWithInvalidProtocolVersionToLow - 0.001 s
[INFO] ---org.apache.activemq.transport.mqtt.MQTTWillTest - 5.030 s
[INFO]    +-- [OK] testRetainWillMessage - 4.985 s
[INFO]    '-- [OK] testWillMessage - 0.040 s
[INFO] ---org.apache.activemq.transport.mqtt.MQTTSubscriptionRecoveryTest - 6.886 s
[INFO]    +-- [OK] testDurableSubscriptionsAreRecovered[mqtt-virtual-topic-subscriptions] - 4.499 s
[INFO]    '-- [OK] testDurableSubscriptionsAreRecovered[mqtt-default-subscriptions] - 2.375 s
[INFO] ---org.apache.activemq.transport.mqtt.MQTTCompositeQueueRetainedTest - 9.690 s
[INFO]    '-- [OK] testSendMQTTReceiveJMSCompositeDestinations - 9.683 s
[INFO] ---org.apache.activemq.transport.mqtt.MQTTPingReqTest - 5.227 s
[INFO]    +-- [OK] testPingReqWithoutConnectFail[mqtt-version:3.1] - 5.010 s
[INFO]    +-- [OK] testPingReqConnectSuccess[mqtt-version:3.1] - 0.037 s
[INFO]    +-- [OK] testPingReqWithoutConnectFail[mqtt-version:3.1.1] - 0.133 s
[INFO]    '-- [OK] testPingReqConnectSuccess[mqtt-version:3.1.1] - 0.024 s
[INFO] ---org.apache.activemq.transport.mqtt.MQTTAuthTest - 19.55 s
[INFO]    +-- [OK] testWildcardRetainedSubscriptionLocked[mqtt] - 2.129 s
[INFO]    +-- [OK] testWildcardRetainedSubscription[mqtt] - 1.025 s
[INFO]    +-- [OK] testBadCredentialExceptionGetsCorrectErrorCode[mqtt] - 0.058 s
[INFO]    +-- [OK] testInvalidClientIdGetCorrectErrorCode[mqtt] - 0.024 s
[INFO]    +-- [OK] testFailedSubscription[mqtt] - 1.038 s
[INFO]    +-- [OK] testPublishWhenNotAuthorizedDoesNotStall[mqtt] - 0.014 s
[INFO]    +-- [OK] testBadUserNameOrPasswordGetsConnAckWithErrorCode[mqtt] - 0.032 s
[INFO]    +-- [OK] testAnonymousUserConnect[mqtt] - 0.046 s
[INFO]    +-- [OK] testWildcardRetainedSubscriptionLocked[mqtt+ssl] - 2.342 s
[INFO]    +-- [OK] testWildcardRetainedSubscription[mqtt+ssl] - 1.965 s
[INFO]    +-- [OK] testBadCredentialExceptionGetsCorrectErrorCode[mqtt+ssl] - 0.236 s
[INFO]    +-- [OK] testInvalidClientIdGetCorrectErrorCode[mqtt+ssl] - 0.123 s
[INFO]    +-- [OK] testFailedSubscription[mqtt+ssl] - 1.306 s
[INFO]    +-- [OK] testPublishWhenNotAuthorizedDoesNotStall[mqtt+ssl] - 0.355 s
[INFO]    +-- [OK] testBadUserNameOrPasswordGetsConnAckWithErrorCode[mqtt+ssl] - 0.055 s
[INFO]    +-- [OK] testAnonymousUserConnect[mqtt+ssl] - 0.096 s
[INFO]    +-- [OK] testWildcardRetainedSubscriptionLocked[mqtt+nio] - 1.044 s
[INFO]    +-- [OK] testWildcardRetainedSubscription[mqtt+nio] - 1.183 s
[INFO]    +-- [OK] testBadCredentialExceptionGetsCorrectErrorCode[mqtt+nio] - 0.075 s
[INFO]    +-- [OK] testInvalidClientIdGetCorrectErrorCode[mqtt+nio] - 0.021 s
[INFO]    +-- [OK] testFailedSubscription[mqtt+nio] - 1.203 s
[INFO]    +-- [OK] testPublishWhenNotAuthorizedDoesNotStall[mqtt+nio] - 0.084 s
[INFO]    +-- [OK] testBadUserNameOrPasswordGetsConnAckWithErrorCode[mqtt+nio] - 0.085 s
[INFO]    +-- [OK] testAnonymousUserConnect[mqtt+nio] - 0.125 s
[INFO]    +-- [OK] testWildcardRetainedSubscriptionLocked[mqtt+nio+ssl] - 1.146 s
[INFO]    +-- [OK] testWildcardRetainedSubscription[mqtt+nio+ssl] - 1.205 s
[INFO]    +-- [OK] testBadCredentialExceptionGetsCorrectErrorCode[mqtt+nio+ssl] - 0.284 s
[INFO]    +-- [OK] testInvalidClientIdGetCorrectErrorCode[mqtt+nio+ssl] - 0.222 s
[INFO]    +-- [OK] testFailedSubscription[mqtt+nio+ssl] - 1.411 s
[INFO]    +-- [OK] testPublishWhenNotAuthorizedDoesNotStall[mqtt+nio+ssl] - 0.254 s
[INFO]    +-- [OK] testBadUserNameOrPasswordGetsConnAckWithErrorCode[mqtt+nio+ssl] - 0.079 s
[INFO]    '-- [OK] testAnonymousUserConnect[mqtt+nio+ssl] - 0.262 s
[INFO] ---org.apache.activemq.transport.mqtt.PahoMQTTTest - 31.03 s
[INFO]    +-- [OK] testSubs - 2.648 s
[INFO]    +-- [OK] testActiveMQWildCards1 - 0.361 s
[INFO]    +-- [OK] testCleanSession - 4.292 s
[INFO]    +-- [OK] testOverlappingTopics - 6.426 s
[INFO]    +-- [OK] testLotsOfClients - 13.81 s
[INFO]    +-- [OK] testClientIdSpecialChars - 3.170 s
[INFO]    '-- [OK] testSendAndReceiveMQTT - 0.318 s
[INFO] ---org.apache.activemq.transport.mqtt.PahoMQTTNIOTest - 31.03 s
[INFO]    +-- [OK] testSubs - 2.667 s
[INFO]    +-- [OK] testActiveMQWildCards1 - 0.358 s
[INFO]    +-- [OK] testCleanSession - 4.386 s
[INFO]    +-- [OK] testOverlappingTopics - 6.364 s
[INFO]    +-- [OK] testLotsOfClients - 13.74 s
[INFO]    +-- [OK] testClientIdSpecialChars - 3.188 s
[INFO]    '-- [OK] testSendAndReceiveMQTT - 0.323 s
[INFO] ---org.apache.activemq.transport.mqtt.PahoVirtualTopicMQTTTest - 32.03 s
[INFO]    +-- [OK] testVirtualTopicQueueRestore - 2.820 s
[INFO]    +-- [OK] testSubs - 0.366 s
...
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  03:34 min
[INFO] Finished at: 2025-11-24T13:29:15+01:00
[INFO] ------------------------------------------------------------------------

jeanouii avatar Nov 24 '25 12:11 jeanouii

In the CI, went from 20:56 to 5'01

jeanouii avatar Nov 25 '25 10:11 jeanouii

Are you guys waiting for something else on this one?

jeanouii avatar Dec 01 '25 20:12 jeanouii

Thanks @mattrpav for the review. You asked to split the big PR into a PR per module, that's why I put everything in the module. I can for sure extract to the parent pom if you prefer. But we will need to solve probably conflicts for all other PRs FYI

jeanouii avatar Dec 07 '25 22:12 jeanouii

@jeanouii checkout my minor update to your branch.

  1. Use properties for versions of plugins
  2. Move the 'advanced' surefire plugin configuration to a profile

Please rebase and squash your commits to one commit and I think this one is ready to merge.

mattrpav avatar Dec 10 '25 15:12 mattrpav

@mattrpav really I don't see the benefit of moving the tests running in parallel into another profile. Not only it creates yet another profile, but there is a risk that the profile is skipped or not activated and therefore the coverage is lighter. When I first started working on running tests in parallel, I had one constraint in mind: it should not require another profile to activate and it must run at least the same number of tests as today. So not only there is now a new profile and we have too many already in my opinion, but there is a risk of regression as Romain pointed out. So if you want to keep it separated, that's fine, but I don't think it's a good thing.

jeanouii avatar Dec 12 '25 13:12 jeanouii