activemq
activemq copied to clipboard
[AMQ-9394] Tech Preview: Virtual Thread support
Tasks:
- [x] Throw exception if virtualThreadTaskRunner is enabled and JDK 21 (or higher) is not available
- [x] Breakout the Virtual Thread factory init so it only logs on JDK 17
- [x] Add webpage with instructions and implementation progress status (https://activemq.apache.org/virtual-threads)
- [x] Update 'yield()' usage to have proper syntax
- [x] Add a VirtualThreadTaskRunner
- [x] Add activemq-client-jdk21-test module
- [x] Add activemq-client-jdk21 to the assembly
- [x] Add @Experimental annotation to communicate PREVIEW status
- [ ] Add multi-consumer, multi-queue test results
VirtualThreadTaskRunnerBrokerTest results:
[INFO] -------------------------------------------------------
[INFO] T E S T S
[INFO] -------------------------------------------------------
[INFO] Running org.apache.activemq.broker.VirtualThreadTaskRunnerBrokerTest
[INFO] Tests run: 127, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 15.01 s -- in org.apache.activemq.broker.VirtualThreadTaskRunnerBrokerTest
[INFO]
[INFO] Results:
[INFO]
[INFO] Tests run: 127, Failures: 0, Errors: 0, Skipped: 0
[DRAFT] Virtual Thread Phase 1 roadmap:
- (This PR) Provide hooks to enable Virtual Thread executor to get clock started on non-production environment testing and allow for profiling.
- Refactor all Thread Local references (there are only a few, non-impacting)
- Refactor Queue/Topic usage of synchronized -> reentrantLock
- Replace object monitor mutexes with Condition (not a lot of these either)
Once the above is done, that should provide the basic improvements for Virtual Thread scaling for lots of queues and topics. Once that is done, we can start the heavier refactoring to reduce blocking altogether.
[DRAFT] Virtual Thread Phase 2 roadmap:
- Add fast-return on queue iterate() to return when queue size is zero (no work needed)
- Refactor other areas that use a separate thread to just use a task or run in the queue iterate() -- expiration checking, producer flow control, etc
- Refactor client-side thread handling to do Future -> CompletableFutures. (I think the client-side is especially primed to take advantage of modern JDK threading features.)
- Update ConnectionFactory to support injecting a globally shared ExecutorService. So for high-density deployments you could have lots of connection factories using a shared Virtual Thread pool across REST service, other JMS integrations, etc.
I should also add the caveat that while I said in my last comment I think it makes no sense to stick with the TaskRunner
interface that there is still a possibility we may ultimately need to. We have a lot of custom code and impl there so we need to really dive into it to see if we can get rid of it or not or if we need to keep parts of it. There's also the possibility that we could refactor it but it's so much work it isn't worth it, but I hope not.
My initial assumption is we should be able to refactor things with modern concurrent features of java and get rid of some of that but obviously TBD. So, I don't think you should delete any of this code as maybe we end up having to use the new virtual thread task runner you created, but since that's TBD and also a tech preview that is being tested, for now I think it's best in another branch for until we figure out the plan.
I think we should take a look into the current implementations and see if they can be removed or improve/simplified as part of this. I can take a look soon.
@cshannon thanks for taking the time to review and providing the thoughtful feedback.
I agree, the threading model can be improved and leveraging more modern Java constructs would improve performance and maintainability. Overall, I think ActiveMQ is a good candidate for Virtual Threads, because the key locks are already ReentrantLocks and we have NIO on the network layer. There are a few scatter synchronized methods/blocks in queue and topic, but those can be readily refactored.
From my research of other OSS projects' (Tomcat, etc), they took a similar approach as I have proposed here -- getting Virtual Thread executor support as a configurable piece, and then use that to identify hot spots through profiling and end-user testing. From that perspective, I do feel strongly that we should work to get a configurable approach into the hands of end-users so we can get runtime hours in non-production environments. Unfortunately, I think the days of power users testing from a branch are behind us, so if we could work to some sort of compromise where its in the dist, but not guaranteed to not change I think we get the benefit of end-user testing which is really critical for this type of change.
My intent with 'Tech Preview' is to communicate that Virtual Thread support is available for testing, but not guaranteed to remain unchanged. I added the webpage to communicate that as well.
My initial comments of not to include it were because I thought you meant with Tech Preview as it's just for testing, etc and not really intended for users yet. If your intent is for users to try it out and then I think it would be ok to merge if we mark it as experimental/beta in the code itself besides just the documentation.
As I already stated, I want to look at refactoring all of the TaskRunner
usage, so it's possible the final result of this looks completely different and I just didn't want to merge something in that was a preview/testing feature that might break users.
Maybe we could add something similar to @Beta annotation that Guava has to mark it as a preview feature and subject to breaking changes or removal as we don't know how it will go and I don't want to add it in if we can't get rid of it later.
My guess is by the time we hit AMQ 7.0 it would certainly be considered stable but could be earlier if we did work on the threading.
I suppose the other result of this change is it requires JDK 21 to build going forward for releases which I'm not sure how I feel about. The modules are optional but obviously we need to build them if we plan to release them.
My initial comments of not to include it were because I thought you meant with Tech Preview as it's just for testing, etc and not really intended for users yet. If your intent is for users to try it out and then I think it would be ok to merge if we mark it as experimental/beta in the code itself besides just the documentation.
Sounds good, I've added additional tasks here and will publish additional testing results.
Maybe we could add something similar to @beta annotation that Guava has to mark it as a preview feature and subject to breaking changes or removal as we don't know how it will go and I don't want to add it in if we can't get rid of it later.
A bit hacky, but we could use @Deprecated with text that it is really beta info. Thoughts?
My guess is by the time we hit AMQ 7.0 it would certainly be considered stable but could be earlier if we did work on the threading.
Yep!
I would not use @deprecated as that means something completely different. We should probably just create our own annotation and call it @Experimental
or @Beta
etc
ARTEMIS-4937 reminded me to ask this question as I forgot, have you tried benchmarking this all or trying to verify we won't run into performance issues with pinning? I know your PR mentions refactoring but it hasn't been done and synchronized blocks are used all over the place. So I assume this will suffer from that for now, but it of course depends if it's a real issue but something we should communicate if it is.
Looks like at least the issue is on it's way to being solved (hopefully for the next LTS release): https://mail.openjdk.org/pipermail/loom-dev/2024-May/006632.html
If testing shows there are issues with performance/pinning (and I'm sure there are) we may want to wait to release this as experimental or not, it would defeat the purpose of releasing it. Or at the very least have a big warning etc.
I personally don't think we should go around and just start messing with replacing a bunch of synchronized blocks just because of this. While we do need to fix the locking and synchronization in the broker, that is a major effort and needs to be well planned and is non-trival and not something that would likely happen in a 6.x release as it would be nice to modernize things for 7.x. And because there's a fix coming in the JDK (hopefully with the 25 LTS release) we could just wait for that.
Other projects like Caffeine are waiting for the fix as well.
@cshannon agree. Whether it is called “tech preview” or “experimental” is not a big deal to me.
This first pass is about getting some modules so unit/itests and profiling work can begin. Having a module allows other contributors start looking at it as well. This is sufficiently complex to get going, there is no user “accidentally” turns this on in production.
I will post test tool commands and results before merging any of this.
As far as refactoring goes, I agree. I don’t think any wholesale search+replace of synchronized is a safe approach for stability. I was thinking about modernization of the code base, and there are plenty of extension points we could leverage to test new code paths that are more VT-friendly using new impls vs refactor-in-place.
TransportConnector PersistenceAdapter QueueFactory (new idea to return different queue impls based on policy entry) .. etc
The problem here is the stated goals you just listed argue for NOT releasing this. Releases are for delivering to end users, but the goal here is not that. The goals are for profiling, testing, unit tests and for other contributors to get involved.
The more I think about this the more I think it's just a really bad idea to deliver something that is completely untested and not even close to ready and likely has actual problems to end users.
Instead i think a better approach is what we did on Accumulo when we had our long running "elasticity" branch. We can create a branch for this work in repo that way others can contribute and we can work on the feature until it's closer to being ready.
- We should create a new branch (call it something like virtual-threads or whatever you want) and use that as the basis of this work.
- We can keep it up to date by periodically merging main into the branch (merge only, no rebases as we don't want to break history)
- Developers can open up Pull Requests against this new branch for virtual thread changes so that we can work on pieces at a time and merge them in as they are done.
- When it's ready, we merge this branch back into main.
So I will say that maintaining another branch can be a bit of a pain so I think the other option I would be ok with if we want to release this now would be to just improve the documentation a lot.
The feature is of course marked as a tech preview and experimental but it's not super clear as to the drawbacks or warnings if you try and use it. This page only currently lists benefits and why you would want to use it so it's kind of almost tempting an end user to want to turn it on prematurely.
So I think maybe the following:
- Update the documentation page to talk about more about the current state and how far a long the implementation is. We should make it clear it's just for testing and evaluation and basically define why it's a Tech Preview and experimental.
- The page has a benefits section but we could also add a section on potential problems and warnings for now.
- Update all the Javadocs for the new classes to give information on potential warnings and pitfalls. They are marked with the experimental annotation but there's not a lot of information as to why.