ConcurrentModificationExceptions from UiInternals
Description of the bug
When interacting with the UI over multiple threads, the UiInternals class sometimes throws ConcurrentModificationExceptions. These even occur within the access/accessSyncronously Methods (first and second stacktrace).
Stacktrace 1:
java.util.ConcurrentModificationException: null at java.util.LinkedHashMap$LinkedHashIterator.nextNode(LinkedHashMap.java:1023) at java.util.LinkedHashMap$LinkedKeyIterator.next(LinkedHashMap.java:1046) at java.util.Spliterators$IteratorSpliterator.tryAdvance(Spliterators.java:1950) at java.util.stream.ReferencePipeline.forEachWithCancel(ReferencePipeline.java:129) at java.util.stream.AbstractPipeline.copyIntoWithCancel(AbstractPipeline.java:527) at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:513) at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499) at java.util.stream.MatchOps$MatchOp.evaluateSequential(MatchOps.java:230) at java.util.stream.MatchOps$MatchOp.evaluateSequential(MatchOps.java:196) at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) at java.util.stream.ReferencePipeline.anyMatch(ReferencePipeline.java:632) at com.vaadin.flow.component.internal.UIInternals.isDirty(UIInternals.java:1268) at com.vaadin.flow.component.UI.push(UI.java:757) at com.vaadin.flow.server.VaadinSession.unlock(VaadinSession.java:769) at com.vaadin.flow.server.VaadinSession.accessSynchronously(VaadinSession.java:1020)
Stacktrace 2:
java.util.ConcurrentModificationException: null at java.util.LinkedHashMap$LinkedHashIterator.nextNode(LinkedHashMap.java:1023) at java.util.LinkedHashMap$LinkedKeyIterator.next(LinkedHashMap.java:1046) at java.util.Spliterators$IteratorSpliterator.tryAdvance(Spliterators.java:1950) at java.util.stream.ReferencePipeline.forEachWithCancel(ReferencePipeline.java:129) at java.util.stream.AbstractPipeline.copyIntoWithCancel(AbstractPipeline.java:527) at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:513) at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499) at java.util.stream.MatchOps$MatchOp.evaluateSequential(MatchOps.java:230) at java.util.stream.MatchOps$MatchOp.evaluateSequential(MatchOps.java:196) at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) at java.util.stream.ReferencePipeline.anyMatch(ReferencePipeline.java:632) at com.vaadin.flow.component.internal.UIInternals.isDirty(UIInternals.java:1268) at com.vaadin.flow.component.UI.push(UI.java:757) at com.vaadin.flow.server.VaadinSession.unlock(VaadinSession.java:769) at com.vaadin.flow.server.VaadinService.ensureAccessQueuePurged(VaadinService.java:2352) at com.vaadin.flow.server.VaadinService.accessSession(VaadinService.java:2319) at com.vaadin.flow.server.VaadinSession.access(VaadinSession.java:1065) at com.vaadin.flow.component.UI.access(UI.java:574) at com.vaadin.flow.component.UI.access(UI.java:557)
Stacktrace 3:
java.util.ConcurrentModificationException: null at java.util.LinkedHashMap$LinkedHashIterator.nextNode(LinkedHashMap.java:1023) at java.util.LinkedHashMap$LinkedKeyIterator.next(LinkedHashMap.java:1046) at java.lang.Iterable.forEach(Iterable.java:74) at com.vaadin.flow.component.internal.UIInternals.dumpPendingJavaScriptInvocations(UIInternals.java:645) at com.vaadin.flow.server.communication.UidlWriter.createUidl(UidlWriter.java:187) at com.vaadin.flow.server.communication.UidlRequestHandler.createUidl(UidlRequestHandler.java:185) at com.vaadin.flow.server.communication.UidlRequestHandler.writeUidl(UidlRequestHandler.java:174) at com.vaadin.flow.server.communication.UidlRequestHandler.synchronizedHandleRequest(UidlRequestHandler.java:137) at com.vaadin.flow.server.SynchronizedRequestHandler.handleRequest(SynchronizedRequestHandler.java:63) at com.vaadin.flow.server.VaadinService.handleRequest(VaadinService.java:1879) at com.vaadin.flow.server.VaadinServlet.service(VaadinServlet.java:398) at com.vaadin.cdi.CdiVaadinServlet.service(CdiVaadinServlet.java:66) at jakarta.servlet.http.HttpServlet.service(HttpServlet.java:614) at io.undertow.servlet.handlers.ServletHandler.handleRequest(ServletHandler.java:74)
Stacktrace 4:
java.util.ConcurrentModificationException: null at java.util.LinkedHashMap$LinkedHashIterator.nextNode(LinkedHashMap.java:1023) at java.util.LinkedHashMap$LinkedKeyIterator.next(LinkedHashMap.java:1046) at java.util.Spliterators$IteratorSpliterator.tryAdvance(Spliterators.java:1950) at java.util.stream.ReferencePipeline.forEachWithCancel(ReferencePipeline.java:129) at java.util.stream.AbstractPipeline.copyIntoWithCancel(AbstractPipeline.java:527) at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:513) at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499) at java.util.stream.MatchOps$MatchOp.evaluateSequential(MatchOps.java:230) at java.util.stream.MatchOps$MatchOp.evaluateSequential(MatchOps.java:196) at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) at java.util.stream.ReferencePipeline.anyMatch(ReferencePipeline.java:632) at com.vaadin.flow.component.internal.UIInternals.containsPendingJavascript(UIInternals.java:721) at com.vaadin.flow.component.UI.replaceStateIfDiffersAndNoReplacePending(UI.java:1926) at com.vaadin.flow.component.UI.browserNavigate(UI.java:1907) at com.vaadin.flow.component.ComponentEventBus.fireEventForListener(ComponentEventBus.java:244) at com.vaadin.flow.component.ComponentEventBus.handleDomEvent(ComponentEventBus.java:501) at com.vaadin.flow.component.ComponentEventBus.lambda$addDomTrigger$dd1b7957$1(ComponentEventBus.java:303) at com.vaadin.flow.internal.nodefeature.ElementListenerMap.lambda$fireEvent$2(ElementListenerMap.java:475) at java.util.ArrayList.forEach(ArrayList.java:1596) at com.vaadin.flow.internal.nodefeature.ElementListenerMap.fireEvent(ElementListenerMap.java:475) at com.vaadin.flow.server.communication.rpc.EventRpcHandler.handleNode(EventRpcHandler.java:62) at com.vaadin.flow.server.communication.rpc.AbstractRpcInvocationHandler.handle(AbstractRpcInvocationHandler.java:79) at com.vaadin.flow.server.communication.ServerRpcHandler.handleInvocationData(ServerRpcHandler.java:568) at com.vaadin.flow.server.communication.ServerRpcHandler.lambda$handleInvocations$6(ServerRpcHandler.java:549) at java.util.ArrayList.forEach(ArrayList.java:1596) at com.vaadin.flow.server.communication.ServerRpcHandler.handleInvocations(ServerRpcHandler.java:549) at com.vaadin.flow.server.communication.ServerRpcHandler.handleRpc(ServerRpcHandler.java:376) at com.vaadin.flow.server.communication.UidlRequestHandler.synchronizedHandleRequest(UidlRequestHandler.java:136) at com.vaadin.flow.server.SynchronizedRequestHandler.handleRequest(SynchronizedRequestHandler.java:63) at com.vaadin.flow.server.VaadinService.handleRequest(VaadinService.java:1879) at com.vaadin.flow.server.VaadinServlet.service(VaadinServlet.java:398) at com.vaadin.cdi.CdiVaadinServlet.service(CdiVaadinServlet.java:66) at jakarta.servlet.http.HttpServlet.service(HttpServlet.java:614)
Expected behavior
UiInternals should never throw ConcurrentModificationException.
Minimal reproducible example
Not consistently reproducible. Seems to occur when interacting with a page while a non-ui thread calls Ui.access/accessSynchronously.
Versions
- Vaadin / Flow version: 24.9.5
- Java version: 21
These even occur within the access/accessSyncronously Methods (first and second stacktrace)
The issue in 99% of all cases is that there's one thread that doesn't do proper lock that triggers the issue and then the issue is detected by another thread that does proper locking.
There is probably nothing we can do without a way of reproducing this issue since the ConcurrentModificationException works like it does for very good reasons.
But what could solve the issue would be using modern concurrent structures in the UiInternals -> Deque and so on
It would further hide the underlying problem that application logic updates UI state from multiple threads at the same time. For that to work, all components, including all add-ons and all views in the application would also have to be built with concurrency in mind.
Something as simple as layout.add(child); would become a very complex matter since there's a brief moment in time when the parent pointer and the child pointer are not in sync with each other.
We could for instance no longer guarantee that a component is still attached at the time when an attach listener is invoked because there's no guarantee that another thread hasn't already detached the component.
The framework has a checkHasLock method to check that operations that touch UI state are run with appropriate locking. This is enabled by default in development mode but in production mode for performance reasons only when the JVM has assertions enabled (i.e. run with the -ea flag). The purpose of these checks is to help the application developer realize if they're accidentally doing something that they're not supposed to do.
All the four reported stacktraces revolve around iterating the UIInternals.pendingJsInvocations collection. The content of that collection is modified from addJavaScriptInvocation which is guarded by checkHasLock and removePendingInvocation which is only indirectly guarded. The remove method can be triggered either when the owning component is detached, or when the pending invocation is completed. At least the regular path to detaching is also guarded through the check in StateTree.markAsDirty(StateNode). Completing an invocation should only happen from request handling code in the framework which I assume can only happen with appropriate locking.
What this means is that I cannot find any reasonable code path that would lead to these ConcurrentModificationExceptions cases without triggering checkHasLock. Could it be that the application has some code paths without appropriate locking that are only encountered when running in production mode? Could you either try running the JVM with -ea or with the sessionLockCheckStrategy configuration option set to throw to detect any such case?
I have tried running it with -ea but it slowed the system down to a point where I could not sufficiently test some parts of our application. I'm going to try running it with a sessionLockCkeckStrategy and see what I can find.
@Legioth I have now done some additional testing. Using the changed sessionLockCheckStrategy, I could only find one code occurrence where VaadinSession.getAttribute was called without access. But following the path, this should not lead to any changes in the pendingJsInvocations Set. I could not find any other issues that led to any additional outputs. But our error management system indicates that these errors mainly occur on systems with high loads, which I cannot replicate myself.
But I also don't quite understand why methods within Ui and VaadinSession that are known to require a lock, as indicated by the use of checkHasLock, don't force the lock themselves when they are invoked.
But I also don't quite understand why methods within Ui and VaadinSession that are known to require a lock, as indicated by the use of checkHasLock, don't force the lock themselves when they are invoked.
It's a performance optimization since checking if the lock is held requires an atomic read operation. That's why the check is by default a no-op in production but actively checking in development mode.
But following the path, this should not lead to any changes in the pendingJsInvocations Set
Then it's most likely happening through some unexpected code path without any checkHasLock(). I don't see any other reasonable way of getting more info than adding those missing checks to all places in the framework that touch pendingJsInvocations.
I added some additional checks in https://github.com/vaadin/flow/pull/22783. You can subscribe to that PR to get a notification when it's included in a release.
There's one code path without checkHasLock() that is somewhat likely to be used from outside the framework: PendingJavaScriptResult.cancelExecution(). If you want to dig deeper already before the additional checkHasLock() calls are in a release, then you could check for such invocations in application code or add-ons.
@Legioth in addition to some calls to cancelExecution, we also found some unwrapped calls to getElement().executeJs, that are only called in specific situations. If you can confirm that executeJs on any Element may cause the ConcurrentModificationException, then I think we can close this issue.
executeJs only modifies pendingJsInvocations through a StateTree.beforeClientResponse callback rather than directly when invoked. On the other hand, calling StateTree.beforeClientResponse to schedule that callback goes through checkHasLock(). A no-op checkHasLock() with the default settings in production mode can also lead to a ConcurrentModificationException but it's in that case for the StateTree.pendingExecutionNodes field that would give a different stacktrace than the ones reported here.
There's also an alternative path in case the element isn't currently attached. That one only adds an attach listener without directly touching pendingJsInvocations but also without checkHasLock() since there's not even any session to check with for a detached element.
This leaves the unguarded cancelExecution calls as the main culprit for the observed ConcurrentModificationException. The next releases for 24.8, 24.9 and 25.0 will have checkHasLock() on that code path as well to make the issue more easy to notice.
The additional checks have been released in Vaadin 24.9.6. If you could check whether the new checks would have caught the issues you had before you fixed them, then that would be useful.