camunda icon indicating copy to clipboard operation
camunda copied to clipboard

Flaky io.camunda.zeebe.client.OAuthCredentialsProviderTest.shouldCallOauthServerOnlyOnceInMultithreadMode

Open deepthidevaki opened this issue 11 months ago • 3 comments

Summary

Try to answer the following as best as possible

  • How often does the test fail?
  • Does it block your work?
  • Do we suspect that it is a real failure?

Failures

java.util.concurrent.CompletionException: io.grpc.StatusRuntimeException: UNKNOWN: Application error processing RPC
	at java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:332)
	at java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:347)
	at java.base/java.util.concurrent.CompletableFuture$BiRelay.tryFire(CompletableFuture.java:1498)
	at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
	at java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2179)
	at io.camunda.zeebe.client.impl.ZeebeClientFutureImpl.onNext(ZeebeClientFutureImpl.java:85)
	at io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onMessage(ClientCalls.java:468)
	at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1MessagesAvailable.runInternal(ClientCallImpl.java:667)
	at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1MessagesAvailable.runInContext(ClientCallImpl.java:654)
	at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
	at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
	at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: io.grpc.StatusRuntimeException: UNKNOWN: Application error processing RPC
	at io.grpc.Status.asRuntimeException(Status.java:533)
	at io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onClose(ClientCalls.java:481)
	at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:574)
	at io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:72)
	at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:742)
	at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:723)
	... 5 more

Logs

Test results for unit tests.zip

deepthidevaki avatar Feb 29 '24 08:02 deepthidevaki

ZPA-Triage:

  • error indicates the response has an unexpected status, we need to check the server side

megglos avatar Mar 04 '24 14:03 megglos

Interesting section in the logs mentioned in the issue:

21:49:57.859 [main] INFO  org.eclipse.jetty.server.AbstractConnector - Started NetworkTrafficServerConnector@7c3b7d7c{HTTP/1.1, (http/1.1, h2c)}{0.0.0.0:43469}
21:49:57.859 [main] INFO  org.eclipse.jetty.server.Server - Started Server@53466b67{STARTING}[11.0.20,sto=1000] @10212ms
WARNING: A Java agent has been loaded dynamically (/home/runner/.m2/repository/cached/releases/net/bytebuddy/byte-buddy-agent/1.14.12/byte-buddy-agent-1.14.12.jar)
WARNING: If a serviceability tool is in use, please run with -XX:+EnableDynamicAgentLoading to hide this warning
WARNING: If a serviceability tool is not in use, please run with -Djdk.instrument.traceUsage for more information
WARNING: Dynamic loading of agents will be disallowed by default in a future release
Feb 28, 2024 9:49:59 PM io.grpc.internal.SerializingExecutor run
SEVERE: Exception while executing runnable io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed@48c6165e
java.lang.ArrayIndexOutOfBoundsException: Index 1 out of bounds for length 0
	at java.base/java.util.ArrayList.add(ArrayList.java:484)
	at java.base/java.util.ArrayList.add(ArrayList.java:496)
	at io.camunda.zeebe.client.util.RecordingGatewayService.handle(RecordingGatewayService.java:620)
	at io.camunda.zeebe.client.util.RecordingGatewayService.topology(RecordingGatewayService.java:399)
	at io.camunda.zeebe.gateway.protocol.GatewayGrpc$MethodHandlers.invoke(GatewayGrpc.java:2336)
	at io.grpc.stub.ServerCalls$UnaryServerCallHandler$UnaryServerCallListener.onHalfClose(ServerCalls.java:182)
	at io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.halfClosed(ServerCallImpl.java:351)
	at io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed.runInContext(ServerImpl.java:861)
	at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
	at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
	at java.base/java.lang.Thread.run(Thread.java:1583)

21:49:59.538 [main] INFO  org.eclipse.jetty.server.Server - Stopped Server@53466b67{STOPPING}[11.0.20,sto=1000]
21:49:59.538 [main] INFO  org.eclipse.jetty.server.Server - Shutdown Server@53466b67{STOPPING}[11.0.20,sto=1000]
21:49:59.539 [main] INFO  org.eclipse.jetty.server.AbstractConnector - Stopped NetworkTrafficServerConnector@7c3b7d7

korthout avatar Mar 04 '24 14:03 korthout

ZPA Planning: We don't have enough capacity this iteration to look into this (a few flaky tests already in this iteration)

aleksander-dytko avatar Apr 29 '24 13:04 aleksander-dytko

Unassigned due to unavailability

korthout avatar May 27 '24 13:05 korthout

@koevskinikola As discussed, maybe this one is interesting as part of your next topic (REST API OAuth provider)

korthout avatar Jun 03 '24 13:06 korthout

According to the logs attached, to me, there seems to have been a concurrency issue, possibly a race condition where a request was being added to be handled in the RecordingGatewayService, in the same time another was removed in the midst of processing the adding, resulting in a different array (inside the arraylist) size.

The issue could be local only to the test, and could have been an extremely rare case that is unlikely to happen again.

If my speculation is correct, we could try to overcome it by using a BlockingDeque or a Collections.synchronizedList in place of the ArrayList in the RecordingGatewayService

mustafadagher avatar Jun 21 '24 10:06 mustafadagher