aws-mobile-appsync-sdk-android
How to handle java.lang.IllegalStateException at RealAppSyncCall.java:383
@cbommas Hello.
We've caught an exception very similar to issue #65, with a similar stack trace, when executing a lot of mutations, but on AppSync 2.7.8:
Fatal Exception: java.lang.IllegalStateException: Expected: TERMINATED, but found [ACTIVE, CANCELED]
at com.apollographql.apollo.internal.RealAppSyncCall.responseCallback(RealAppSyncCall.java:383)
at com.apollographql.apollo.internal.RealAppSyncCall.access$000(RealAppSyncCall.java:72)
at com.apollographql.apollo.internal.RealAppSyncCall$1.onResponse(RealAppSyncCall.java:272)
at com.amazonaws.mobileconnectors.appsync.InterceptorCallback.onResponse(InterceptorCallback.java:127)
at com.apollographql.apollo.internal.interceptor.ApolloCacheInterceptor$1$1.onResponse(ApolloCacheInterceptor.java:102)
at com.apollographql.apollo.internal.interceptor.ApolloParseInterceptor$1.onResponse(ApolloParseInterceptor.java:84)
at com.apollographql.apollo.internal.interceptor.ApolloServerInterceptor$1$1.onResponse(ApolloServerInterceptor.java:110)
at okhttp3.RealCall$AsyncCall.execute(RealCall.java:206)
at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1167)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:641)
at java.lang.Thread.run(Thread.java:764)
@palpatim @desokroshan any updates?
@DeMoss15 Thanks for reporting the issue. Can you please share some specific reproduction steps, such as a relevant code snippet and the average number of mutations you are running? Is there any other information that you think might help reproduce this issue?
Hi @desokroshan
Our app is facing the same issue. We can't reproduce it but got that crash after our app rolled out to users.
Here is what I found in the AwsAppSync code:
RealAppSyncCall.java has this method, responseCallback():
`private synchronized Optional<Callback<T>> responseCallback() {
  switch (state.get()) {
    case ACTIVE:
    case CANCELED:
      return Optional.fromNullable(originalCallback.get());
    case IDLE:
    case TERMINATED:
      throw new IllegalStateException(
          CallState.IllegalStateMessage.forCurrentState(state.get()).expected(ACTIVE, CANCELED));
    default:
      throw new IllegalStateException("Unknown state");
  }
}`
As you can see, if the state is TERMINATED (or IDLE), it throws an IllegalStateException whose message is built from "CallState.IllegalStateMessage.forCurrentState(state.get()).expected(ACTIVE, CANCELED)".
The above method is called in two places: the onResponse() and onFetch() callbacks in the interceptorCallbackProxy() method. Neither onResponse() nor onFetch() catches the exception thrown by responseCallback(), so if the state is somehow TERMINATED, the app crashes. The onFailure() callback has the same problem: the library doesn't catch the exception when calling terminate().
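To make the failure mode above concrete, here is a minimal sketch of the state check. It is not the library's code: `SketchCall`, `checkCanDeliver()` and the exception message are hypothetical, and it assumes the call moves to TERMINATED once its first result has been delivered. Under that assumption, any second delivery to the same call hits the TERMINATED branch and throws on the calling thread.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical, simplified model of the call state machine; not the SDK's code.
class SketchCall {
    enum State { IDLE, ACTIVE, TERMINATED, CANCELED }

    private final AtomicReference<State> state = new AtomicReference<>(State.IDLE);

    // IDLE -> ACTIVE, allowed only once per call.
    void enqueue() {
        if (!state.compareAndSet(State.IDLE, State.ACTIVE)) {
            throw new IllegalStateException("Already executed");
        }
    }

    // Same shape as responseCallback(): only ACTIVE and CANCELED are accepted.
    private void checkCanDeliver() {
        switch (state.get()) {
            case ACTIVE:
            case CANCELED:
                return;
            case IDLE:
            case TERMINATED:
                throw new IllegalStateException("Unexpected state: " + state.get());
            default:
                throw new IllegalStateException("Unknown state");
        }
    }

    // Assumption: delivering the result terminates the call.
    synchronized void onResponse() {
        checkCanDeliver();
        state.set(State.TERMINATED);
    }

    State state() {
        return state.get();
    }
}
```

Calling `onResponse()` once after `enqueue()` leaves the sketch in TERMINATED; calling it a second time throws IllegalStateException, and nothing between the network thread and the caller catches it.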
@desokroshan is it possible to catch this exception and pass it to the user through the callback override fun onFailure(@Nonnull e: ApolloException) when AppSyncQueryCall().enqueue() is triggered?
Hi @desokroshan @palpatim
We're seeing this issue on a daily basis for our Android users. Unfortunately, I cannot provide a straightforward way to reproduce it, but we'll try to provide you with as much detail as we can.
The curious detail is that, in 100% of the cases we've collected through Firebase Crashlytics, the exception is thrown only on Samsung devices running Android 11 while the app is in the background. We've seen this crash 79 times across 37 users in a beta test group of around 2k users.
We're planning to roll out this app to a much larger user base. This is a persistent issue we have had with AppSync and still see. We've already tried switching back to several older versions and tried the new v3.1.4 as well, but the problem is present on all of them.
Since the problem occurs on Samsung devices running Android 11 while the app is in the background, I'd recommend that you read https://dontkillmyapp.com/samsung, as the cause may be that the OS is limiting the background work the app can do.
If you could explain a bit further why we would be seeing this error, I could offer more insight into why this is happening.
We use several mutations (20+?) and queries as well.
Hello. Is there any update on this? I'm facing the same issue.
same here
Fatal Exception: java.lang.IllegalStateException: Expected: TERMINATED, but found [ACTIVE, CANCELED]
at com.apollographql.apollo.internal.RealAppSyncCall.responseCallback(RealAppSyncCall.java:373)
at com.apollographql.apollo.internal.RealAppSyncCall.access$000(RealAppSyncCall.java:62)
at com.apollographql.apollo.internal.RealAppSyncCall$1.onResponse(RealAppSyncCall.java:262)
at com.amazonaws.mobileconnectors.appsync.InterceptorCallback.onResponse(AppSyncOfflineMutationInterceptor.java:117)
at com.apollographql.apollo.internal.interceptor.ApolloCacheInterceptor$1$1.onResponse(ApolloCacheInterceptor.java:92)
at com.apollographql.apollo.internal.interceptor.ApolloParseInterceptor$1.onResponse(ApolloParseInterceptor.java:71)
at com.apollographql.apollo.internal.interceptor.ApolloServerInterceptor$1$1.onResponse(ApolloServerInterceptor.java:100)
at okhttp3.internal.connection.RealCall$AsyncCall.run(RealCall.kt:519)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1137)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:637)
at java.lang.Thread.run(Thread.java:1012)
This occurs for us too, and often enough during our testing.
- We trigger AppSync events at least twice every second
- We changed the value of mutationQueueExecutionTimeout to 10 seconds (AWSAppSyncClient.builder()..mutationQueueExecutionTimeout(10000))
We saw the following crash occurring intermittently:
10-22 08:10:33.485 20188 22030 E AndroidRuntime: java.lang.IllegalStateException: Expected: TERMINATED, but found [ACTIVE, CANCELED]
10-22 08:10:33.485 20188 22030 E AndroidRuntime: at com.apollographql.apollo.internal.g.a(RealAppSyncCall.java:11)
10-22 08:10:33.485 20188 22030 E AndroidRuntime: at com.apollographql.apollo.internal.f.onResponse(RealAppSyncCall.java:1)
10-22 08:10:33.485 20188 22030 E AndroidRuntime: at com.amazonaws.mobileconnectors.appsync.InterceptorCallback.onResponse(AppSyncOfflineMutationInterceptor.java:18)
10-22 08:10:33.485 20188 22030 E AndroidRuntime: at com.apollographql.apollo.internal.interceptor.a$a$a.onResponse(ApolloCacheInterceptor.java:8)
10-22 08:10:33.485 20188 22030 E AndroidRuntime: at com.apollographql.apollo.internal.interceptor.e$a.onResponse(ApolloParseInterceptor.java:6)
10-22 08:10:33.485 20188 22030 E AndroidRuntime: at com.apollographql.apollo.internal.interceptor.f$a$a.onResponse(ApolloServerInterceptor.java:4)
10-22 08:10:33.485 20188 22030 E AndroidRuntime: at okhttp3.internal.connection.g$a.run(RealCall.kt:12)
10-22 08:10:33.485 20188 22030 E AndroidRuntime: at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1167)
10-22 08:10:33.485 20188 22030 E AndroidRuntime: at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:641)
10-22 08:10:33.485 20188 22030 E AndroidRuntime: at java.lang.Thread.run(Thread.java:920)
We also observed the crash occurring more often when we set lower values of mutationQueueExecutionTimeout.
Hi @desokroshan, while doing a code review of version 3.3.2, I found certain race conditions where two threads read from the queue and handle the same mutation in parallel.
This seems to be tied to setting mutationQueueExecutionTimeout to a low value like 200ms and triggering 2 mutations per second.
I can reproduce it very consistently with the above configuration. I suspect that when mutationQueueExecutionTimeout expires, the previous mutation is not cancelled completely, so it continues executing while the same mutation is restarted in a new thread (after being read again from the top of either the persistent or the in-memory queue). This seems to cause the above exception.
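The suspected sequence can be modelled deterministically, without real threads, to show why the second completion fails. This is only a sketch of the hypothesis above, not the SDK's queue: `TimeoutRaceSketch`, `Mutation`, `deliverResponse()` and `run()` are all made-up names, and the timeout is represented by simply reading the head of the queue a second time without removing or cancelling the first attempt.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical model of the suspected race; none of these names come from the SDK.
class TimeoutRaceSketch {

    // A queued mutation whose result may be delivered only once.
    static final class Mutation {
        final String id;
        private final AtomicBoolean terminated = new AtomicBoolean(false);

        Mutation(String id) {
            this.id = id;
        }

        // Throws on a second delivery, like the TERMINATED branch discussed earlier.
        void deliverResponse() {
            if (!terminated.compareAndSet(false, true)) {
                throw new IllegalStateException("Response delivered twice for " + id);
            }
        }
    }

    // Attempt 1 times out but is not cancelled; the head of the queue is read
    // again and dispatched as attempt 2; then both attempts complete.
    static List<String> run() {
        Deque<Mutation> queue = new ArrayDeque<>();
        queue.add(new Mutation("m1"));
        List<String> log = new ArrayList<>();

        Mutation firstAttempt = queue.peek();  // dispatched, stays at the head
        // The timeout elapses here; the in-flight attempt keeps running.
        Mutation secondAttempt = queue.peek(); // the same mutation is read again

        for (Mutation attempt : Arrays.asList(firstAttempt, secondAttempt)) {
            try {
                attempt.deliverResponse();
                log.add("delivered");
            } catch (IllegalStateException e) {
                log.add("crash: " + e.getMessage());
            }
        }
        return log;
    }
}
```

In this model the first completion is delivered normally and the second one throws. If the hypothesis is right, a lower timeout makes the overlap more likely, which would match the reports that the crash is more frequent with smaller mutationQueueExecutionTimeout values.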