grpc-java
grpc-java copied to clipboard
Unavailable in grpc queries when android app goes to background
Hello! I have a problem with grpc queries in kotlin app when my app goes to background. I have a next scheme of relation between my app and backend - when backend needs to wake up a phone, I'm doing next things:
- Sending a push from the backend to the phone through the Firebase
- In my app on the phone I'm listening events from firebase.
- When I'm getting the push from the firebase, I'm creating
CoroutineWorker
- Inside of CoroutineWorker I'm going to my backend via grpc
Firebase listener:
class FCMListenerService : FirebaseMessagingService() {
override fun onMessageReceived(remoteMessage: RemoteMessage) {
Log.d(TAG, "Got push message: ${remoteMessage.rawData.toString()}, priority ${remoteMessage.priority}")
val constraints = Constraints.Builder()
.setRequiredNetworkType(NetworkType.CONNECTED)
.build()
val workRequest = OneTimeWorkRequestBuilder<ExecuteNextOperationWorker>().setConstraints(constraints).build()
WorkManager.getInstance(applicationContext).enqueue(workRequest)
}
Coroutine worker:
...
override suspend fun doWork(): Result {
val channel = OkHttpChannelBuilder
.forAddress(config.host, config.port)
.sslSocketFactory(deviceSslProvider.getSSLSocketFactory())
.useTransportSecurity()
.intercept(GrpcLoggerInterceptor())
.defaultServiceConfig(getServiceConfig())
.enableRetry()
.build()
val myService = MyServiceGrpcKt.MyServiceCoroutineStub(channel)
val myRequest = MyServiceOuterClass.MyRequest.build()
return myService.myOperation(myRequest)
}
It works fine, when app works in foreground, but when I'm locking the phone, after few minutes, this grpc queries are not passing to the backend - I'm getting an error:
2024-03-15 11:06:10.098 29021-5531 io.grpc.StatusException: UNAVAILABLE: Unable to resolve host my-backend.com
Also, I added http queries near before grpc queries - http queries work fine even in background mode. Seems like the problem just with grpc queries.
I thought that problem is related with https://github.com/grpc/grpc-java/issues/8860, but I've updated grpc to the last version and the problem is still reproducing.
My build.gradle file:
implementation("io.grpc:grpc-okhttp:1.62.2")
implementation("io.grpc:grpc-protobuf:1.62.2")
implementation("io.grpc:grpc-stub:1.62.2")
The DNS name resolver will usually produce a stacktrace with a Cause that may help direct you.
To rule out HTTP request having DNS cached, could you please test the same flow, but once you create gRPC Channel, call ManagedChannel.getState(true)
to force gRPC channel to connect? We suspect this issue could be related to gRPC lazy-loading the channel and not caching DNS.
Hello!
I added some logs and noticed that after phone inactivity chanel comes to TRANSIENT_FAILURE
state, so I added some logic before my requests:
val currentChannelState = channel.getState(false)
if (currentChannelState == ConnectivityState.TRANSIENT_FAILURE || currentChannelState == ConnectivityState.SHUTDOWN) {
Log.d(TAG, "Channel is in $currentChannelState switching to Idle")
channel.enterIdle()
}
Now it works correct
Also, during the debugging of this problem, I've noticed, that retries are not working. Here is my config:
.defaultServiceConfig(
mapOf(
"methodConfig" to listOf(
mapOf(
"retryPolicy" to mapOf(
"maxAttempts" to "5",
"initialBackoff" to "0.5s",
"maxBackoff" to "10s",
"backoffMultiplier" to 1.2,
"retryableStatusCodes" to listOf("UNAVAILABLE")
)
)
)
)
)
.maxRetryAttempts(5)
.enableRetry()
I see in logs just one request:
2024-03-22 14:56:41.689 22515-22562 GrpcLogger com.test.app D method -> test.app.v0.TestService/test, message -> ...
And after that, I'm getting an exception with Unavailbale
error. There is no evidence of any retries in the logs.
Could you please tell, what am I doing wrong?
UPD
Also, I tried to pass the name of my service I need to retry, it didn't work :(
methodConfig" to listOf(
mapOf(
"name" to listOf(mapOf("service" to "test.app.v0.TestService")),
"retryPolicy" to mapOf(
"maxAttempts" to "5",
...
)
)
Finally, I've passed service
and method
to the name
field of the config and now retires are working for passed method. This example was very helpful https://github.com/grpc/grpc-java/tree/master/examples/src/main/java/io/grpc/examples/retrying (btw, I found some problem with bazel for this example https://github.com/grpc/grpc-java/pull/11037)
But I expected, retries would be enabled for all methods, if specific name of the method was not passed. Also, I expected, that retries would be enabled for all services if specific name of the service was not passed.
Could you please tell what is correct behavior in theese cases?
UPD
Okay, according to the methodConfig
title of the config field, seems like behavior is correct :)
But maybe there is a way to apply retries for all services/methods of the service?
UPD According to this doc, retries should be applied to all method of the sevice, if I've passed just service name https://github.com/grpc/grpc/blob/master/doc/service_config.md#example and actually it works :) But still have no idea, how to pass retry policy for all services. And is it possible to see info about retries in logs interceptor? Now I see just info about first request Here is the interceptor:
class GrpcLoggerInterceptor(): ClientInterceptor {
override fun <ReqT : kotlin.Any?, RespT : kotlin.Any?> interceptCall(
method: MethodDescriptor<ReqT, RespT>?,
callOptions: CallOptions?,
next: Channel?
): ClientCall<ReqT, RespT> =
object : ForwardingClientCall.SimpleForwardingClientCall<ReqT, RespT>(
next?.newCall(method, callOptions)
) {
override fun sendMessage(message: ReqT) {
Log.d(TAG, "method -> ${method?.fullMethodName}, message -> $message")
super.sendMessage(message)
}
}
}
btw, initial problem may be related with android dns cache, as you supposed above. The whole error looks like:
Status{code=UNAVAILABLE, description=Unable to resolve host my-awesome-host.com, cause=java.lang.RuntimeException: java.net.UnknownHostException: Unable to resolve host "my-awesome-host.com": No address associated with hostname
at io.grpc.internal.DnsNameResolver.resolveAddresses(DnsNameResolver.java:223)
at io.grpc.internal.DnsNameResolver.doResolve(DnsNameResolver.java:282)
at io.grpc.internal.DnsNameResolver$Resolve.run(DnsNameResolver.java:318)
And unfortunatelly, retries are not working if this error was thrown. It can be reporduced if you'll pass non-existent host to your retry example https://github.com/grpc/grpc-java/blob/master/examples/src/main/java/io/grpc/examples/retrying/RetryingHelloWorldClient.java#L130
A note on your service config - in your earlier comment you had a service config that you wanted to apply to all calls, but it was missing the required "name" field. But your other example config did look correct so I suppose in at least some of your testing retry was configured correctly. You can check out the service config user guide for an example config.
But more saliently, it looks like you have uncovered a bug in how the default service config is applied. I created issue #11040 to track the fix for that. Note that this does not seem like an Android specific issue.
@temawi, thanks, my config really was correct, but I suppose retries didn't work because of the problem I described here. Could you please check it? I added notes how to reproduce this case. Seems like retries are not working if we are getting an exception from dns resolver (or any other exceptions, I suppose) – is it expected behaviour? Is there any way to retry this case?
Unfortunately in situations where the first DNS lookup fails the default config will not be honored. That is a bug in gRPC that needs to get fixed as part of #11040.
Until then, what I can think of as a workaround would be to create your own NameResolver
by extending the DnsNameResolver
class and have it always return the service config you need. I think you can get by with just wrapping the doResolve()
method.
The Custom Name Resolvers user guide gives some context on this and has a link to an example custom name resolver implementation.
Did you have any other issues here besides the service config not getting applied?
@temawi, thanks, I'll try to use custom resolver
Did you have any other issues here besides the service config not getting applied?
Unfortunately, I did :(
In my android app I'm making some grpc calls right after establishing wifi connection. I'm using network callback to know when connection was established. And than I'm facing with a problem: channel stucks in CONNECTING
state. I tried to add several calls with backoff – channel still being in CONNECTING
state:
12:39:46.296 32366-32366 OBSERVER com.test.app D [type: WIFI[], state: CONNECTED/CONNECTED, reason: (unspecified), extra: , failover: false, available: true, roaming: false]
12:39:46.310 32366-32435 TestApi com.test.app D Channel is in IDLE before making request
12:39:46.310 32366-32435 TestApi com.test.app D GONNA MAKE GRPC CALL
12:39:46.403 32366-32437 GrpcLogger com.test.app D method -> test.app.v0.TestSerivce/Method
12:39:46.404 32366-32437 ErrorHandler com.test.app D Error io.grpc.StatusException: DEADLINE_EXCEEDED: ClientCall started after CallOptions deadline was exceeded. Deadline has been exceeded for 14.029396739s.
12:39:49.304 32366-32437 TestApi com.test.app D Channel is in CONNECTING before making request
12:39:49.304 32366-32437 TestApi com.test.app D GONNA MAKE GRPC CALL
12:39:49.314 32366-32436 GrpcLogger com.test.app D method -> test.app.v0.TestSerivce/Method
12:39:49.316 32366-32436 ErrorHandler com.test.app D Error io.grpc.StatusException: DEADLINE_EXCEEDED: ClientCall started after CallOptions deadline was exceeded -16,994016358 seconds ago. Name resolution delay 0,000000000 seconds.
12:39:52.314 32366-32435 TestApi com.test.app D Channel is in CONNECTING before making request
12:39:52.314 32366-32435 TestApi com.test.app D GONNA MAKE GRPC CALL
12:39:52.322 32366-32435 ErrorHandler com.test.app D Error io.grpc.StatusException: DEADLINE_EXCEEDED: ClientCall started after CallOptions deadline was exceeded -19,999490615 seconds ago. Name resolution delay 0,000000000 seconds.
If I add Thread.sleep(2000)
before making first grpc call – everything works correct. The problem is reproducing only If we are making call right after wifi connection has been etablished.
Here is the full example https://gist.github.com/rostik404/e920f25cc34ec11dde08d030ed4daea4
Also, I added deadlineAfter
to my client. Withot adding dedlineAfter
it stuckes in CONNECTING
state on first call
Just a note that I haven't forgotten about this issue - but I'm not sure how to help troubleshoot it further. This could even be an Android issue where that notification comes in a bit before the network stack is fully ready (but that's just speculation). I take it you are still facing this issue?
Hello! Thank you for continuing to help with this issue :)
This could even be an Android issue where that notification comes in a bit before the network stack is fully ready
Maybe, but HTTP calls seem to be fine. It seems like I'm having a problem only with gRPC calls. Additionally, it is interesting that retries are not helpful if the first attempt was not successful..
We noticed one thing when reviewing your question - you are using OkHttpChannelBuilder
. Can you try using AndroidChannelBuilder
instead? AndroidChannelBuilder
has code to react to the network becoming available and it's likely you are suffering from not having that behavior enabled.