Provide a gRPC server health check
The problem
Monitor the health of a gRPC server app.
The solution
Provide a gRPC server HealthEndpoint. My blog post on this matter may be useful.
https://blog.asarkar.com/technical/grpc-kubernetes-spring/
Alternatives considered
Implement myself.
Additional context
Also note that, starting with version 2.3.0.RELEASE, Spring Boot provides liveness and readiness information under the Actuator health endpoint. See Kubernetes Probes for details.
I'm not sure what exactly you are asking for:
- Export some kind of HealthIndicator for the grpc-server to spring actuator,
- or use the HealthIndicators to populate a grpc HealthEndpoint call/service that mirrors the web endpoint,
- or use the HealthIndicators to populate the HealthStatusManager
> Export some kind of HealthIndicator for the grpc-server to spring actuator

This. I've a Kotlin implementation that I can share, which I'm sure can be easily turned into Java.
I'm not aware of any useful grpc-server availability indicator, so I'm very interested in your implementation. Can you link it here? I will decide once I have seen the implementation.
EDIT: AFAIK the default HTTP server doesn't have one either.
open class GrpcServerHealthIndicator internal constructor(private val healthChannel: ManagedChannel) : HealthIndicator {
    internal constructor(port: Int, channelBuilderClass: String) : this(newChannel(port, channelBuilderClass))

    private val healthStub: HealthGrpc.HealthBlockingStub = HealthGrpc.newBlockingStub(healthChannel)
    private var availabilityChangeEventPublishMethod: Method? = null
    private var publishAvailabilityChangeEvent: Boolean = true
    private var readinessStateRefusingTraffic: Enum<*>? = null
    private var livenessStateBroken: Enum<*>? = null

    @Autowired
    lateinit var context: ApplicationContext

    @PostConstruct
    open fun postConstruct() {
        publishAvailabilityChangeEvent = context.containsBean("livenessStateHealthIndicator") &&
            context.containsBean("readinessStateHealthIndicator")
        if (publishAvailabilityChangeEvent) {
            log.info("Kubernetes probes are enabled")
        } else {
            log.info("Kubernetes probes not enabled")
            return
        }
        try {
            val availabilityChangeEventClass =
                Class.forName("org.springframework.boot.availability.AvailabilityChangeEvent")
            val availabilityStateClass =
                Class.forName("org.springframework.boot.availability.AvailabilityState")
            availabilityChangeEventPublishMethod = ReflectionUtils.findMethod(
                availabilityChangeEventClass, "publish",
                ApplicationContext::class.java, availabilityStateClass
            )
            val readinessStateClass = Class.forName("org.springframework.boot.availability.ReadinessState")
            readinessStateRefusingTraffic = readinessStateClass.enumConstants
                .map { it as Enum<*> }
                .firstOrNull { it.name == "REFUSING_TRAFFIC" }
            val livenessStateClass = Class.forName("org.springframework.boot.availability.LivenessState")
            livenessStateBroken = livenessStateClass.enumConstants
                .map { it as Enum<*> }
                .firstOrNull { it.name == "BROKEN" }
            publishAvailabilityChangeEvent = true
        } catch (ex: ReflectiveOperationException) {
            publishAvailabilityChangeEvent = false
            log.error(ex.message, ex)
        }
    }

    @PreDestroy
    open fun preDestroy() {
        healthChannel.shutdown()
    }

    // c.f. org.springframework.boot.actuate.availability package
    override fun health(): Health {
        val request = HealthCheckRequest.getDefaultInstance()
        val builder = Health.Builder()
        try {
            val response = healthStub.check(request)
            when (response.status) {
                HealthCheckResponse.ServingStatus.SERVING -> builder.up()
                HealthCheckResponse.ServingStatus.NOT_SERVING -> builder.outOfService()
                else -> builder.down()
            }
        } catch (ex: Exception) {
            builder.down(ex)
        }
        val health = builder.build()
        if (publishAvailabilityChangeEvent) {
            if (health.status == Status.OUT_OF_SERVICE) availabilityChangeEventPublishMethod?.invoke(
                null,
                context,
                readinessStateRefusingTraffic
            )
            else if (health.status != Status.UP) availabilityChangeEventPublishMethod?.invoke(
                null,
                context,
                livenessStateBroken
            )
        }
        return health
    }

    companion object {
        private val log: Logger = LoggerFactory.getLogger(GrpcServerHealthIndicator::class.java)

        private fun newChannel(port: Int, channelBuilderClass: String): ManagedChannel {
            val forAddressMethod = ReflectionUtils.findMethod(
                Class.forName(channelBuilderClass),
                "forAddress",
                String::class.java, Int::class.java
            )
            check(forAddressMethod != null) { "Could not find NettyChannelBuilder.forAddress(String, int) method" }
            var nettyChannelBuilder = forAddressMethod.invoke(null, "localhost", port)
            val usePlaintextMethod = ReflectionUtils.findMethod(
                nettyChannelBuilder.javaClass,
                "usePlaintext"
            )
            check(usePlaintextMethod != null) { "Could not find NettyChannelBuilder.usePlaintext() method" }
            nettyChannelBuilder = usePlaintextMethod.invoke(nettyChannelBuilder)
            val buildMethod = ReflectionUtils.findMethod(
                nettyChannelBuilder.javaClass,
                "build"
            )
            check(buildMethod != null) { "Could not find NettyChannelBuilder.build() method" }
            val channel = buildMethod.invoke(nettyChannelBuilder)
            return channel as ManagedChannel
        }
    }
}
AFAICT this code creates a HealthStub that connects to its own server to query the health service. I'm not sure which HealthService is called (no imports), but the return value doesn't carry any information beyond "a call was successful".
The actual response value could (theoretically) be directly called via code without the network io.
I'll have a look at other server libraries whether they implement this kind of ping HealthIndicator.
As for publishing the availabilityChangeEvent, this is up to the custom user code. This library should not decide whether the application is broken/down/unavailable. The user however might use any existing HealthIndicator for this:
management.endpoint.health.group.liveness.include=livenessState,grpcServerHealthIndicator
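
To make the two suggestions in this comment concrete, here is a minimal sketch of an indicator that only reports a status (no availability events) and takes the serving status from a supplier instead of a network call. It is deliberately dependency-free: `ServingStatus` and `ActuatorStatus` are stand-ins for the gRPC and Spring Boot types, and `InProcessGrpcHealthCheck` and its `probe` parameter are hypothetical names, not part of this library.

```kotlin
// Stand-in for io.grpc.health.v1.HealthCheckResponse.ServingStatus (assumption: kept local so the sketch runs without gRPC).
enum class ServingStatus { UNKNOWN, SERVING, NOT_SERVING, SERVICE_UNKNOWN }

// Stand-in for the Spring Boot Actuator status codes used by the indicator above.
enum class ActuatorStatus { UP, DOWN, OUT_OF_SERVICE }

// Hypothetical indicator: asks a supplier for the serving status instead of
// opening a channel, and only reports the result (it publishes no events).
class InProcessGrpcHealthCheck(private val probe: () -> ServingStatus) {
    fun health(): ActuatorStatus =
        try {
            when (probe()) {
                ServingStatus.SERVING -> ActuatorStatus.UP
                ServingStatus.NOT_SERVING -> ActuatorStatus.OUT_OF_SERVICE
                else -> ActuatorStatus.DOWN
            }
        } catch (ex: Exception) {
            // A failing probe is reported as DOWN, mirroring the catch block of the posted indicator.
            ActuatorStatus.DOWN
        }
}

fun main() {
    println(InProcessGrpcHealthCheck { ServingStatus.SERVING }.health())               // UP
    println(InProcessGrpcHealthCheck { ServingStatus.NOT_SERVING }.health())           // OUT_OF_SERVICE
    println(InProcessGrpcHealthCheck { throw IllegalStateException("boom") }.health()) // DOWN
}
```

Whether the application counts as live or ready would then be decided by the health group configuration rather than by the indicator itself.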
> which HealthService is called

The one available via HealthStatusManager.getHealthService() and described here. I don't register it, so grpc-spring-boot-starter must be doing so.

> The actual response value could (theoretically) be directly called via code without the network io.

I'm not sure I understand how; could you elaborate?

> publishing the availabilityChangeEvent

After speaking with the Spring Boot team on Twitter, it appears that the intended design is not to update application availability from inside a HealthIndicator, but instead to add the health indicator to the liveness and/or readiness groups.
> which HealthService is called
>
> The one available via HealthStatusManager.getHealthService() and described here. I don't register it, so grpc-spring-boot-starter must be doing so.
>
> The actual response value could (theoretically) be directly called via code without the network io.
>
> I'm not sure I understand how, could you elaborate?

There is a bean for that. Fun fact: that bean is never populated with actual health data unless you do it yourself, as you would for Spring (aside from some trivial startup and shutdown states).
The bean/service might be removed in a future release, because it doesn't add any value by itself (or I might replace it with a bridge to Actuator).
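
As a rough illustration of populating that bean yourself, the sketch below maps Actuator status codes to gRPC serving statuses and pushes them into a registry. It is dependency-free on purpose: `ServingStatus` is a stand-in for the gRPC enum, `StatusRegistry` is a hypothetical stand-in for the `setStatus(service, status)` call of `HealthStatusManager` (`statusOf` exists only so the demo can print something), and the mapping policy in `toServingStatus` is an assumption, not something this library prescribes.

```kotlin
// Stand-in for io.grpc.health.v1.HealthCheckResponse.ServingStatus (assumption: kept local so the sketch runs without gRPC).
enum class ServingStatus { UNKNOWN, SERVING, NOT_SERVING, SERVICE_UNKNOWN }

// Hypothetical stand-in for HealthStatusManager; only setStatus mirrors the real call.
class StatusRegistry {
    private val statuses = mutableMapOf<String, ServingStatus>()

    fun setStatus(service: String, status: ServingStatus) {
        statuses[service] = status
    }

    // Demo-only accessor; services that were never set are reported as SERVICE_UNKNOWN.
    fun statusOf(service: String): ServingStatus = statuses[service] ?: ServingStatus.SERVICE_UNKNOWN
}

// Assumed mapping policy from Actuator status codes to gRPC serving statuses.
fun toServingStatus(actuatorStatusCode: String): ServingStatus = when (actuatorStatusCode) {
    "UP" -> ServingStatus.SERVING
    "DOWN", "OUT_OF_SERVICE" -> ServingStatus.NOT_SERVING
    else -> ServingStatus.UNKNOWN
}

// Copies a snapshot of Actuator statuses (service name -> status code) into the registry.
fun bridge(actuatorStatuses: Map<String, String>, registry: StatusRegistry) {
    for ((service, code) in actuatorStatuses) {
        registry.setStatus(service, toServingStatus(code))
    }
}

fun main() {
    val registry = StatusRegistry()
    // "" is the overall server health in the gRPC health protocol; "my.Service" is a made-up service name.
    bridge(mapOf("" to "UP", "my.Service" to "OUT_OF_SERVICE"), registry)
    println(registry.statusOf(""))           // SERVING
    println(registry.statusOf("my.Service")) // NOT_SERVING
    println(registry.statusOf("other"))      // SERVICE_UNKNOWN
}
```

In a real application the snapshot would have to be refreshed, for example on a schedule, since the gRPC health service only returns what was last set.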
So is there anything left to do here?
> So is there anything left to do here?

Didn't you say you are going to look into adding a HealthIndicator as shown?
Yes, I did. Sorry, you are right.
TODO: Check whether the web server has a "self ping" health indicator and implement one for grpc as well.
> There is a bean for that.

That I noticed, but the HealthStatusManager only holds a reference to the Health service; it is not the Health service itself. When I tried to register the Health service, I got a duplicate service error. There's some code that registers the Health service, and it's not the HealthStatusManager.
Just to be clear, I'm happy someone else did it for me, but I'm just curious where it's done.
https://github.com/yidongnan/grpc-spring-boot-starter/blob/master/grpc-server-spring-boot-autoconfigure/src/main/java/net/devh/boot/grpc/server/serverfactory/AbstractGrpcServerFactory.java#L115