Health indicators based on Service Level Objectives

Open jkschneider opened this issue 4 years ago • 6 comments

This feature adds commonly requested functionality: letting an application aggregate a set of metrics-based key performance indicators down to a health indicator.

I fully expect changes, probably significant ones, as feedback comes in, but I want to offer this up early in the 2.4.0 release cycle so we have time to iterate and to dogfood any autoconfigured service level objectives.

Some indicators are known to be broadly applicable to a wide range of Java applications, and those could be autoconfigured. An example of a set of such indicators is defined here and autoconfigured by this pull request (JvmServiceLevelObjectives.MEMORY).

In many cases, users would like a load balancer to avoid instances that are failing a key performance indicator, which they can achieve by configuring an HTTP health check on the load balancer. In fact, some applications may already be doing this with the health indicators that Spring Boot or users already provide. Example platform load balancer configuration that can be pointed to /actuator/health:

metadata:
  name: instance-reported-utilization
  annotations:
    service.beta.kubernetes.io/do-loadbalancer-healthcheck-port: "80"
    service.beta.kubernetes.io/do-loadbalancer-healthcheck-protocol: "http"
    service.beta.kubernetes.io/do-loadbalancer-healthcheck-path: "/actuator/health"

See https://github.com/micrometer-metrics/micrometer/issues/2055 for more detail.

The HealthMeterRegistry

As of 1.6.0, Micrometer has a new registry implementation: micrometer-registry-health. Autoconfiguration for this new implementation has been added to spring-boot-actuator-autoconfigure.

Any @Bean ServiceLevelObjective is configured onto the HealthMeterRegistry and bound as a Spring Boot HealthIndicator.

What it looks like in /actuator/health

image

About ServiceLevelObjective

Service level objectives broadly have the following capabilities:

  • Are defined as a single-indicator or multi-indicator test against a set of time series registered to the HealthMeterRegistry.
  • Can declare the MeterBinder instances that supply the measurements they need to determine availability.
  • Contain a filterable and transformable name and tag set, which are mapped to the Spring Boot bean name and the Health#details map, respectively.
  • Optionally contain a readable base unit that is mapped to health details.
  • Can pretty-print values and thresholds for human-readable interpretation of an SLO at some instant.
  • Can be defined to look back and aggregate over a time window in different ways.
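
To make the last capability concrete, here is a minimal plain-Java sketch of a single-indicator objective that looks back over recent samples and aggregates them in one of two ways before testing against a threshold. It does not use the Micrometer API: LookbackSloSketch, its Aggregation enum, and the count-based window (standing in for a time window) are all hypothetical names and simplifications chosen for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical illustration only; not the Micrometer ServiceLevelObjective type.
final class LookbackSloSketch {

    // Two example ways of aggregating the samples in the lookback window.
    enum Aggregation { MAX, MEAN }

    private final String name;
    private final double threshold;
    private final int windowSize;
    private final Aggregation aggregation;
    private final Deque<Double> samples = new ArrayDeque<>();

    LookbackSloSketch(String name, double threshold, int windowSize, Aggregation aggregation) {
        this.name = name;
        this.threshold = threshold;
        this.windowSize = windowSize;
        this.aggregation = aggregation;
    }

    String name() {
        return name;
    }

    // Record a new measurement and drop the oldest one once the window is full.
    void record(double value) {
        samples.addLast(value);
        if (samples.size() > windowSize) {
            samples.removeFirst();
        }
    }

    // Aggregate the window; an empty window is treated as 0.0 (an assumption).
    double aggregate() {
        if (aggregation == Aggregation.MAX) {
            return samples.stream().mapToDouble(Double::doubleValue).max().orElse(0.0);
        }
        return samples.stream().mapToDouble(Double::doubleValue).average().orElse(0.0);
    }

    // The objective is met while the aggregated value stays strictly below the threshold.
    boolean healthy() {
        return aggregate() < threshold;
    }

    public static void main(String[] args) {
        LookbackSloSketch slo = new LookbackSloSketch("jvm.memory", 0.9, 3, Aggregation.MAX);
        slo.record(0.5);
        slo.record(0.95);
        slo.record(0.6);
        System.out.println(slo.name() + " healthy: " + slo.healthy());
    }
}
```

With MAX aggregation, one spike inside the window fails the objective until it ages out. MEAN smooths over short spikes, which may or may not be what a load balancer health check should see.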

API error ratio property-driven configuration

management.metrics.export.health.api-error-budgets.api.customer=0.01
management.metrics.export.health.api-error-budgets.admin=0.02

The above properties result in two service level objective health indicators called apiErrorRatioApiCustomer and apiErrorRatioAdmin. Each checks that the ratio of requests with a SERVER_ERROR outcome to total throughput stays below its budget: 1% for requests to paths starting with /api/customer, and 2% for requests to paths starting with /admin.
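
As a rough illustration of what each of these indicators evaluates, the following plain-Java sketch derives the indicator name and path prefix from a property key and applies the ratio test. It is not the autoconfiguration code from this pull request: ApiErrorBudgetSketch and its methods are hypothetical, the dot-to-slash mapping is inferred from the two examples above, and treating zero traffic as within budget is an assumption.

```java
// Hypothetical illustration only; not the code added by this pull request.
final class ApiErrorBudgetSketch {

    // "api.customer" -> "apiErrorRatioApiCustomer" (naming inferred from the examples above).
    static String indicatorName(String budgetKey) {
        StringBuilder name = new StringBuilder("apiErrorRatio");
        for (String part : budgetKey.split("\\.")) {
            if (part.isEmpty()) {
                continue;
            }
            name.append(Character.toUpperCase(part.charAt(0))).append(part.substring(1));
        }
        return name.toString();
    }

    // "api.customer" -> "/api/customer" (mapping inferred from the examples above).
    static String pathPrefix(String budgetKey) {
        return "/" + budgetKey.replace('.', '/');
    }

    // True when SERVER_ERROR requests divided by total requests is strictly below the budget.
    // No traffic at all is treated as within budget (an assumption).
    static boolean withinBudget(long serverErrors, long totalRequests, double budget) {
        if (totalRequests == 0) {
            return true;
        }
        return (double) serverErrors / totalRequests < budget;
    }

    public static void main(String[] args) {
        String key = "api.customer";
        System.out.println(indicatorName(key) + " watches " + pathPrefix(key));
        System.out.println("5 errors in 1000 requests within 1% budget: " + withinBudget(5, 1000, 0.01));
        System.out.println("20 errors in 1000 requests within 1% budget: " + withinBudget(20, 1000, 0.01));
    }
}
```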

jkschneider, May 04 '20 22:05