feat: add HTTP health check API for Kubernetes integration

Open solisamicus opened this issue 2 months ago • 3 comments

Add comprehensive HTTP health check support for liveness and readiness probes to enable seamless Kubernetes deployment and container orchestration.

Background

Kubernetes uses health probes, most commonly HTTP endpoints, to manage the container lifecycle:

  • Liveness probes: detect when a container should be restarted
  • Readiness probes: control whether traffic is routed to an instance

This implementation adds an extensible health check system that plugs into dubbo-go's existing architecture and lets users define custom health logic.

Implementation

Core Components

  • HealthCheckConfig: YAML-configurable settings for ports, paths, and timeouts
  • HealthChecker Interface: Extensible interface for user-defined health logic
  • HealthCheckServer: HTTP server exposing /health/live and /health/ready endpoints
  • Built-in Checkers: Ready-to-use implementations for common scenarios
  • Global Registry: Thread-safe system for managing health checkers
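
The HealthChecker interface and the thread-safe registry listed above could look roughly like the sketch below. The three interface methods match the usage example in this description; the `Registry` type, its `Register` and `Names` methods, and the `alwaysUp` helper are hypothetical names used only for illustration, not the PR's actual API.

```go
package main

import (
	"context"
	"fmt"
	"sort"
	"sync"
)

// HealthChecker mirrors the three methods used in this PR's usage example.
type HealthChecker interface {
	CheckLiveness(ctx context.Context) bool
	CheckReadiness(ctx context.Context) bool
	Name() string
}

// Registry is a hypothetical concurrency-safe store keyed by checker name.
type Registry struct {
	mu       sync.RWMutex
	checkers map[string]HealthChecker
}

// NewRegistry returns an empty registry.
func NewRegistry() *Registry {
	return &Registry{checkers: make(map[string]HealthChecker)}
}

// Register adds or replaces the checker stored under its Name().
func (r *Registry) Register(c HealthChecker) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.checkers[c.Name()] = c
}

// Names returns the registered checker names in sorted order.
func (r *Registry) Names() []string {
	r.mu.RLock()
	defer r.mu.RUnlock()
	names := make([]string, 0, len(r.checkers))
	for n := range r.checkers {
		names = append(names, n)
	}
	sort.Strings(names)
	return names
}

// alwaysUp is a trivial checker used only for this demonstration.
type alwaysUp struct{ name string }

func (a alwaysUp) CheckLiveness(context.Context) bool  { return true }
func (a alwaysUp) CheckReadiness(context.Context) bool { return true }
func (a alwaysUp) Name() string                        { return a.name }

func main() {
	reg := NewRegistry()
	var wg sync.WaitGroup
	// Register from several goroutines to exercise the mutex.
	for _, n := range []string{"db", "cache", "dubbo"} {
		wg.Add(1)
		go func(name string) {
			defer wg.Done()
			reg.Register(alwaysUp{name: name})
		}(n)
	}
	wg.Wait()
	fmt.Println(reg.Names()) // prints "[cache db dubbo]"
}
```

A read-write mutex keeps probe-time reads cheap while still serializing registration; whether the PR uses the same locking strategy is not stated here.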

Key Features

  • Configurable Endpoints: Customizable ports and paths via configuration
  • Extensible Logic: Users can implement HealthChecker interface for custom checks
  • Composite Support: Combine multiple health checkers with AND logic
  • Detailed Responses: JSON responses with status, timestamp, and detailed information
  • Timeout Protection: Configurable timeouts prevent hanging health checks
  • Graceful Integration: Seamless integration with dubbo-go lifecycle management
  • Thread Safety: Concurrent-safe health checker registry
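
To make the AND composition and the timeout protection concrete, here is a minimal sketch. It assumes readiness checks can be modeled as plain functions; `ReadinessFunc`, `allReady`, `withTimeout`, and `runDemo` are illustrative names and are not taken from the PR's CompositeHealthChecker or Timeout checker.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// ReadinessFunc is a hypothetical function form of a readiness check.
type ReadinessFunc func(ctx context.Context) bool

// allReady returns true only if every check reports ready (AND logic).
func allReady(ctx context.Context, checks ...ReadinessFunc) bool {
	for _, check := range checks {
		if !check(ctx) {
			return false
		}
	}
	return true
}

// withTimeout wraps check so that it reports false if it has not
// finished within d.
func withTimeout(check ReadinessFunc, d time.Duration) ReadinessFunc {
	return func(ctx context.Context) bool {
		ctx, cancel := context.WithTimeout(ctx, d)
		defer cancel()
		result := make(chan bool, 1) // buffered so the goroutine never blocks on send
		go func() { result <- check(ctx) }()
		select {
		case ok := <-result:
			return ok
		case <-ctx.Done():
			return false
		}
	}
}

// runDemo evaluates two compositions: one with only fast checks and one
// that includes a slow check guarded by a 20ms timeout.
func runDemo() (allFast, withSlow bool) {
	fast := func(context.Context) bool { return true }
	slow := func(ctx context.Context) bool {
		select {
		case <-time.After(time.Second):
			return true
		case <-ctx.Done():
			return false
		}
	}
	ctx := context.Background()
	allFast = allReady(ctx, fast, fast)
	withSlow = allReady(ctx, fast, withTimeout(slow, 20*time.Millisecond))
	return allFast, withSlow
}

func main() {
	fmt.Println(runDemo()) // prints "true false"
}
```

With AND logic a single slow or failing dependency marks the whole instance as not ready, which is why the timeout wrapper matters: a hanging check turns into a prompt failure instead of a stalled probe.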

Changes Made

  • Add HealthCheckConfig to MetricsConfig with port, path, and timeout settings
  • Implement HealthChecker interface for user-defined health check logic
  • Add HealthCheckServer with /health/live and /health/ready endpoints
  • Support CompositeHealthChecker for combining multiple checkers
  • Provide built-in checkers: Default, Dubbo, GracefulShutdown, Timeout
  • Integrate with metrics system for automatic server lifecycle management
  • Add thread-safe global health checker registry
  • Support detailed health results with JSON response format
  • Include graceful shutdown integration and timeout protection

Usage

Configuration

metrics:
  enable: true
  health-check:
    enabled: true
    port: "8080"
    live-path: "/health/live"
    ready-path: "/health/ready"
    timeout: "10s"
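
As a rough illustration of how these settings might be represented and validated in Go, consider the sketch below. The struct field names, the `Validate` method, and `exampleConfig` are assumptions made for this example; only the YAML keys and values come from the configuration above, and no YAML parsing library is involved.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
	"time"
)

// HealthCheckConfig mirrors the YAML keys shown above.
// The Go field names are assumptions, not the PR's definition.
type HealthCheckConfig struct {
	Enabled   bool   `yaml:"enabled"`
	Port      string `yaml:"port"`
	LivePath  string `yaml:"live-path"`
	ReadyPath string `yaml:"ready-path"`
	Timeout   string `yaml:"timeout"`
}

// Validate checks the port range and path prefixes and returns the
// parsed timeout.
func (c HealthCheckConfig) Validate() (time.Duration, error) {
	port, err := strconv.Atoi(c.Port)
	if err != nil || port < 1 || port > 65535 {
		return 0, fmt.Errorf("invalid port %q", c.Port)
	}
	for _, p := range []string{c.LivePath, c.ReadyPath} {
		if !strings.HasPrefix(p, "/") {
			return 0, fmt.Errorf("path %q must start with /", p)
		}
	}
	timeout, err := time.ParseDuration(c.Timeout)
	if err != nil {
		return 0, fmt.Errorf("invalid timeout %q: %w", c.Timeout, err)
	}
	if timeout <= 0 {
		return 0, errors.New("timeout must be positive")
	}
	return timeout, nil
}

// exampleConfig returns the values from the YAML snippet above.
func exampleConfig() HealthCheckConfig {
	return HealthCheckConfig{
		Enabled:   true,
		Port:      "8080",
		LivePath:  "/health/live",
		ReadyPath: "/health/ready",
		Timeout:   "10s",
	}
}

func main() {
	timeout, err := exampleConfig().Validate()
	fmt.Println(timeout, err) // prints "10s <nil>"
}
```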

Custom Health Checker

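import (
    "context"
    "database/sql"
)

// MyHealthChecker reports readiness based on a database connection.
type MyHealthChecker struct {
    db *sql.DB
}

// CheckLiveness reports whether the process itself is alive.
func (m *MyHealthChecker) CheckLiveness(ctx context.Context) bool {
    return true // Process is alive
}

// CheckReadiness pings the database, honoring the context's deadline.
func (m *MyHealthChecker) CheckReadiness(ctx context.Context) bool {
    return m.db.PingContext(ctx) == nil // Check database connection
}

// Name returns the checker's identifier.
func (m *MyHealthChecker) Name() string {
    return "MyApp"
}

// Register custom checker
server.SetHealthChecker(&MyHealthChecker{db: database})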

Kubernetes Deployment

livenessProbe:
  httpGet:
    path: /health/live
    port: 8080
  initialDelaySeconds: 30
  periodSeconds: 10
  failureThreshold: 3

readinessProbe:
  httpGet:
    path: /health/ready
    port: 8080
  initialDelaySeconds: 10
  periodSeconds: 5
  failureThreshold: 3

API Responses

Healthy Response (200 OK)

{
  "status": "UP",
  "timestamp": 1640995200000,
  "details": {
    "check": "readiness",
    "database": "connected",
    "services": "exported"
  }
}

Unhealthy Response (503 Service Unavailable)

{
  "status": "DOWN",
  "timestamp": 1640995200000,
  "message": "Database connection failed",
  "details": {
    "check": "readiness",
    "database": "unavailable",
    "reason": "connection_timeout"
  }
}
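
The two response shapes above can be produced by a handler along the following lines. This is a sketch, not the PR's HealthCheckServer: `HealthResponse`, `readyHandler`, `call`, and the failure message are assumptions, and the JSON field names are copied from the examples above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// HealthResponse follows the JSON fields shown in the examples above.
type HealthResponse struct {
	Status    string            `json:"status"`
	Timestamp int64             `json:"timestamp"`
	Message   string            `json:"message,omitempty"`
	Details   map[string]string `json:"details,omitempty"`
}

// readyHandler answers 200 with status UP when ready() is true and
// 503 with status DOWN otherwise.
func readyHandler(ready func() bool) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		resp := HealthResponse{
			Status:    "UP",
			Timestamp: time.Now().UnixMilli(),
			Details:   map[string]string{"check": "readiness"},
		}
		code := http.StatusOK
		if !ready() {
			resp.Status = "DOWN"
			resp.Message = "readiness check failed" // illustrative message
			code = http.StatusServiceUnavailable
		}
		w.Header().Set("Content-Type", "application/json")
		w.WriteHeader(code)
		_ = json.NewEncoder(w).Encode(resp)
	}
}

// call invokes the handler in-process and returns the HTTP status code
// together with the decoded "status" field.
func call(h http.Handler) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/health/ready", nil))
	var body HealthResponse
	if err := json.Unmarshal(rec.Body.Bytes(), &body); err != nil {
		return rec.Code, ""
	}
	return rec.Code, body.Status
}

func main() {
	fmt.Println(call(readyHandler(func() bool { return true })))  // prints "200 UP"
	fmt.Println(call(readyHandler(func() bool { return false }))) // prints "503 DOWN"
}
```

Kubernetes only looks at the status code, so the JSON body is there for humans and tooling that want the reason behind a failing probe.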

Benefits

  • Cloud Native: Full Kubernetes compatibility with standard probe endpoints
  • Zero Downtime: Proper readiness checks enable rolling deployments
  • Fault Tolerance: Unhealthy instances are automatically removed from load balancing
  • Observability: Detailed health status information for debugging
  • Extensibility: Plugin architecture for custom health logic
  • Production Ready: Timeout protection, graceful shutdown, error handling
  • Backward Compatible: Disabled by default, no impact on existing deployments

Testing

Health check endpoints can be tested using:

curl http://localhost:8080/health/live
curl http://localhost:8080/health/ready
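
The same check can be scripted in Go. The sketch below probes both paths and applies the rule Kubernetes uses for httpGet probes, where any status code from 200 through 399 counts as success. The `probe` and `runDemo` functions are hypothetical, and the server is a stand-in built with `httptest`, not the PR's HealthCheckServer.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"time"
)

// probe performs one GET and reports success for status codes 200-399,
// matching the Kubernetes httpGet probe rule.
func probe(client *http.Client, url string) (bool, error) {
	resp, err := client.Get(url)
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()
	_, _ = io.Copy(io.Discard, resp.Body) // drain so the connection can be reused
	return resp.StatusCode >= 200 && resp.StatusCode < 400, nil
}

// runDemo starts a stand-in server whose liveness path returns 200 and
// whose readiness path returns 503, then probes both.
func runDemo() (live, ready bool, err error) {
	mux := http.NewServeMux()
	mux.HandleFunc("/health/live", func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	mux.HandleFunc("/health/ready", func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusServiceUnavailable)
	})
	srv := httptest.NewServer(mux)
	defer srv.Close()

	client := &http.Client{Timeout: 2 * time.Second}
	if live, err = probe(client, srv.URL+"/health/live"); err != nil {
		return false, false, err
	}
	ready, err = probe(client, srv.URL+"/health/ready")
	return live, ready, err
}

func main() {
	fmt.Println(runDemo()) // prints "true false <nil>"
}
```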

Fixes: https://github.com/apache/dubbo-go/issues/2039
Enables: Kubernetes-native health monitoring and container lifecycle management

solisamicus, Oct 10 '25 06:10

Please add unit tests.

No-SilverBullet, Oct 10 '25 08:10

Please fix the CI failure.

Alanxtl, Oct 19 '25 12:10