
⭐ [Enhancement]: New Comprehensive Health Endpoint

Open • JerryNixon opened this issue 1 year ago • 2 comments

What is it?

  • Adds configuration info to the health endpoint.
  • Includes endpoint basics in the health response.
  • Allows thresholds for response time validation.

Overall Health Calculation

| Healthy | Unhealthy | Global Status |
|---|---|---|
| - | 0 | Healthy |
| - | ≥ 1 | Unhealthy |

  • Healthy: All checks pass.
  • Unhealthy: At least one check fails.
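
A minimal sketch of the rule above, written in Python for illustration only (DAB itself is a .NET service). The function name `overall_status` is hypothetical, and treating any status other than "Healthy" as a failure is an assumption, since the issue only names two states.

```python
from typing import Iterable, Mapping


def overall_status(checks: Iterable[Mapping[str, object]]) -> str:
    """Return "Unhealthy" if at least one check failed, otherwise "Healthy"."""
    # Assumption: anything that is not exactly "Healthy" counts as a failure.
    unhealthy = sum(1 for check in checks if check.get("status") != "Healthy")
    return "Unhealthy" if unhealthy >= 1 else "Healthy"
```

Under this sketch an empty list of checks reports "Healthy", because zero checks failed.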

Standard Health Check Schema

{
  "status": "Healthy",
  "checks": [
    {
      "name": "{name}",
      "status": "Healthy",
      "tags": ["database", "performance"],
      "description": "Checks database query response time.",
      "data": {
        "duration-ms": 50,
        "threshold-ms": 100
      }
    },
    {
      "name": "{name}",
      "status": "Unhealthy",
      "tags": ["database", "performance"],
      "description": "Checks database query response time.",
      "error": "Database connection failed",
      "data": {
        "duration-ms": 150,
        "threshold-ms": 100
      }
    }
  ]
}
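
One way an individual entry in `checks` could be produced is sketched below in Python. `run_check` and `probe` are hypothetical names, and the issue does not say whether a duration exactly equal to the threshold passes, so the `<=` comparison is an assumption.

```python
import time
from typing import Any, Callable, Dict, List


def run_check(
    name: str,
    probe: Callable[[], Any],
    threshold_ms: int,
    tags: List[str],
    description: str,
) -> Dict[str, Any]:
    """Time `probe` and describe the outcome in the shape of the schema above."""
    error = None
    start = time.perf_counter()
    try:
        probe()
    except Exception as exc:  # any exception marks the check as failed
        error = str(exc)
    duration_ms = round((time.perf_counter() - start) * 1000)

    # Assumption: a duration equal to the threshold still passes.
    healthy = error is None and duration_ms <= threshold_ms
    entry: Dict[str, Any] = {
        "name": name,
        "status": "Healthy" if healthy else "Unhealthy",
        "tags": tags,
        "description": description,
    }
    if error is not None:
        entry["error"] = error  # only present when the probe raised
    entry["data"] = {"duration-ms": duration_ms, "threshold-ms": threshold_ms}
    return entry
```

A check can therefore be unhealthy for two reasons: the probe raised (an `error` is attached) or it finished but took longer than `threshold-ms`.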

Configuration for Data API Builder

Global Health Configuration

{
  "runtime": {
    "health": {
      "enabled": true,
      "cache-ttl": 5,
      "max-dop": 5,
      "roles": ["anonymous", "authenticated"]
    }
  }
}

| Property | Type | Required | Default | Description |
|---|---|---|---|---|
| enabled | Boolean | No | true | Enables or disables health checks. |
| cache-ttl | Integer | No | 5 | Caches health check results for X seconds. |
| max-dop | Integer | No | 8 | Max parallel health check operations. |
| roles | Array | Yes | [ ] | Defines who can access health checks. |
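
To illustrate how `cache-ttl` and `max-dop` might interact, here is a hedged Python sketch: checks run on a thread pool capped at `max_dop`, and the combined report is reused until it is `cache_ttl` seconds old. The class name `HealthReportCache` and its injectable `clock` are hypothetical; nothing here reflects DAB's actual implementation.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, List, Optional

Check = Callable[[], Dict[str, Any]]


class HealthReportCache:
    """Runs checks in parallel and reuses the report for `cache_ttl` seconds."""

    def __init__(
        self,
        checks: List[Check],
        cache_ttl: float = 5,   # default from the table above
        max_dop: int = 8,       # default from the table above
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._checks = checks
        self._cache_ttl = cache_ttl
        self._max_dop = max_dop
        self._clock = clock
        self._cached: Optional[Dict[str, Any]] = None
        self._cached_at = 0.0

    def report(self) -> Dict[str, Any]:
        now = self._clock()
        if self._cached is not None and now - self._cached_at < self._cache_ttl:
            return self._cached  # still fresh, skip re-running the checks

        # At most `max_dop` checks execute at the same time.
        with ThreadPoolExecutor(max_workers=self._max_dop) as pool:
            results = list(pool.map(lambda check: check(), self._checks))

        failed = any(result.get("status") != "Healthy" for result in results)
        self._cached = {
            "status": "Unhealthy" if failed else "Healthy",
            "checks": results,
        }
        self._cached_at = now
        return self._cached
```

This sketch does not guard `report()` with a lock, so two simultaneous callers could both refresh an expired cache; a real service would likely want to serialize the refresh.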

Database-Specific Health Checks

{
  "data-source": {
    "health": {
      "name": "sqlserver",
      "enabled": true,
      "threshold-ms": 100
    }
  }
}

| Property | Type | Required | Default | Description |
|---|---|---|---|---|
| name | String | No | NULL | Identifier for multi-DB setups. |
| enabled | Boolean | No | true | Enables/disables DB health checks. |
| query | String | No | N/A | Custom SQL for health validation. |
| threshold-ms | Integer | No | 10000 | Max query response time in ms. |
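
As a sketch of how the defaults in this table could be applied to a parsed config, the Python below merges the `data-source.health` object over the documented defaults. `resolve_data_source_health` is a hypothetical helper; representing the "NULL" and "N/A" defaults as `None`, and rejecting unknown keys, are both assumptions.

```python
from typing import Any, Dict

# Defaults taken from the table above; None stands in for "NULL" and "N/A".
DATA_SOURCE_HEALTH_DEFAULTS: Dict[str, Any] = {
    "name": None,
    "enabled": True,
    "query": None,
    "threshold-ms": 10000,
}


def resolve_data_source_health(config: Dict[str, Any]) -> Dict[str, Any]:
    """Return the data-source health settings with defaults filled in."""
    health = config.get("data-source", {}).get("health") or {}
    unknown = set(health) - set(DATA_SOURCE_HEALTH_DEFAULTS)
    if unknown:
        # Assumption: unrecognized properties are treated as a config error.
        raise ValueError(f"Unknown health properties: {sorted(unknown)}")
    return {**DATA_SOURCE_HEALTH_DEFAULTS, **health}
```

For the example config above, this yields `threshold-ms` of 100 (the explicit value) and `query` of `None` (no custom SQL supplied).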

Entity-Level Health Checks

{
  "<entity-name>": {
    "health": {
      "enabled": true,
      "first": 1,
      "threshold-ms": 100
    }
  }
}

| Property | Type | Required | Default | Description |
|---|---|---|---|---|
| enabled | Boolean | No | true | Enables/disables health check for entity. |
| first | Integer | No | 1 | Number of records checked. |
| threshold-ms | Integer | No | 10000 | Max response time before failure. |
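
The issue does not say how the entity check fetches its records. One plausible reading, sketched here in Python, is a REST request that uses DAB's `$first` query parameter to limit the result to `first` rows. The helper `entity_health_url`, the base URL, and the `Book` entity are hypothetical; `/api` is assumed as the REST path.

```python
from urllib.parse import quote, urlencode


def entity_health_url(
    base_url: str,
    entity_path: str,
    first: int = 1,          # default from the table above
    rest_path: str = "/api",  # assumed REST base path
) -> str:
    """Build a REST URL that requests only the first `first` records."""
    # Keep "$" literal so the parameter reads as $first rather than %24first.
    query = urlencode({"$first": first}, safe="$")
    entity = quote(entity_path.strip("/"), safe="")
    return f"{base_url.rstrip('/')}/{rest_path.strip('/')}/{entity}?{query}"


url = entity_health_url("http://localhost:5000", "Book")
```

Timing a GET against such a URL and comparing it with the entity's `threshold-ms` would then give the `duration-ms` value reported for that entity.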

Example Full Health Check Response

{
  "status": "Unhealthy",
  "version": "1.2.10",
  "app-name": "dab_oss_1.2.10",
  "configuration": {
    "http": true,
    "https": true,
    "rest": true,
    "graphql": true,
    "telemetry": true,
    "caching": true,
    "mode": "development",
    "dab-configs": [
      "/App/dab-config.json (mssql)"
    ],
    "dab-schemas": [
      "/App/schema.json"
    ]
  },
  "checks": [
    {
      "name": "database-moniker",
      "status": "Healthy",
      "tags": ["database", "performance"],
      "description": "Checks if the database is responding within an acceptable timeframe.",
      "data": {
        "duration-ms": 50,
        "threshold-ms": 100
      }
    },
    {
      "name": "database-moniker",
      "status": "Unhealthy",
      "tags": ["database", "performance"],
      "description": "Database response exceeded the threshold.",
      "data": {
        "duration-ms": 150,
        "threshold-ms": 100
      }
    },
    {
      "name": "<entity-name>",
      "status": "Healthy",
      "tags": ["endpoint", "performance"],
      "description": "Checks if the <entity-name> endpoint responds within the threshold.",
      "data": {
        "duration-ms": 50,
        "threshold-ms": 100
      }
    },
    {
      "name": "<entity-name>",
      "status": "Unhealthy",
      "tags": ["endpoint", "performance"],
      "description": "Endpoint exceeded response time threshold.",
      "data": {
        "duration-ms": 150,
        "threshold-ms": 100
      },
      "error": "{exception-message-here}"
    }
  ]
}

Final Notes

  • status is always at the top level.
  • checks contains all individual results.
  • data holds relevant details like duration-ms.
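
From a consumer's point of view, these three notes are enough to pull the failing checks out of a response. The Python sketch below assumes the response shape proposed in this issue; `failing_checks` is a hypothetical helper and the sample payload is a trimmed version of the example above.

```python
import json
from typing import Any, Dict, List


def failing_checks(report: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Summarize every check in `report` whose status is not "Healthy"."""
    failures = []
    for check in report.get("checks", []):
        if check.get("status") == "Healthy":
            continue
        data = check.get("data", {})
        failures.append(
            {
                "name": check.get("name"),
                "duration-ms": data.get("duration-ms"),
                "threshold-ms": data.get("threshold-ms"),
                "error": check.get("error"),  # None when no exception was reported
            }
        )
    return failures


sample = json.loads(
    """
    {
      "status": "Unhealthy",
      "checks": [
        {"name": "database-moniker", "status": "Healthy",
         "data": {"duration-ms": 50, "threshold-ms": 100}},
        {"name": "<entity-name>", "status": "Unhealthy",
         "data": {"duration-ms": 150, "threshold-ms": 100},
         "error": "{exception-message-here}"}
      ]
    }
    """
)
failures = failing_checks(sample)
```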

JerryNixon, Sep 06 '24 18:09

Another healthcheck example: DabHealthCheck.cs https://github.com/Azure/data-api-builder/blob/bbe1851df86065245d2bdd342d8b75a9304f2e00/src/Service/HealthCheck/DabHealthCheck.cs#L17

seantleonard, Sep 27 '24 16:09

It looks like https://github.com/Xabaril/AspNetCore.Diagnostics.HealthChecks has support across the four supported data sources for DAB. Would it be easier to add those internally to surface the health checks, or at least treat them as additive to the DAB-specific ones?

Once there is some native health check info surfaced by DAB, I'd love to get it integrated in the .NET Aspire Community Toolkit integration (tracking via https://github.com/CommunityToolkit/Aspire/issues/190).

aaronpowell, Oct 31 '24 08:10