aspire icon indicating copy to clipboard operation
aspire copied to clipboard

Supporting health probes on resources

Open GMouaad opened this issue 1 year ago • 8 comments

Added first draft of supporting health probes on resources

Contributes to #890

Microsoft Reviewers: Open in CodeFlow

GMouaad avatar May 10 '24 17:05 GMouaad

Thanks for opening the PR to start the discussion. We need to decide what reads these annotations and how they appear in the manifest.

davidfowl avatar May 10 '24 17:05 davidfowl

@davidfowl Thanks for the quick feedback, you were quicker then my comment in the issue :D.

We need to decide what reads these annotations and how they appear in the manifest.

Definitly, I mentioned this in the open points on my comment no the issue. AFAIK, the only think that the platforms would need is to know the endpoint type (plain TCP or HTTP Request) and on which port it should call the resource (the "internal" port and not the exposed one I would imagine).

Maybe some input from the DCP folks is needed (?)

My initial idea is to just make as simple as possible and reflect the common field needed. It would look something like the follwing:

{
  "resources": {
    "HelloWorldApp": {
      "type": "container.v0",
      "image": "docker.io/mgsair/hello-world-java:latest",
      "env": {
        "SERVER_PORT": "8000"
      },
      "bindings": {
        "endpoint": {
          "scheme": "http",
          "protocol": "tcp",
          "transport": "http",
          "containerPort": 8000
        }
      },
      "probes": [
        {
          "type": "liveness",
          "httpGet": {
            "path": "/health",
            "port": 8000
          },
          "initialDelaySeconds": 5,
          "periodSeconds": 5
        }
      ]
    },
    "apiservice": {
      "type": "project.v0",
      "path": "../AspireWithContainers.ApiService/AspireWithContainers.ApiService.csproj",
      "env": {
        "OTEL_DOTNET_EXPERIMENTAL_OTLP_EMIT_EXCEPTION_LOG_ATTRIBUTES": "true",
        "OTEL_DOTNET_EXPERIMENTAL_OTLP_EMIT_EVENT_LOG_ATTRIBUTES": "true",
        "services__HelloWorldApp__0": "{HelloWorldApp.bindings.endpoint.url}"
      },
      "bindings": {
        "http": {
          "scheme": "http",
          "protocol": "tcp",
          "transport": "http"
        },
        "https": {
          "scheme": "https",
          "protocol": "tcp",
          "transport": "http"
        }
      }
    },
    "webfrontend": {
      "type": "project.v0",
      "path": "../AspireWithContainers.Web/AspireWithContainers.Web.csproj",
      "env": {
        "OTEL_DOTNET_EXPERIMENTAL_OTLP_EMIT_EXCEPTION_LOG_ATTRIBUTES": "true",
        "OTEL_DOTNET_EXPERIMENTAL_OTLP_EMIT_EVENT_LOG_ATTRIBUTES": "true",
        "services__apiservice__0": "{apiservice.bindings.http.url}",
        "services__apiservice__1": "{apiservice.bindings.https.url}"
      },
      "bindings": {
        "http": {
          "scheme": "http",
          "protocol": "tcp",
          "transport": "http"
        },
        "https": {
          "scheme": "https",
          "protocol": "tcp",
          "transport": "http"
        }
      }
    }
  }
}

I'll try to include this in my draft so other parties can play around with it maybe.

Any thoughts?

GMouaad avatar May 10 '24 18:05 GMouaad

After thinking about it, I think it's not trivial to figure out which port should be used, from the example: 8000or {HelloWorldApp.bindings.endpoint.port}, i.e. endpointReference.Port or endpointReference.GetExpression(EndpointProperty.Port). Though I suppose the former.

GMouaad avatar May 10 '24 18:05 GMouaad

Like @davidfowl said thanks for opening up the PR.

I'm wondering whether we should model probes as a sub-field on the endpoint? The other thing to think about here is whether there is some kind of default we apply to the probe path based on the defaults we have in Aspire application libraries.

mitchdenny avatar May 13 '24 01:05 mitchdenny

Thank you for the feedback @mitchdenny !

The other thing to think about here is whether there is some kind of default we apply to the probe path based on the defaults we have in Aspire application libraries.

I think that would be either something we can add in the service defaults or as supplementary extensions that uses the WithProbe(..) under the hood?

GMouaad avatar May 18 '24 21:05 GMouaad

@GMouaad sorry we didn't get around to going into the design here for the release. The team is quite heads down right now on the Aspire 9 release. I think you've got a pretty good start with this PR, the APIs look great (I added some comments).

The bigger changes come around adding runtime and deployment support for these. For the runtime support, we just added a health checking system to the apphost, this might just be plugged into that. cc @mitchdenny

For deployment we have more options to figure out how this maps

davidfowl avatar Sep 30 '24 22:09 davidfowl

Now that we have WaitFor in which ties into health checks I think that there is a fairly clear path here now of how probes (especially readiness probes) might work for local development. Consider the following code:

var builder = DistributedApplication.CreateBuilder(args);
builder.AddProject<Projects.Project1>("project1")
       .WithReadinessProbe("endpointName", "/healthz", ...);

This would add a readiness probe annotation to the model which could be synthesised into a health check annotation which could then be used to do things like this:

var builder = DistributedApplication.CreateBuilder(args);
var p1 = builder.AddProject<Projects.Project1>("project1")
                .WithReadinessProbe("endpointName", "/healthz", ...);
builder.AddProject<Projects.Projec2>("p2")
       .WithReference(p1.GetEndpoint("https")).WaitFor(p1);

mitchdenny avatar Oct 01 '24 05:10 mitchdenny

@davidfowl I thought also that you'll be busy with the release topics. Thank you for the feedback. Regarding the health checking system, great news! Thanks @mitchdenny for the sample and hints.

I'm currently abroad, I'll resolve the issues and comments and play around with the new APIs as soon as come back!

GMouaad avatar Oct 02 '24 05:10 GMouaad

@davidfowl @mitchdenny I messed I'm up my branch while trying it from another IDE. I reopened the PR with the change from your feedback in #6379. Sorry for the inconvenience..

I'll close this one for the moment.

GMouaad avatar Oct 18 '24 19:10 GMouaad