Initial Integration of Prometheus Monitoring & Grafana Dashboards
Issue: #1037
This PR represents an initial exploration into integrating Prometheus and Grafana monitoring into the eShopLite sample. It's an early-stage implementation focused on establishing a more streamlined approach to monitoring within our microservices architecture.
Changes
- Introduced PrometheusContainerResource to encapsulate Prometheus container-specific configurations such as ConfigFilePath and DataVolumeName.
- Developed PrometheusBuilderExtensions, featuring the AddPrometheusContainer method, enabling the inclusion of Prometheus containers with necessary annotations and volume mounts.
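As a rough sketch of the shapes these two types might take (only the names PrometheusContainerResource, ConfigFilePath, DataVolumeName, and AddPrometheusContainer come from this PR; the constructor shape, base type, and image wiring below are illustrative assumptions, not the PR's actual implementation):

```csharp
// Sketch only: base type, constructor shape, and registration details are assumptions.
public class PrometheusContainerResource(string name, string configFilePath, string dataVolumeName)
    : ContainerResource(name)
{
    public string ConfigFilePath { get; } = configFilePath;
    public string DataVolumeName { get; } = dataVolumeName;
}

public static class PrometheusBuilderExtensions
{
    public static IResourceBuilder<PrometheusContainerResource> AddPrometheusContainer(
        this IDistributedApplicationBuilder builder,
        string name,
        string configFilePath,
        string dataVolumeName)
    {
        var resource = new PrometheusContainerResource(name, configFilePath, dataVolumeName);

        // The annotations and volume mounts for the config file and data
        // volume described above would be attached here.
        return builder.AddResource(resource)
                      .WithImage("prom/prometheus");
    }
}
```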
Goals and Intent
- Stage: Currently in an early conceptual phase, focusing on a high-level approach to integrating these monitoring tools rather than detailed technical specifics.
- Intent: The objective is to assess the feasibility and potential structure of incorporating Prometheus and Grafana, with an understanding that this is an initial step subject to further development and iteration.
Outlined Objectives and Challenges
- Configuring Prometheus and Grafana: Exploring three distinct methods for configuring Prometheus and Grafana:
  - User-Provided Config File Paths: Allowing users to specify paths to configuration files directly in the AddPrometheusContainer method.
  - No Config File Paths Provided: Handling scenarios where no configuration file paths are provided by the user.
  - Configuration via Code (Callback or Builder Approach): Implementing a code-based configuration approach for Prometheus, similar to custom middleware logic in .NET, but focused on Prometheus settings.

var prometheus = builder.AddPrometheusContainer("prometheus", "prometheus.yml", "prometheus-data")
    .ConfigureWithCallback(configurator =>
    {
        configurator.AddScrapeJob("frontendJob", "http://frontend/metrics");
        configurator.AddScrapeJob("orderProcessorJob", "http://orderprocessor/metrics");
        // Additional configuration as needed
    });
or
var prometheus = builder.AddPrometheusContainer("prometheus", "prometheus.yml", "prometheus-data")
    .ConfigureWithCallback(configurator => configurator
        .AddScrapeJob("frontendJob", "http://frontend/metrics")
        .AddScrapeJob("orderProcessorJob", "http://orderprocessor/metrics")
        // Additional configuration methods can be chained here
    );
- Mapping Configuration to Container: Developing a strategy to translate the code-based configuration (third approach) into effective configuration settings within the container environment. This could involve using environment variables or creating in-memory file streams.
- Full Implementation of Scrape Method: Completing the Scrape method implementation once the configuration-to-container mapping strategy is finalized.
- Grafana Integration: Utilizing the configuration mapping strategy to simplify the integration of Grafana, enhancing the monitoring setup.
- Default Config File Directory: Establishing a default directory for Prometheus and Grafana configuration files, to be determined either now or in a later phase.
End Goal
/// Goal: Add a prometheus and grafana container to aspire
builder.AddProject<Projects.MyFrontend>("frontend");
builder.AddProject<Projects.OrderProcessor>("orderprocessor");
builder.AddProject<Projects.ApiGateway>("apigateway");
builder.AddProject<Projects.CatalogDb>("catalogdbapp");
var prometheus = builder.AddPrometheusContainer("prometheus", "prometheus.yml", "prometheus-data")
    .Scrape(builder.GetProject("frontend"))
    .Scrape(builder.GetProject("orderprocessor"));
// Additional scrapes can be added here if needed
var grafana = builder.AddGrafanaContainer("grafana", "grafana.ini", "grafana-data")
    .WithDashboard("eShopLite.json")
    .WithDashboard("eShopLite2.json")
    .AddDataSource("prometheus", "http://prometheus:9090");
// If you have other data sources, you can add them here
builder.Build().Run();
Is your intention for this Prometheus container to be deployable via deployment tools such as azd and Aspirate (k8s tool for Aspire manifests)?
If so, it is worth considering how the configuration of this container will work at deployment time when the application model is serialized to a manifest. Looking at the above examples you could hit a few challenges. For example the callback won't be able to get the final URLs for the services because that isn't determined until the AppHost has finished executing.
The intention of the PR is to get something working for just Prometheus, for local development, for now. I haven't looked too much into it, and now that you bring up Aspirate, I think having it work with Aspirate might be easier than the azd approach, simply because the host name is just the Kubernetes service name. But anyway, I am just going to stay focused on basic local functionality for now and then discuss that once this PR is more functional and presentable, so to speak.
I don't think we should add Prometheus/Grafana to the eshop sample in this repo. There is a sample at https://github.com/dotnet/aspire-samples/tree/main/samples/Metrics that uses these containers. When that repo gets this change then it can be updated to use the new API.
@JamesNK I agree and its not intended to be there permanently. It's simply there so that anyone who pulls this branch locally can quickly verify functionality at the moment.
I am trying to figure out an approach to programmatically adding something along the lines of
prometheusContainer.Scrape(app1)
prometheusContainer.Scrape(app2)
And how to map that to some temporary file that would be passed into the volume. I'm not really sure what approach to take here.
@davidfowl just merged in a change to allow WithReference to take a connection string type. Perhaps we can use that pattern as inspiration.
You could do something like this:
var app1 = builder.AddProject<Projects.App1>(...);
var app2 = builder.AddProject<Projects.App1>(...);
var randomContainer = builder.AddContainer(...);
var prom = builder.AddPrometheusContainer(...)
    .WithReference(new ScrapeReference(app1))
    .WithReference(new ScrapeReference(app2))
    .WithReference(new ScrapeReference(randomContainer));
@davidfowl I'm wondering if your connection string PR actually shows us a way forward here. We retain WithReference but we have an overload that takes some kind of "IResourceReference" type which dictates how that reference interacts with what it is being injected into. Its kind of an alternative to the GetConnectionString(resource) pattern you were looking at.
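A hypothetical shape for that pattern (IResourceReference and ScrapeReference are not existing Aspire types; this is only a sketch of the idea being discussed):

```csharp
// Hypothetical sketch of the WithReference(...) overload idea above.
public interface IResourceReference
{
    IResource Resource { get; }

    // Dictates how this reference interacts with the resource it is injected into.
    void ApplyTo(IResourceBuilder<IResource> target);
}

public sealed class ScrapeReference(IResourceBuilder<IResource> source) : IResourceReference
{
    public IResource Resource => source.Resource;

    public void ApplyTo(IResourceBuilder<IResource> target)
    {
        // For a Prometheus target, this would record the source resource as a
        // scrape job, e.g. via an annotation that a config generator reads later.
    }
}
```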
That's a good idea. I like that. @mitchdenny
Right now I need a little help with how exactly we want to do this, or whether another approach is needed for this problem.
So, we know a Prometheus container needs a config file, and it looks like this:
global:
  scrape_interval: 1s # makes for a good demo
scrape_configs:
  - job_name: 'catalog'
    static_configs:
      - targets: ['host.docker.internal:5193'] # hard-coded port matches launchSettings.json
  - job_name: 'orders'
    static_configs:
      - targets: ['host.docker.internal:49713'] # hard-coded port matches launchSettings.json
  - job_name: 'frontend'
    static_configs:
      - targets: ['host.docker.internal:5266'] # hard-coded port matches launchSettings.json
and I want a way for this file to be generated from C# like this:
var prometheus = builder.AddPrometheusContainer("prometheus", "prometheus.yml", "prometheus-data")
    .Scrape(builder.GetProject("frontend"))
    .Scrape(builder.GetProject("orderprocessor"));
//or
var app1 = builder.AddProject<Projects.App1>(...);
var app2 = builder.AddProject<Projects.App1>(...);
var randomContainer = builder.AddContainer(...);
var prom = builder.AddPrometheusContainer(...)
    .WithReference(new ScrapeReference(app1))
    .WithReference(new ScrapeReference(app2))
    .WithReference(new ScrapeReference(randomContainer));
and then put into the prometheus volume
I found a tool to help with generating the yaml files: https://github.com/aaubry/YamlDotNet?tab=readme-ov-file
and in practice generating the YAML file would look something like this (details of the code aren't too important; all that matters in this context is that it generates the YAML file and its structure):
public void BuildConfigFile()
{
    var scrapeConfigs = new List<dynamic>();
    foreach (var project in container.ScrapeTargets)
    {
        scrapeConfigs.Add(new
        {
            job_name = project.Name,
            static_configs = new List<dynamic>
            {
                new { targets = new List<string> { $"host.docker.internal:{project.Port}" } }
            }
        });
    }
    // An anonymous type can't use indexer-style initializers, so a dictionary
    // holds the top-level keys.
    var config = new Dictionary<string, object>
    {
        ["global"] = new { scrape_interval = "1s" },
        ["scrape_configs"] = scrapeConfigs
    };
    // The property names are already snake_case, so no naming convention is applied.
    var serializer = new SerializerBuilder().Build();
    var yaml = serializer.Serialize(config);
    File.WriteAllText(container.ConfigFile, yaml);
}
The question that I have is: where do we want the generated YAML file to go? Do we want it hidden from the user in the obj directory, or do we want it generated in a specific location that can be modified?
You might be thinking there has to be a way to do this via some inline CLI or Docker arguments, and I thought that might be a suitable alternative approach, but when looking at the documentation
https://prometheus.io/docs/prometheus/latest/configuration/configuration/
The documentation states the following:
Prometheus is configured via command-line flags and a configuration file. While the command-line flags configure immutable system parameters (such as storage locations, amount of data to keep on disk and in memory, etc.), the configuration file defines everything related to scraping jobs and their instances, as well as which rule files to load.
So, there doesn't really seem to be an alternative approach besides creating a YAML file and then mounting it into the Prometheus volume.
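For local development, that could look roughly like this (GeneratePrometheusYaml is a hypothetical helper along the lines of BuildConfigFile above; WithBindMount/WithVolume are assumed to exist in roughly this shape, and the container paths match the official prom/prometheus image defaults):

```csharp
// Sketch: write the generated config to a temp file, then mount it at the
// path the prom/prometheus image reads by default.
var configPath = Path.Combine(Path.GetTempPath(), "aspire-prometheus", "prometheus.yml");
Directory.CreateDirectory(Path.GetDirectoryName(configPath)!);
File.WriteAllText(configPath, GeneratePrometheusYaml(scrapeTargets)); // hypothetical helper

var prometheus = builder.AddContainer("prometheus", "prom/prometheus")
    .WithBindMount(configPath, "/etc/prometheus/prometheus.yml")
    .WithVolume("prometheus-data", "/prometheus"); // default TSDB path in the image
```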
I'm just thinking how this might be able to work all the way to production via a deployment tool. Even though Prometheus takes a configuration file, I think there is the ability to use variable substitution in that configuration file. For local development you could use volume mappings to inject fully formed configuration files, but for manifest generation you could emit a dockerfile.v0 resource and have a generated Dockerfile which includes the YAML file with placeholder values, which you then inject via environment variables.
I spent some time on this https://x.com/depechie/status/1740866116665401716?s=46&t=Mk5w5BXVOkxfBqMWVn-TOw
it turns out you can do env variable substitution in most places now (this wasn’t always true). We can clean this up and offer methods to make this easier.
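For context, the longest-standing documented form of this is external-label expansion behind a feature flag (introduced in Prometheus 2.27); a minimal illustration, assuming the flag is enabled:

```yaml
# Started with: prometheus --enable-feature=expand-external-labels
global:
  external_labels:
    environment: ${DEPLOY_ENV}  # expanded from the process environment at startup
```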
@mitchdenny
@davidfowl
@prom3theu5
I looked into variable substitution with Prometheus and saw that it's under a feature flag. To be honest, I am a little confused about what exactly it's doing and what problem it solves, just because I have never gotten that deep into Prometheus beyond pod annotations and basic Docker Compose configuration scaffolding.
Looking at this https://promlabs.com/blog/2021/05/16/whats-new-in-prometheus-2-27/#environment-variable-expansion-for-external-labels
I'm guessing it's somewhat equivalent to a Dockerfile argument (sort of, ish), where we just add extra arguments that are used in the actual config file?
But anyway...
One issue that I thought of with what you are talking about is matching up the host/service name of the deployed container app to the correct job in the scrape config; there's no way to know with hard-coded values.
Here is the Prometheus YAML file found in the aspire-samples repo, to give some context:
global:
  scrape_interval: 1s # makes for a good demo
scrape_configs:
  - job_name: 'metricsapp'
    static_configs:
      - targets: ['host.docker.internal:5048'] # hard-coded port matches launchSettings.json
I think we need a process for generating our final Prometheus YAML config file. So, what I am saying is that the YAML file needs to be generated under the hood, not configured manually by a developer (at least for non-local environments), so that the Prometheus YAML file can be generated with two conventions:
- the resource name in the manifest needs to be 1-1 with the job name in the scrape_configs section
- the actual target needs to be written with the ${} annotation, and then we need to figure out how to make that work in a local development environment.
We know that the resource name given for any resource in the AppHost Program.cs file is going to be a key in the Aspire manifest file, and it's going to contain a corresponding object of information about that resource. And, since Azure Container Apps is the only first-class way of deploying Aspire apps with the manifest, we need a way of mapping which scrape jobs go with which container being deployed, as we will have to use a different host/service name in the Azure Container Apps environment. So, we would need a way to know which ones go with which.
Let me give an example to make this clearer.
Example 1 (we just let the user define the Prometheus YAML config file and don't do the two conventions I mentioned):
global:
  scrape_interval: 1s # makes for a good demo
scrape_configs:
  - job_name: 'ui-app'
    static_configs:
      - targets: ['host.docker.internal:5048']
  - job_name: 'orderprocessor'
    static_configs:
      - targets: ['host.docker.internal:5088']
  - job_name: 'apigateway_job'
    static_configs:
      - targets: ['host.docker.internal:9091']
And the AppHost Program.cs file looks like this:
/// Goal: Add a prometheus and grafana container to aspire
builder.AddProject<Projects.MyFrontend>("frontend");
builder.AddProject<Projects.OrderProcessor>("orderprocessor");
builder.AddProject<Projects.ApiGateway>("apigateway");
builder.AddProject<Projects.CatalogDb>("catalogdbapp");
var prometheus = builder.AddPrometheusContainer("prometheus", "prometheus.yml", "prometheus-data")
    .Scrape(builder.GetProject("frontend"))
    .Scrape(builder.GetProject("orderprocessor"));
// Additional scrapes can be added here if needed
builder.Build().Run();
Well, now we have no guaranteed way of knowing which jobs go with which Aspire resource, or whether they even point to an Aspire resource at all, when deploying to production, which can cause some or most of the jobs to not work properly.
Example 2 (Aspire-managed): we scaffold out some new functionality so that Aspire generates a Prometheus config YAML file that looks something like this, which we then use for deployment:
global:
  scrape_interval: 1s # makes for a good demo
scrape_configs:
  - job_name: 'frontend_job'
    static_configs:
      - targets: [${frontend_target}]
  - job_name: 'orderprocessor_job'
    static_configs:
      - targets: [${orderprocessor_target}]
  - job_name: 'apigateway_job'
    static_configs:
      - targets: [${apigateway_target}]
And let's say the AppHost Program.cs file looks like this. I'm just showing the Program.cs file instead of the manifest JSON file because they seem to be 1-1, but the Prometheus YAML file would be generated directly from the manifest file.
/// Goal: Add a prometheus and grafana container to aspire
builder.AddProject<Projects.MyFrontend>("frontend");
builder.AddProject<Projects.OrderProcessor>("orderprocessor");
builder.AddProject<Projects.ApiGateway>("apigateway");
builder.AddProject<Projects.CatalogDb>("catalogdbapp");
var prometheus = builder.AddPrometheusContainer("prometheus", "prometheus.yml", "prometheus-data")
    .Scrape(builder.GetProject("frontend"))
    .Scrape(builder.GetProject("orderprocessor"));
// Additional scrapes can be added here if needed
builder.Build().Run();
Well, with this example, we can have our stateless apps deployed to Azure Container Apps first, obtain the host/service name of each of the deployed container apps, and then perform the substitution in the Prometheus YAML file.
Takeaway of all that I am saying:
- So, basically what I am getting at is that I think, for this to work in production for azd and Azure Container Apps, some functionality needs to be written so that a Prometheus config YAML file can be generated from just a manifest JSON file, matching the format given in example 2.
- And then some azd backend work will need to be done to handle the case of: if a Prometheus resource is defined by the manifest (let's say of type "prometheus.server0"), then deploy the apps being scraped first into Azure Container Apps, wait for the created apps (that we know are being scraped) to be healthy, obtain the host/service names for those scraped apps, and then use those obtained host name values as the arguments for the Prometheus config scrape targets (shown in example 2).
- And for Aspirate, we might be able to skip the whole YAML configuration altogether and just use pod annotations on the projects that are noted as being scraped by the Prometheus container in the Program.cs file of the AppHost project.
I think we probably aren't going to be able to solve this until we get our volume story sorted out in the app model.
Actually another way of doing this is to write a Dockerfile which replaces the startup command with something that takes a bunch of environment variables and writes them into the configuration file at startup.
Absolutely - nice idea - this works for me as I can then use that as an init container to initialize / seed a PV.
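A sketch of that wrapper image (the template placeholders and the sed-based rendering are illustrative only; the official prom/prometheus image is deliberately minimal, so the available shell tooling and writable paths may need adjusting):

```dockerfile
FROM prom/prometheus:latest

# Config template containing ${FRONTEND_TARGET}-style placeholders.
COPY prometheus.template.yml /etc/prometheus/prometheus.template.yml

# Replace the startup command: render the template from environment variables
# into a writable location, then exec Prometheus against the rendered file.
ENTRYPOINT ["/bin/sh", "-c", "\
  sed -e \"s|\\${FRONTEND_TARGET}|$FRONTEND_TARGET|g\" \
      -e \"s|\\${ORDERPROCESSOR_TARGET}|$ORDERPROCESSOR_TARGET|g\" \
      /etc/prometheus/prometheus.template.yml > /tmp/prometheus.yml && \
  exec /bin/prometheus --config.file=/tmp/prometheus.yml"]
```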
@mitchdenny
Oh, so like a wrapper Dockerfile? Are we able to run a Dockerfile resource via Aspire? Or would a custom wrapper image just need to be made, put on a public registry, and then used via the image resource?
@mitchdenny
I don't know how this never occurred to me when writing that long-winded response a few weeks ago, but it seems like resources in Azure Container Apps use the same service-name-as-host-name pattern (couldn't think of a better phrase for it), so I don't need the public IP if Prometheus were deployed as a container (assuming one replica for those containers for now). However, I noticed container deployments aren't supported in Azure Container Apps, so I am not sure if there is a real way to deploy it besides converting the manifest to k8s YAML files and handling that via Aspirate, or doing it manually.
Also, as far as getting it to work locally, I think I am going to try what you mentioned here https://github.com/dotnet/aspire/pull/1173#issuecomment-1884513463 and see how far that takes me.
@mitchdenny
The Prometheus config file can be done the same way it is here 👀, except it would obviously be a YAML writer instead.
and here
@josephaw1022 we've made a lot of progress on this since we last reviewed this. AZD is actually capable of uploading files referenced in bind mounts into storage. So there might be more scope for getting this working end to end now.
Will close this PR for now. If you have time to come back to it feel free to re-open it.