
Different instances of same resource type grouped as replicas in dashboard

Open paulomorgado opened this issue 1 year ago • 16 comments

In 8.0.0-preview.6.24214.1 and 8.0.0-preview.7.24251.11, different instances of project resources are grouped as replicas, even if they are not replicas.

This happens for:

  • Structured logs
  • Traces
  • Metrics

But not for:

  • Console

paulomorgado avatar May 07 '24 10:05 paulomorgado

@JamesNK can you have a look please?

kvenkatrajan avatar May 07 '24 21:05 kvenkatrajan

I have been having this problem since I started using Aspire Dashboard (Feb 2024) for local development. Thought that was how it's meant to be until I saw a demo elsewhere where the services were grouped nicely.

The service name is passed in, and this works with Elastic APM and Grafana.

ResourceBuilder
          .CreateDefault()
          .AddService("ServiceName")
          .AddAttributes(otelAttributes);

This is what I see in Aspire versions 8.0.0-preview.6.24214.1 and 8.0.0-preview.7.24251.11: (screenshot)

p10tyr avatar May 09 '24 13:05 p10tyr

@p10tyr, are those real replicas? In my case there are no replicas, just instances of the same project.

Are you using just the dashboard? Or are you using the distributed application host?

paulomorgado avatar May 09 '24 13:05 paulomorgado

@p10tyr, are those real replicas? In my case there are no replicas, just instances of the same project.

Are you using just the dashboard? Or are you using the distributed application host?

I don't know. Most of them are always empty. Just one of the GUIDs has data in it. Why would I have this anyway if I'm just running one local instance of each API?

I'd like to just click on the top-level replica anyway, but I cannot.

p10tyr avatar May 09 '24 14:05 p10tyr

The same thing happens to me.

clt-pereira avatar May 22 '24 20:05 clt-pereira

The main issue for me is that I don't have replicas. I have multiple independent instances of the same resource type.

paulomorgado avatar May 23 '24 07:05 paulomorgado

@adamint please have a look - try repro with multiple independent instances of the same resource type

kvenkatrajan avatar May 23 '24 11:05 kvenkatrajan

I would just like to know why this happens, because my project has only one Web API and I am running it locally. I couldn't find anywhere what this GUID and this grouping around my application mean.

clt-pereira avatar May 23 '24 12:05 clt-pereira

Hi @paulomorgado @clt-pereira @p10tyr, we apologize for the delay in getting to this issue. Could you confirm that this problem does not appear on the console logs page? I am having trouble reproducing this. Do you have a minimal repro solution that you would be able to share?

The main issue for me is that I don't have replicas. I have multiple independent instances of the same resource type.

@paulomorgado this may just be naming confusion. We should consider renaming this temporarily to (running instances) until replica support is complete.

I don't know. Most of them are always empty. Just one of the GUIDs has data in it. Why would I have this anyway if I'm just running one local instance of each API?

Each individual OTLP application instance shows up under this banner. Can you share how you're creating OTLP applications? If multiple instances of the OTLP application are running, what you are showing is to be expected. @p10tyr, you seem to be describing a separate issue, where you have one instance of an application running but multiple grouped applications. Please also share your Aspire configuration or, if possible, a minimal repro.

I'd like to just click on the top-level replica anyway, but I cannot.

Unfortunately, this view is not yet available. @kvenkatrajan

adamint avatar May 23 '24 13:05 adamint

Hi @adamint,

Each individual OTLP application instance shows up under this banner. Can you share how you're creating OTLP applications? If multiple instances of the OTLP application are running, what you are showing is to be expected

How are "OTLP applications" created?

Is it related to this?

builder.Services.AddOpenTelemetry()
    .ConfigureResource(resourceBuilder =>
    {
        resourceBuilder
            .AddService(
                serviceName: builder.Environment.ApplicationName,
                serviceVersion: serviceVersion,
                autoGenerateServiceInstanceId: true);
    });

Can I force it on Aspire to be the declared resource name?

paulomorgado avatar May 23 '24 13:05 paulomorgado

How are "OTLP applications" created?

@paulomorgado OTLP "application" specifically refers here to the service instance id that you are using for the given service name. Can you share the OpenTelemetry package version you're using? There was a bug that would cause multiple instance ids to be created during process runtime when autoGenerateServiceInstanceId is set to true, which would cause the behavior you're seeing - see https://github.com/open-telemetry/opentelemetry-dotnet/discussions/5101

It would also be helpful to note if the workaround noted here: https://github.com/open-telemetry/opentelemetry-dotnet/issues/4871 fixes the issue for you.

To clarify - there is only one process running for the application in question, right?

adamint avatar May 23 '24 13:05 adamint

@adamint, there are several processes for the same application.

Imagine you have a chat application with a central hub to relay the messages. You have 1 hub and several chat clients.

Changing to this, solved it:

builder.Services.AddOpenTelemetry()
    .ConfigureResource(resourceBuilder =>
    {
        resourceBuilder
            .AddService(
                serviceName: Environment.GetEnvironmentVariable("OTEL_SERVICE_NAME") ?? builder.Environment.ApplicationName,
                serviceVersion: serviceVersion,
                autoGenerateServiceInstanceId: Environment.GetEnvironmentVariable("OTEL_RESOURCE_ATTRIBUTES")?.Contains("service.instance.id=") != false);
    });
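
For readers adapting this change, here is a minimal sketch of one possible policy: generate a service instance id only when the host has not already supplied one through OTEL_RESOURCE_ATTRIBUTES. It relies on the environment variable being a comma-separated list of key=value pairs. The class and method names (OtelResourceEnv, HasServiceInstanceId, ShouldAutoGenerateInstanceId) are hypothetical and are not part of OpenTelemetry or Aspire, and the policy itself is an assumption about the intent rather than a restatement of the snippet above.

```csharp
using System;
using System.Linq;

// Hypothetical helper, not part of OpenTelemetry or Aspire.
public static class OtelResourceEnv
{
    // Returns true when a comma-separated key=value list (the format of
    // OTEL_RESOURCE_ATTRIBUTES) contains a non-empty service.instance.id entry.
    public static bool HasServiceInstanceId(string? resourceAttributes)
    {
        if (string.IsNullOrWhiteSpace(resourceAttributes))
        {
            return false;
        }

        return resourceAttributes
            .Split(',', StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries)
            .Select(pair => pair.Split('=', 2))
            .Any(kv => kv.Length == 2
                && kv[0].Trim() == "service.instance.id"
                && kv[1].Trim().Length > 0);
    }

    // Assumed policy: auto-generate an id only when the host supplied none.
    public static bool ShouldAutoGenerateInstanceId(string? resourceAttributes)
        => !HasServiceInstanceId(resourceAttributes);
}
```

The result of ShouldAutoGenerateInstanceId(Environment.GetEnvironmentVariable("OTEL_RESOURCE_ATTRIBUTES")) could then be passed as the autoGenerateServiceInstanceId argument. Matching on the parsed key avoids false positives from a plain substring search, for example a key such as other.service.instance.id.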

paulomorgado avatar May 23 '24 13:05 paulomorgado

Glad to hear that your issue is fixed. Could you share the OpenTelemetry package version you are using?

adamint avatar May 23 '24 13:05 adamint

The latest and greatest! 😄

    <PackageVersion Include="Aspire.Hosting.AppHost" Version="8.0.1" />
    <PackageVersion Include="Microsoft.Extensions.ServiceDiscovery" Version="8.0.1" />
    <PackageVersion Include="OpenTelemetry.Exporter.OpenTelemetryProtocol" Version="1.8.1" />
    <PackageVersion Include="OpenTelemetry.Exporter.Prometheus.AspNetCore" Version="1.8.0-rc.1" />
    <PackageVersion Include="OpenTelemetry.Extensions.Hosting" Version="1.8.1" />
    <PackageVersion Include="OpenTelemetry.Instrumentation.AspNetCore" Version="1.8.1" />
    <PackageVersion Include="OpenTelemetry.Instrumentation.GrpcNetClient" Version="1.8.0-beta.1" />
    <PackageVersion Include="OpenTelemetry.Instrumentation.Http" Version="1.8.1" />
    <PackageVersion Include="OpenTelemetry.Instrumentation.Process" Version="0.5.0-beta.5" />
    <PackageVersion Include="OpenTelemetry.Instrumentation.Runtime" Version="1.8.1" />

paulomorgado avatar May 23 '24 14:05 paulomorgado

In my case, I only have one instance of the application. However, I realized that adding structured log writing from Serilog to OpenTelemetry was causing my problem. When I set the parameter autoGenerateServiceInstanceId to false, the problem was resolved.

clt-pereira avatar May 23 '24 14:05 clt-pereira

It appears, then, that there is still an issue with instance id stability. @JamesNK, since you filed a bug report against OTel on this issue, have you encountered it again after they released a fix in 1.7?

adamint avatar May 23 '24 14:05 adamint

#1411 should be completed at the same time as this. DCP gives AppHost owner information, which we can use to construct accurate replica sets.

adamint avatar Jun 19 '24 15:06 adamint

Hello, I have the same problem, but coming from a different source. I would like to always use service names and see the service instance only as extended information (tooltip etc.). (screenshot)

We use the OTEL Collector to collect telemetry from our microservices, which run under Docker Compose in production. The collector then distributes the received telemetry to extra export files, Grafana (Loki/Tempo/Mimir) and a standalone Aspire Dashboard.

In order to analyze an application problem offline, we create a Saved Status: a composite zip archive with a database dump, console and file logs of some external services, and telemetry file exports from the collector.

When restoring a Saved Status on another machine to analyze it on another application instance, we process the telemetry files, add extra properties to the telemetry records so they are easy to find, extend the Aspire Dashboard limits and Grafana retention periods to fit the imported telemetry data, and put the processed telemetry files into the Collector's import folder. The collector then imports them in the background into Aspire Dashboard and Grafana, playing them back as in production.

Because the telemetry data includes many application runs, we have many service instance identifiers, but what really matters is filtering by service name and Saved Status keys. (screenshot) Service instance ids in the Structured Logs, Traces and Metrics combo boxes just add clutter and complicate usage.

Here is a summary of this problem and related ones that block proper usage in our historical telemetry analysis case:

  • The service name is enough, and required, for telemetry filtering. The service instance is good to know but optional, and it clutters the combo box selections and telemetry record lines
  • No custom filtering by record attributes in Traces and Metrics
  • No custom time interval filtering in any tab
    • Metrics time range limited to 12 hours: (screenshot)

We added Aspire Dashboard for simpler telemetry analysis than in Grafana. We also have custom dashboards in Grafana, but advanced analysis there requires advanced query language (*QL) skills, which is too much for many people, including service technicians. Aspire Dashboard is already promising for our case, and once it is extended to address the issues above it will be even better.

Thank you for the tool :-)

rolfik-mycronic avatar Jul 22 '24 07:07 rolfik-mycronic

@kvenkatrajan @JamesNK ^

adamint avatar Jul 22 '24 13:07 adamint

This is improved in the next Aspire version.

Just seeing a GUID service instance id isn't a good experience. When there are multiple instances of telemetry for a service, the name now combines the service name with the first characters of the service instance id.

For example, if there are multiple instances of lineconfiguration, you'll see:

  • lineconfiguration-c39acab
  • lineconfiguration-784a4ed

I think that addresses the problem here.
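
As an illustration of the naming rule described above, here is a minimal sketch. It is not the dashboard's actual implementation: the class name InstanceDisplayNames, the method Build, and the default prefix length of 7 (inferred from the example names above) are all assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch of the naming rule; not the dashboard's real code.
public static class InstanceDisplayNames
{
    // A service with a single instance keeps its plain name. When several
    // instances share a name, each gets "<name>-<first characters of its id>".
    public static IReadOnlyList<string> Build(
        IEnumerable<(string ServiceName, string InstanceId)> instances,
        int idPrefixLength = 7) // assumed length, taken from the example names
    {
        var list = instances.ToList();

        // Count how many instances report each service name.
        var countsByName = list
            .GroupBy(i => i.ServiceName)
            .ToDictionary(g => g.Key, g => g.Count());

        return list
            .Select(i => countsByName[i.ServiceName] > 1
                ? $"{i.ServiceName}-{Shorten(i.InstanceId, idPrefixLength)}"
                : i.ServiceName)
            .ToList();
    }

    private static string Shorten(string id, int length)
        => id.Length <= length ? id : id.Substring(0, length);
}
```

With two instances of lineconfiguration and a single instance of another service, only the lineconfiguration entries receive the id suffix.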

JamesNK avatar Jul 22 '24 14:07 JamesNK

I see Aspire 8.1 is out, but I do not see a standalone Aspire Dashboard container image for that version. 8.0.2 still has the problems.

rolfik-mycronic avatar Jul 29 '24 08:07 rolfik-mycronic

@joperezr When will 8.1 of the dashboard be published?

JamesNK avatar Jul 29 '24 09:07 JamesNK

After https://github.com/dotnet/dotnet-docker/pull/5732 gets merged it shouldn't be long before we have the image available. We already have a nightly image available: https://mcr.microsoft.com/en-us/product/dotnet/nightly/aspire-dashboard/about

joperezr avatar Jul 29 '24 18:07 joperezr

I have tested Aspire Dashboard 8.1.

  • Showing the app name with the id is OK
  • I still see (replica set) in the filter on all tabs, even though my log records have nothing to do with replicas. They are simply recorded from multiple subsequent app executions (screenshot)
  • I cannot simply select an app name to filter all app instances with that name, which is what I want. Instead I am forced to select a specific instance, which I do not need

rolfik-mycronic avatar Aug 01 '24 10:08 rolfik-mycronic

I have tested Aspire Dashboard 8.1.

  • These changes have not yet been released. Please see the linked pull request. The text (replica set) has been completely removed, which is how you can know you are using an older version of Aspire.
  • This is intended - you can control the OTEL application naming if you wish for them to all show up under the same name.

adamint avatar Aug 01 '24 12:08 adamint

I cannot simply select an app name to filter all app instances with that name, which is what I want. Instead I am forced to select a specific instance, which I do not need

This isn't supported in 8.1. I created an issue for supporting this in the future: https://github.com/dotnet/aspire/issues/5137

JamesNK avatar Aug 01 '24 14:08 JamesNK