
[exporter/Loki] Attributes and Resources have an attributes_ and resources_ prefix in Loki.

kago-dk opened this issue 2 years ago

What happened?

## Description
I am trying to take advantage of the "new" Loki exporter in the OTEL Collector v0.60.0. Should I include all of my resources and attributes from the filelog receiver as either loki.attribute.labels or loki.resource.labels? Based on this article, the answer is no: https://grafana.com/blog/2020/08/27/the-concise-guide-to-labels-in-loki/

If not, how do I keep them from showing up in Loki with the attributes_ prefix (e.g. attributes_exception_stack_trace) or the resources_ prefix (e.g. resources_process_pid)?

## Steps to Reproduce
Test using the testfile.log and otel-config.yaml

## Expected Result
I am trying to avoid the attributes_ and resources_ prefixes shown in the (dark) screenshot. Even the hardcoded values example_http_host_name and example_pod_name end up with a prefix in Loki (dark screenshot).

Collector version

v0.60.0

Environment information

## Environment
OS: Windows Server 2019

OpenTelemetry Collector configuration

# Receivers
receivers:
  filelog:
    include:
    - "C://temp//testfile*.log"

    operators:
    - type: json_parser
      timestamp:
        parse_from: attributes.TimeStamp
        layout: '%Y-%m-%dT%H:%M:%S.%LZ'
      severity:
        parse_from: attributes.LogLevel
    - type: move
      id: Resource1
      from: attributes.Resource1
      to: resource["host_name"]
    - type: move
      id: Resource2
      from: attributes.Resource2
      to: resource["process_pid"]
    - type: move
      id: Resource3
      from: attributes.Resource3
      to: resource["process_executable_name"]
    - type: move
      id: Resource4
      from: attributes.Resource4
      to: resource["host_ip"]
    - type: move
      id: Resource5
      from: attributes.Resource5
      to: resource["service_version"]
    - type: move
      id: Attribute1
      from: attributes.Attribute1
      to: attributes["http_method"]
    - type: move
      id: Attribute2
      from: attributes.Attribute2
      to: attributes["http_status_code"]
    - type: move
      id: Attribute3
      from: attributes.Attribute3
      to: attributes["http_url"]
    - type: move
      id: Attribute4
      from: attributes.Attribute4
      to: attributes["net_peer_ip"]
    - type: move
      id: Attribute5
      from: attributes.Attribute5
      to: attributes["error.code"]
    - type: remove
      field: attributes.TimeStamp
    - type: remove
      field: attributes.LogLevel
      
# Processors
processors:
  batch:
  attributes:
    actions:
    - action: insert
      key: loki.attribute.labels
      value: http_method,example_http_status_code
    - action: insert
      key: loki.resource.labels
      value: service_version      
    # http_method is an attribute and service_version is a resource.
      
    # the following attributes are added manually here in the example, but would
    # probably be added by other processors or straight from the source
    - action: insert
      key: example_http_status_code
      value: 500
    - action: insert
      key: example_http_status
      value: 200      
  resource:
    attributes:
    - action: insert
      key: loki.attribute.labels
      value: example_http_status
    # example_http_status is an attribute
    - action: insert
      key: loki.resource.labels
      value: host_ip,example_host_name,example_pod_name
    # host_ip, example_host_name, and example_pod_name are resources
      
    # the following attributes are added manually here in the example, but would
    # probably be added by other processors or straight from the source
    - action: insert
      key: example_host_name
      value: guarana
    - action: insert
      key: example_pod_name
      value: guarana-pod-01      

# Exporters
exporters:
  logging:
    loglevel: debug
  loki:
    endpoint: https://localhost:3100/loki/api/v1/push
    tls:
      insecure: true

# Services
service:
  pipelines:
    logs:
      receivers: [filelog]
      processors: [attributes, resource]
      exporters: [loki, logging]
  telemetry:
    logs:
      output_paths: ["./logs/otelcol-contrib.log"]

Log output

No response

Additional context

kago-dk avatar Sep 19 '22 14:09 kago-dk

Pinging code owners: @jpkrohling.

kago-dk avatar Sep 19 '22 14:09 kago-dk

Pinging code owners: @gramidt @gouthamve @jpkrohling @kovrus @mar4uk. See Adding Labels via Comments if you do not have permissions to add labels yourself.

github-actions[bot] avatar Sep 19 '22 17:09 github-actions[bot]

I'm very interested in this too. I'm looking into reducing the resource footprint on the clusters and replacing Promtail with an OTel Collector receiver.

@kago-dk It looks like from v0.59 onwards you need to map the attributes into loki.attribute.labels or loki.resource.labels: https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/exporter/lokiexporter#labels

I'll be testing this within the next few weeks, but I haven't tried it yet.

carlosjgp avatar Oct 17 '22 10:10 carlosjgp

(Ignore my previous comment; I was looking at the filelog exporter resources and attributes.)

carlosjgp avatar Oct 18 '22 16:10 carlosjgp

I believe this to be a characteristic of the filelog receiver, not the Loki exporter, but I'll take a look at this in a couple of weeks. In the meantime, would you be able to use the logging exporter at debug level? If it shows the same type of data there, we would have confirmation that the problem isn't in the Loki exporter.

jpkrohling avatar Oct 19 '22 18:10 jpkrohling

I tried to reproduce the issue using the attached testfile.log and config.yaml, but I can't reproduce it. I don't see prefixes in Loki (screenshot).

Does the issue still remain?

mar4uk avatar Oct 25 '22 11:10 mar4uk

I will retest using the latest OTEL Collector later this week.

kago-dk avatar Oct 25 '22 13:10 kago-dk

Using v0.63.0 and running the query {example_http_status_code="500"} | json gives me this (screenshot):

And verbosity: detailed (in v0.63.0) gives me this (screenshot):

kago-dk avatar Oct 28 '22 01:10 kago-dk

Just adding that I have put together a docker-compose example that can be used to recreate the addition of the attribute_ prefix in https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/15677#issuecomment-1293805580.

These issues may be semi-related, since they both seem to involve the logic that is used to convert OpenTelemetry attributes/resources into log entries.

disfluxly avatar Oct 28 '22 14:10 disfluxly

@kago-dk thank you! I see it now when I run the query in JSON format: {example_http_status_code="500"} | json. I did some investigation and found out that the Loki translator converts OTLP log records into Loki log records this way: the lokiEntry struct type has nested attributes and resources fields, which is why each entry is sent to Loki with nested attributes and resources fields. https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/cc0c784a993a6d52ad1cb5e596fe0780ab1dd915/pkg/translator/loki/encode.go#L28-L36

When we run a query with the | json parser, Loki converts the log line fields into labels by flattening the field names. That's why we see the prefixes. Not sure if it is a bug or a feature, @jpkrohling.
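
To illustrate (this is only a sketch; the exact field names and values depend on the translator version), the exported log line ends up as a JSON document in which record attributes and resource attributes stay in separate nested objects:

{
   "body": "GET /checkout returned 500",
   "severity": "ERROR",
   "attributes": {
      "http_method": "GET",
      "http_status_code": 500
   },
   "resources": {
      "host_name": "guarana",
      "service_version": "1.2.3"
   }
}

When Loki's | json parser flattens such a document, the nested keys become attributes_http_method, resources_host_name, and so on.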

mar4uk avatar Nov 09 '22 12:11 mar4uk

True, it happens here: https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/cc0c784a993a6d52ad1cb5e596fe0780ab1dd915/pkg/translator/loki/encode.go#L88-L97

This is on purpose, as we might have clashes between values under the same key in both the resource attributes and the record attributes (for example, a resource attribute host_name alongside a record attribute host_name). I'm not sure what would be the most appropriate behavior for this situation. Perhaps use the bare key name and keep the value of the most specific attribute (i.e., record instead of resource)? In any case, this would be a breaking change that would need to be advertised well in advance.

jpkrohling avatar Nov 22 '22 19:11 jpkrohling

@kago-dk I have read through your description and comments once again. The short answer is that you can't get rid of the attributes_ and resources_ prefixes in Loki when you search with the | json parser. This is how Loki works. There are more details about how the parser works in the JSON section of https://grafana.com/docs/loki/latest/logql/log_queries/:

Adding | json to your pipeline will extract all json properties as labels if the log line is a valid json document. Nested properties are flattened into label keys using the _ separator.

For example, the json parser will extract from the following document:

{
   ...
   "request": {
      "time": "6.032",
      "method": "GET"
   }
   ...
}

the following list of labels:

"request_time" => "6.032"
"request_method" => "GET"

Basically, the same thing is happening with log record attributes and resources:

{
   ...
   "attributes": {
      "error_code": "ABC",
      "http_status_code": 400
   }
   ...
}

"attributes_error_code" => "ABC"
"attributes_http_status_code" => 400

mar4uk avatar Dec 15 '22 11:12 mar4uk

I think it's very important to be able to add, change, and even remove labels on a Loki entry. When running queries against Prometheus (PromQL), Loki (LogQL), and soon Tempo (TraceQL), and you split the screen in Grafana Explore, you want to have matching labels.

(screenshot)

And even when you're not splitting the screen, having consistent labels to search by is peace of mind for the developer who needs to find a bug. It would be nice if we could control the labels that Loki is going to use to index each log entry.

This would be a blocker for me to adopt the OTel Collector as a log shipper instead of Promtail. I could take a look and raise a PR, but I'm not sure when.

carlosjgp avatar Jan 06 '23 09:01 carlosjgp

You can use the transform processor to change the attributes for a given record. I believe the Loki exporter should just export the data it received from the pipeline.
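
A minimal sketch of what that could look like (the pod attribute name, the k8s.pod.name source attribute, and the log_statements schema of recent transform processor versions are assumptions):

processors:
  transform:
    log_statements:
      - context: log
        statements:
          # copy the pod name from the resource into a short, consistent record attribute
          - set(attributes["pod"], resource.attributes["k8s.pod.name"])
          # hint for the Loki exporter: promote "pod" to a Loki label
          - set(attributes["loki.attribute.labels"], "pod")

With something like this, the log stream carries a pod label that can match the label used on the metrics side.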

About the specific problem of consistent labels among databases, I know that other folks at Grafana are looking into ways to make this better. I believe there might even be a datasource mapping feature right now, which allows you to rename attributes on the fly. Ping me on the Grafana Slack if you can't find a way to do it.

jpkrohling avatar Jan 25 '23 14:01 jpkrohling

Thanks for your response, Juraci.

Regardless of the dynamic mapping, I still think it's important to keep consistent labels across all the datasources, so that when a developer wants to filter by pod they don't have to know three different labels.

We are already using the mapping capability to align the Loki and Tempo (App -> OTEL -> Tempo) datasources, but with the imminent addition of TraceQL to Grafana Tempo I'm looking at how to align those labels too.

carlosjgp avatar Jan 27 '23 10:01 carlosjgp

Regardless of the dynamic mapping, I still think it's important to keep consistent labels across all the datasources, so that when a developer wants to filter by pod they don't have to know three different labels.

Feedback noted; this is a pain we have right now, and we want to solve it.

jpkrohling avatar Jan 31 '23 13:01 jpkrohling

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

  • exporter/loki: @gramidt @gouthamve @jpkrohling @kovrus @mar4uk

See Adding Labels via Comments if you do not have permissions to add labels yourself.

github-actions[bot] avatar Apr 03 '23 03:04 github-actions[bot]

@kovrus, @mar4uk, what's the current state of this one?

jpkrohling avatar Apr 04 '23 20:04 jpkrohling

Currently, no work is happening toward removing the prefixes.

I think it's very important to be able to add, change, and even remove labels on a Loki entry.

It is possible to do this with the attributes, resource, and transform processors. You can set a pod label for Loki in the collector config and use it to match metrics and logs; more details on this topic can be found in the lokiexporter README. Please let me know if this approach doesn't work for you.
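
For example, a minimal sketch using the resource processor and the exporter's label hints (the k8s.pod.name source attribute and the pod label name are assumptions):

processors:
  resource:
    attributes:
      # expose the Kubernetes pod name under a short, consistent key
      - action: upsert
        key: pod
        from_attribute: k8s.pod.name
      # hint for the Loki exporter: promote "pod" to a Loki label
      - action: insert
        key: loki.resource.labels
        value: pod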

regardless of the dynamic mapping I still think it's important to keep consistent labels across all the datasources so we a developer wants to filter by pod they don't have to know 3 different labels

There is a resource_to_telemetry_conversion flag in the prometheusremotewrite exporter that automatically converts all resource attributes into Prometheus labels. We are going to implement the same flag for the lokiexporter as well: https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/19215. I think this also solves the issue of label consistency between Loki and Mimir. But this flag should be used very carefully (you should control the number of resource attributes) to avoid high-cardinality issues.
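
For reference, this is roughly how the existing flag is configured on the Prometheus remote write exporter (the endpoint is a placeholder); the proposal is to add an equivalent option to the lokiexporter:

exporters:
  prometheusremotewrite:
    endpoint: http://mimir:9009/api/v1/push
    # copy every resource attribute onto the exported metrics as labels
    resource_to_telemetry_conversion:
      enabled: true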

mar4uk avatar Apr 12 '23 14:04 mar4uk

I agree. I think there's a reasonable workaround in a place that is more appropriate than the Loki exporter. I'm closing, but I'm still happy to hear counterarguments.

jpkrohling avatar Apr 12 '23 19:04 jpkrohling