Removing field from json in Promtail before submitting to Loki

Open hterik opened this issue 4 years ago • 10 comments

Given a JSON log line containing a timestamp, I am recording its timestamp using the timestamp stage, e.g.:

Input log line (from Grafana):

    {"duration":"109.51947ms","logger":"migrator","lvl":"info","msg":"migrations completed","performed":330,"skipped":0,"t":"2021-09-30T13:01:14.136635479Z"}

promtail-config:

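    pipeline_stages:
    # Extract the "t" field of the JSON line into the extracted map as "timestamp".
    - json:
        expressions:
          timestamp: t
    # Use the extracted value as the entry's timestamp.
    - timestamp:
        source: timestamp
        format: RFC3339Nano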

After this I want to remove the t field, and only the t field, nothing else. Since the timestamp is already recorded, this avoids duplication inside Loki. It will reduce clutter in the UI when browsing logs and should also save some storage and bandwidth.

Describe the solution you'd like

Roughly something like:

    pipeline_stages:
    - json:
        expressions:
          timestamp: t
          filtered_log: drop("t")        # <<< New imaginary function
    - timestamp:
        source: timestamp
        format: RFC3339Nano
    - output:
        source: filtered_log             # <<< Use the log line no longer containing the t-field

Describe alternatives you've considered

This has previously been discussed in https://github.com/grafana/loki/issues/1011, which was closed after LogQL filters were implemented to show only the message at query time in the UI. As described above, I think dropping redundant fields before ingesting them would be more useful.

JMESPath has an open request for addition of a delete-function: https://github.com/jmespath/jmespath.py/issues/121

I guess one could also filter it out using a regex in a replace stage, but for complicated fields such as datetimes this quickly becomes messy, and you also risk missing quoting differences or replacing sub-content inside the msg field.
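
For comparison, a minimal sketch of that regex approach. It assumes the t field is never the first key in the object and that its value contains no escaped quotes, which is exactly the kind of fragility described above. The replace stage substitutes the text matched by the capture group, and an empty replace value removes it.

    pipeline_stages:
    - json:
        expressions:
          timestamp: t
    - timestamp:
        source: timestamp
        format: RFC3339Nano
    # Assumption: "t" is preceded by a comma, i.e. it is not the first key.
    # The capture group covers the leading comma, the key and the quoted value.
    - replace:
        expression: '(,"t":"[^"]*")'
        replace: ''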

hterik avatar Oct 01 '21 06:10 hterik

This feature would be great, agree.

ruslantagirov avatar Nov 16 '21 18:11 ruslantagirov

I agree, the feature is very useful. Sometimes I need to delete half of the JSON fields because they are useless.

GadskyPapa avatar Nov 16 '21 18:11 GadskyPapa

Hi! This issue has been automatically marked as stale because it has not had any activity in the past 30 days.

We use a stalebot among other tools to help manage the state of issues in this project. A stalebot can be very useful in closing issues in a number of cases; the most common is closing issues or PRs where the original reporter has not responded.

Stalebots are also emotionless and cruel and can close issues which are still very relevant.

If this issue is important to you, please add a comment to keep it open. More importantly, please add a thumbs-up to the original issue entry.

We regularly sort for closed issues which have a stale label sorted by thumbs up.

We may also:

  • Mark issues as revivable if we think it's a valid issue but isn't something we are likely to prioritize in the future (the issue will still remain closed).
  • Add a keepalive label to silence the stalebot if the issue is very common/popular/important.

We are doing our best to respond, organize, and prioritize all issues but it can be a challenging task, our sincere apologies if you find yourself at the mercy of the stalebot.

stale[bot] avatar Jan 09 '22 13:01 stale[bot]

Please unstale.

hterik avatar Jan 10 '22 07:01 hterik

So I found a way to do that with templates:

    pipeline_stages:
      - json:
          expressions:
            level:
            ts:
      - labels:
          level:
      - template:
          source: message
          template: '{{ omit (mustFromJson .Entry) "ts" "level" | mustToJson }}'
      - output:
          source: message

This effectively creates a new message extraction, which is a JSON representation of the entry without the ts and level keys.
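
Applied to the log line from the original post, the same idea might look like the sketch below. The filtered_log key is borrowed from the feature request and is just an arbitrary name in the extracted map. omit, mustFromJson and mustToJson are Sprig functions. Because the line is decoded and re-encoded, keys may come out in a different order, so the output is not guaranteed to be byte-identical to the input minus the field.

    pipeline_stages:
    - json:
        expressions:
          timestamp: t
    - timestamp:
        source: timestamp
        format: RFC3339Nano
    # Rebuild the line as JSON without the "t" key and store it as filtered_log.
    - template:
        source: filtered_log
        template: '{{ omit (mustFromJson .Entry) "t" | mustToJson }}'
    # Replace the log line with the filtered version.
    - output:
        source: filtered_log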

ngotchac avatar Mar 18 '22 12:03 ngotchac

@ngotchac I'd love to get this working, but have been unable to get better than '[inspect: template stage]: none' and completely unmodified output from the above. I've tried with 2.4.2 and 2.6.1 just in case something changed since you wrote.

Any pointers to other examples or documentation on what you're using in the template? I have found nothing on omit, or must[To|From]Json.

[Edit] As usual moments after I break down and post, I make some reasonable progress. Had to change .Entry to .Value. That aside, any thoughts on injecting new k:v pairs into the produced json?

alaricljs avatar Aug 01 '22 03:08 alaricljs

So the template functions have apparently been available since Loki 2.3, cf. https://grafana.com/docs/loki/latest/clients/promtail/stages/template/#supported-functions, which links to https://masterminds.github.io/sprig/ where you'll find them. Regarding Entry vs. Value, from the docs:

  • If source is available it can be referred as .Value in template
  • A special key named Entry can be used to reference the current line.

So I guess it depends on the usage.
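
To illustrate the difference, a sketch with two alternative template stages (use one or the other). The key names message and json_str are taken from the snippets in this thread. Which variant applies depends on whether the JSON is the whole line or an extracted field.

    # Variant A: the whole log line is JSON, so read it via .Entry
    # and write the result to a new key in the extracted map.
    - template:
        source: message
        template: '{{ omit (mustFromJson .Entry) "ts" | mustToJson }}'
    - output:
        source: message

    # Variant B: the JSON was extracted into json_str by an earlier stage,
    # so read it via .Value; the result overwrites json_str.
    - template:
        source: json_str
        template: '{{ omit (mustFromJson .Value) "ts" | mustToJson }}'
    - output:
        source: json_str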

I'm not sure I get your question about injecting new k:v pairs. This snippet is to remove some k:v pairs

ngotchac avatar Aug 01 '22 08:08 ngotchac

It's the flip side of the same coin and something I need to do in order to fix disparities in the log output I'm stuck dealing with. It's mixed format where some very pertinent data is outside the JSON and I'd rather inject it than make labels from it. Thought perhaps you might have experience there to share. I'll dig through the docs that you linked (appreciated!) and see what I can find.

With a moment to spare I might even get the less useful docs here updated to reflect the same as what you shared.

alaricljs avatar Aug 01 '22 12:08 alaricljs

I see! So I guess you can make it work with a mix of pick, merge and set?

ngotchac avatar Aug 01 '22 13:08 ngotchac

Yes indeed, thank you! I just used set successfully and can get back to making things work.

For anyone that wants the shortcut version:

  • json_str is the section of my log that is pure JSON, accessed via .Value
  • $foo is a throwaway variable, as the template sets json_str to the resulting value of the template output
  • type gets removed from the original JSON
  • direction is a label from the original log string, accessed via .direction

          - template:
              source: json_str
              template: '{{ $foo := omit (mustFromJson .Value) "type" }}{{ $foo := set $foo "direction" .direction }}{{ $foo | mustToJson }}'
          - labeldrop:
            - direction

alaricljs avatar Aug 01 '22 13:08 alaricljs

Sounds like there's an acceptable workaround :+1:

In addition, while we haven't made a formal decision yet, we expect in the near future that all new feature work will be done in the Agent's log collection pipelines rather than in Promtail. If a feature like this is still relevant please open a PR there.

cstyan avatar Nov 08 '23 01:11 cstyan

OT: @cstyan can you please share a link about what this "Agent" is? I was under the impression Promtail was considered an Agent :)

hterik avatar Nov 09 '23 11:11 hterik

@hterik here's the link, I meant to include that in my original comment, my apologies :)

cstyan avatar Nov 16 '23 00:11 cstyan

Wait a second... does this mean that all future work on Promtail is suspended? Will future development happen in Grafana Agent? Or will it be fragmented? Is there any blog post or roadmap where one can get some more background about this change? Migration guide? Compatibility matrix? Looking at the Agent repo and docs now, it seems like a completely different project.

Sorry for derailing the conversation, don't expect a full explanation here, just some links would be good.

hterik avatar Nov 20 '23 07:11 hterik

> Wait a second... does this mean that all future work on Promtail is suspended? Will future development happen in Grafana Agent?

Like I said, we haven't made a formal decision yet, but this is the most likely scenario in terms of new feature work. However, Promtail would still get bug fixes and security patches. The Loki team is focused on making Loki itself a better piece of software, and internally we have teams building both the Grafana Agent and other integrations. They're dedicated to those things, and can provide better support and new features than we can for Promtail.

> Is there any blog post or roadmap where one can get some more background about this change? Migration guide? Compatibility matrix? Looking at the Agent repo and docs now, it seems like a completely different project.

Nothing yet, but as the Agent's log collection code is just a fork of Promtail, the migration and compatibility guide should be "they're the same thing, just in different binaries". See here: using your current Promtail config in a Grafana Agent config file should just work. That's the goal I have at the moment, to ensure that there are no missing features or bug fixes in the Agent since their fork of Promtail.
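
As a rough sketch of what that could look like, assuming the Agent's static-mode YAML layout, where Promtail-style scrape_configs sit under logs.configs. The instance name, file paths, job name and Loki URL are placeholders, not values from this thread.

    logs:
      configs:
      - name: default                      # placeholder instance name
        positions:
          filename: /tmp/positions.yaml    # placeholder path
        clients:
        - url: http://localhost:3100/loki/api/v1/push   # placeholder URL
        scrape_configs:
        - job_name: app                    # placeholder job
          static_configs:
          - targets: [localhost]
            labels:
              __path__: /var/log/app/*.log # placeholder path
          # Same stages as in a Promtail config file.
          pipeline_stages:
          - json:
              expressions:
                timestamp: t
          - timestamp:
              source: timestamp
              format: RFC3339Nano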

cstyan avatar Nov 20 '23 17:11 cstyan