json_exporter icon indicating copy to clipboard operation
json_exporter copied to clipboard

Support Value Conversions

Open ngrebels opened this issue 2 years ago • 4 comments

These changes allow users to define value mappings within the config file. By defining these mappings, values that are strings can be converted into a number.

Included:

  • tests
  • support for 'valueconverter' key in config files

ngrebels avatar Aug 10 '22 14:08 ngrebels

This feels like a happy middle ground for issues with strings/nulls. Fingers crossed it looks good! 😄

liam-hogan avatar Aug 11 '22 20:08 liam-hogan

Can a maintainer approve this pull request? We have a deadline here. It would be best if we can get an approval by Wednesday?

yaohongkok avatar Aug 15 '22 16:08 yaohongkok

@yaohongkok I am not sure if we should hurry a change in case there's an unseen breakage (but I don't have any say other than being an end-user). All that said, @SuperQ what is the likelihood of this PR being merged?

liam-hogan avatar Aug 16 '22 12:08 liam-hogan

@liam-hogan Thanks for the reply. I understand the concern of introducing breaking changes. We would still appreciate it if we can get a code review soon. If it is not possible, then we will just wait like everyone else.

We have tested it internally and things seems to work fine. I think the tests are passing too.

yaohongkok avatar Aug 16 '22 13:08 yaohongkok

Hey @rustycl0ck @SuperQ

What is the likelihood for this PR as well as #167 getting merged into master?

liam-hogan avatar Aug 25 '22 12:08 liam-hogan

Looks like there are some naming convention issue. I will go update the forked repo next week to conform to the linter.

yaohongkok avatar Aug 25 '22 20:08 yaohongkok

I have fixed the linting issue.

Hopefully, the reviewers can take a look at it.

Also, the merge conflict basically involves merging both code together. Some manually typing would be involved. Unfortunately, I have no write access so I cannot perform the merge.

yaohongkok avatar Aug 29 '22 10:08 yaohongkok

You will need to perform the merge locally on your fork and force-push your branch. I recommend syncing your master branch and doing git rebase -i origin/master.

SuperQ avatar Aug 29 '22 12:08 SuperQ

Oh, I see, @ngrebels needs to fix up the branch.

SuperQ avatar Aug 29 '22 12:08 SuperQ

You will need to perform the merge locally on your fork and force-push your branch. I recommend syncing your master branch and doing git rebase -i origin/master.

Oh, you are right. Let me give this a try to see if it will work.

yaohongkok avatar Aug 29 '22 13:08 yaohongkok

I have made all the fix. Hopefully @rustycl0ck and @sysadmind can perform the code review?

yaohongkok avatar Aug 30 '22 06:08 yaohongkok

@rustycl0ck Can you review it when you are free?

yaohongkok avatar Sep 06 '22 12:09 yaohongkok

Hi @rustycl0ck were you able to review this change?

liam-hogan avatar Sep 19 '22 18:09 liam-hogan

@yaohongkok and @sysadmind : Not sure if there's anything wrong with the waiting approval? What can be done to get this moved forward?

liam-hogan avatar Sep 26 '22 21:09 liam-hogan

@liam-hogan I think my understanding is @rustycl0ck need to approve this before it can be merged. Maybe I am wrong?

I have check with ngrebls but it seems like he cannot finish of the pull request too.

Can @sysadmind force pull this pull request?

yaohongkok avatar Sep 26 '22 21:09 yaohongkok

I believe that @SuperQ is the one to merge. I don't have access on this repo.

Ping. :)

sysadmind avatar Oct 04 '22 00:10 sysadmind

I will try to ping @SuperQ.

yaohongkok avatar Oct 05 '22 16:10 yaohongkok

@SuperQ please hold on, don't release this yet. Sorry for being late but I have some questions regarding this PR. Also, it seems to be buggy if the sample config uses path: '{.values[*]}' instead of path: '{.values[0,1]}'

rustycl0ck avatar Oct 06 '22 00:10 rustycl0ck

Sorry, I overlooked this PR. I will set up a periodic schedule for me to go through issues and PRs for this repo in future.

However, for the people involved with the use case for this PR (@ngrebels, @liam-hogan, @yaohongkok): Can you please clarify what use case does this PR solve? Is just setting a static value not sufficient for extracting enum type of values? (instead of converting those enum values to numbers, just set some or all values to the desired numbers you want).

Example

Example config:

modules:
  default:
    metrics:
    - name: example_state
      type: object
      help: Example of counting different states
      path: '{.values[*]}'
      labels:
        id: '{.id}'
        state: '{.state}'
      values:
        activity: 18      # static value

    - name: example_convert_original
      type: object
      help: Example of assigning a static value to INACTIVE state
      path: '{.values[?(@.state == "INACTIVE")]}'
      labels:
        id: '{.id}'
        state: '{.state}'
      values:
        state: 17      # static value

    - name: example_convert_original
      type: object
      help: Example of assigning a static value to INACTIVE state
      path: '{.values[?(@.state == "ACTIVE")]}'
      labels:
        id: '{.id}'
        state: '{.state}'
      values:
        state: 19      # static value

    - name: example_convert
      type: object
      path: '{.values[0,1]}'
      labels:
        state: '{.state}'
      values:
        state: '{.state}'
      valueconverter:
        '{.state}': #convert value 'state' in JSON into a number
          active: 29
          inactive: 23

Example data:

{
    "counter": 1234,
    "timestamp": 1657568506,
    "values": [
        {
            "id": "id-A",
            "count": 1,
            "some_boolean": true,
            "state": "ACTIVE"
        },
        {
            "id": "id-B",
            "count": 2,
            "some_boolean": true,
            "state": "INACTIVE"
        },
        {
            "id": "id-C",
            "count": 3,
            "some_boolean": false,
            "state": "ACTIVE"
        }
    ],
    "location": "mars"
}

Response:

$ curl "http://localhost:7979/probe?module=default&target=http://localhost:8000/examples/data.json"
# HELP example_convert_original_state Example of assigning a static value to INACTIVE state
# TYPE example_convert_original_state untyped
example_convert_original_state{id="id-A",state="ACTIVE"} 19
example_convert_original_state{id="id-B",state="INACTIVE"} 17
example_convert_original_state{id="id-C",state="ACTIVE"} 19
# HELP example_convert_state example_convert
# TYPE example_convert_state untyped
example_convert_state{state="ACTIVE"} 29
example_convert_state{state="INACTIVE"} 23
# HELP example_state_activity Example of counting different states
# TYPE example_state_activity untyped
example_state_activity{id="id-A",state="ACTIVE"} 18
example_state_activity{id="id-B",state="INACTIVE"} 18
example_state_activity{id="id-C",state="ACTIVE"} 18

Example metrics usage: Screen Shot 2022-10-06 at 10 14 31

Reproducing your use case without this PR: From this example, you can see that I am able to receive the same result you expect (a static number value for when state="ACTIVE" and state="INACTIVE") in the metric example_convert_original_state (which has identical behavior to example_convert_state). But as you can see, I did not use the new feature added in this PR for example_convert_original_state, which made me question if we really need this PR or not.

An alternative to your use case: Also as you can see from the screenshot of the metric usage example, you can just use the prometheus count by and sum by functions in your query to get the information about how many metrics of what kind of state were collected. Instead of assigning a custom random number, just assigning 1 (in place of 18 in the example) would also be useful if you don't really want to differentiate between the sum by and count by values. This is a better approach in my opinion because I don't see much utility in assigning different random values to different kinds of states (29 for ACTIVE and 23 for INACTIVE in above example). I say this because the these values are static so doesn't make much sense to keep them different. At the time of consumption, you can always filter by the label value of state instead of by value of the metric.

Minor problem with the implementation of this PR: In the example config added in this PR, I can see path: '{.values[0,1]}'. This means that the user already knows that the first and second metrics in that array will always have state=ACTIVE and state=INACTIVE (in any order). But they will never be the same, because then both the prometheus metrics become identical, both example_convert_state{state="ACTIVE"} 1 (or both example_convert_state{state="INACTIVE"} 2), which is not allowed by the exporter and it throws the error:

An error has occurred while serving metrics:

collected metric "example_convert_state" { label:<name:"state" value:"ACTIVE" > untyped:<value:1 > } was collected before with the same name and label values

This means that the example config added in this PR, is equivalent to adding following two static metrics in each scrape (irrespective of any conditions, the following two metrics will always be scraped or else there will be an error):

example_convert_state{state="ACTIVE"} 1
example_convert_state{state="INACTIVE"} 2

This makes these metrics useless in my opinion because they are "always" present and with the exact same values. So they contain no information at all. But still if you want this behavior, it can be achieved by adding static metric values in the config file already before this PR was merged.

Again, sorry for the delay in reviewing. I hope I will be able to better manage my schedule in future.

rustycl0ck avatar Oct 06 '22 01:10 rustycl0ck

@rustycl0ck Ok. I think the problem here was the documentation was not completely clear. Until you showed me the example, I thought that this was not possible at all.

I will give your examples a try.

yaohongkok avatar Oct 06 '22 15:10 yaohongkok

@rustycl0ck To explain my use case, it has been discussed at #76 .Specifically, I have values that are returning as "". The json will also sometimes have a metric value. So when they are "", it would be beneficial to do something about them, as the errors can become cumbersome. When running this service, the errors can quickly fill up and kill whatever system you are running it on unless you are careful since it basically spits out the whole module as an error.

My json format is described in #168

{
    "title": "JSON data",
    "timestart": "07/21/2022 13:53:20",
    "timeend": "07/21/2022 13:55:00",
    "table": [
        {array: value,
          of: value,
          items: value},
        {array: value,
          of: value,
          items: value},
        {array: value,
          of: value,
          items: value}]
}

Really what I am targeting is the values in .table[*]. I will have situations where both of: "" and of: "3.5" (as well as other metric values will happen.

Additionally, I have tried your method where '{.table[?(@.of == "")]}'. This works, but where this breaks down is that any of the values under .table[*] can have a value of "". Is there a proper way to cleanly handle this? Because right now it looks that I will have to make an individual module for each element under .table[*] which as of right now is 34 elements.

I have also tried just ignoring null values and letting Grafana deal with it, but I run into the same issue.

liam-hogan avatar Oct 06 '22 21:10 liam-hogan

@rustycl0ck so, I was able to resolve this. Long story short, yes: each individual metric requires a separate name block.

So, I was able to make sure there were no errors by filtering out the null values on individual metrics this way. This means each name section basically looks like so:

modules 
  module_name:
    metrics:
    - name: Metric#
      type: object
      help: Metric# test
      path: '{.table[?(@.${Metric#_Name} != "")]}'
      labels:
        Metric_ID: '{.${Metric_ID}}'
      values:
        Metric#: '{.${Metric#_Name}}'

liam-hogan avatar Oct 11 '22 13:10 liam-hogan

After looking more closely at @rustycl0ck suggestion and comparing it with our previous pull request, I do see more benefits of adopting our approach to map values. The key benefit is that the value mapping process is more succinct and less code duplication.

Let's say I have a field call 'status' and it has 4 possible values. These means that I need 4 entries in the config.yml file. Notices that 'labels' field is being duplicated 4 times. Some apps may need different labels but I feel that there should be an option to avoid repeating labels frequently.

If no one objects, can I create an issue for this?

yaohongkok avatar Oct 20 '22 20:10 yaohongkok