promitor icon indicating copy to clipboard operation
promitor copied to clipboard

feat: Provide capability to use multiple metric dimensions

Open aaronweissler opened this issue 2 years ago • 5 comments

Fixes #1820

Will add a few comments and open questions to lines.

aaronweissler avatar Sep 12 '22 13:09 aaronweissler

CLA assistant check
All committers have signed the CLA.

CLAassistant avatar Sep 12 '22 13:09 CLAassistant

  • Add a sample configuration to the metric declaration to verify that this works at runtime please?

Is the change in https://github.com/tomkerkhove/promitor/pull/2123/commits/4a8740da5f3ba031be6361fdfd8b6a656318dd2c all that is needed for that or do I have to add something else, too?

aaronweissler avatar Sep 16 '22 11:09 aaronweissler

Do you have an idea what could have broken those integration tests? Since it is all related to ResourceDiscovery, which shouldn't be affected by any changes herein, or is it?

aaronweissler avatar Sep 16 '22 11:09 aaronweissler

  • Add a sample configuration to the metric declaration to verify that this works at runtime please?

Is the change in 4a8740d all that is needed for that or do I have to add something else, too?

Nope, should be fine!

Do you have an idea what could have broken those integration tests? Since it is all related to ResourceDiscovery, which shouldn't be affected by any changes herein, or is it?

It's not you, it's me. I did some cleaning up in my Azure subscription and have aligned the expectations (straight to master branch by accident, it's Friday anyway 🤷 )

tomkerkhove avatar Sep 16 '22 12:09 tomkerkhove

Sorry for the poke but I'm just checking expectations - I think you're done with the first round of feedback and waiting for another review, is that correct?

Sorry for being slow here!

tomkerkhove avatar Sep 25 '22 11:09 tomkerkhove

Sorry for being slow here!

No worries at all :) Both in general, but also I'm kinda busy right now, too, anyways. Apart from the Where to Distinct issue and the code style warnings from the tests, I think I finished all open to-dos I had noted down. I might only get to fixing those two in about two weeks though, because of another deadline of mine coming up.

aaronweissler avatar Oct 03 '22 12:10 aaronweissler

Let me know when we are good for another review.

tomkerkhove avatar Oct 24 '22 06:10 tomkerkhove

Sorry for the slow response here, I was out for a while on vacation

tomkerkhove avatar Nov 03 '22 08:11 tomkerkhove

Looks like scraping works:

2022-11-14T12:20:55.5545952Z [11:45:30 INF] Scraping promitor_demo_application_insights_availability_per_name_and_location for resource type ApplicationInsights. [...] 2022-11-14T12:20:55.5564644Z [11:45:32 INF] Found value 100 for metric promitor_demo_application_insights_availability_per_name_and_location with dimension values ["Availability Promitor", "West Europe"] as part of the dimensions ["availabilityResult/name", "availabilityResult/location"] with aggregation interval 00:05:00

Did you manage to run them locally?

tomkerkhove avatar Nov 14 '22 12:11 tomkerkhove

Looks like scraping works:

Yes, with the new metrics it does! :) Writing them to Prometheus too. They can be seen in the Show Prometheus metrics for Scraper agent logs and the dimensions integration test now works as well.

Did you manage to run them locally?

The integration tests, no. Unfortunately, I don't have access to any Azure account right now, so I also don't have any PROMITOR_AUTH_APPID to provide as described in the contributing guide.

Now the last error is with the OpenTelemetry sink, which I assume is also causing the still really long runtime of the integration tests with its retries, but I guess I'd only find that out after fixing it. (Edit: just looked into the retry-line again, the long execution time should be from retrying, since the backoff adds up to just about the 34 minutes the tests ran for)

I don't know why the test breaks though, because as far as I understand the code, it should still write the metric and its values to the OpenTelemetry collector, just without labels containing the values for the dimensions. But instead, the metric using dimensions apparently doesn't exist in the OpenTelemetry Collector at all as can be seen in these logs. Do you have an idea, what could be breaking the writing of values in the OpenTelemetry sink?

Adding to this, the OpenTelemetrySink is not adding any label with the dimension value even for the previously existing single-dimension case, is it? Or am I missing something? If so, is that intentional and I should not add labels with the values for multiple dimensions either, or should I add that functionality like in the Prometheus sink's DetermineLabels function?

aaronweissler avatar Nov 14 '22 14:11 aaronweissler

I'd need to check but AFAIK the labels for dimensions were added; otherwise the reported metric would not be valid :/

tomkerkhove avatar Nov 14 '22 14:11 tomkerkhove

@aaronweissler Probably you would like some help to troubleshoot this I presume?

tomkerkhove avatar Nov 22 '22 11:11 tomkerkhove

Probably you would like some help to troubleshoot this I presume?

Yes, I just don't see the reason why only the newly added metric is not added to OpenTelemetry.

Also still wondering about this:

Adding to this, the OpenTelemetrySink is not adding any label with the dimension value even for the previously existing single-dimension case, is it? Or am I missing something?

If this already happens somewhere, could you point me to where, so I can add the labels for multiple dimensions as well. If it is does not currently, I could add it similarly to how it works in the Prometheus sink?

aaronweissler avatar Nov 27 '22 17:11 aaronweissler

I see it gets scrapped correctly:

[19:56:16 INF] Found value 100 for metric promitor_demo_application_insights_availability_per_name_and_location with dimension values ["Availability Promitor Docs", "Southeast Asia"] as part of the dimensions ["availabilityResult/name", "availabilityResult/location"] with aggregation interval 00:05:00

But it's not being reported as a metric outcome so I'll have a closer look tomorrow.

tomkerkhove avatar Nov 28 '22 20:11 tomkerkhove

Adding to this, the OpenTelemetrySink is not adding any label with the dimension value even for the previously existing single-dimension case, is it? Or am I missing something?

If this already happens somewhere, could you point me to where, so I can add the labels for multiple dimensions as well. If it is does not currently, I could add it similarly to how it works in the Prometheus sink?

This seems to be the root-cause of the problem for which I'll open a PR seperately first for #2180

tomkerkhove avatar Nov 29 '22 09:11 tomkerkhove

Any news on this PR ? Unfortunately I cannot contribute at this point due to my coding skills, but I'lll be happy to test on 1k+ metrics and 5k+ dimensions

romainmeunier avatar Mar 01 '23 00:03 romainmeunier

Thanks for that!

This is just depending on the other PR that is open to fix dimensions in other sinks which should close soon.

tomkerkhove avatar Mar 01 '23 11:03 tomkerkhove

You have successfully added a new Trivy configuration .github/workflows/ci-container.yml:scan. As part of the setup process, we have scanned this repository and found no existing alerts. In the future, you will see all code scanning alerts on the repository Security tab.

I have been stalling the release for too long with my other PR so will add this to 2.10

tomkerkhove avatar Mar 23 '23 18:03 tomkerkhove

I have been stalling the release for too long with my other PR so will add this to 2.10

yes please

thiDucTran avatar Jun 16 '23 18:06 thiDucTran

Ok so I was finally able to fix https://github.com/tomkerkhove/promitor/issues/2180. @aaronweissler Are you still up for this contribution and merge conflicts that I caused or do you prefer if somebody else takes over?

tomkerkhove avatar Jul 17 '23 08:07 tomkerkhove

I will be happy to help with that if needed, looking forward for this feature to go in 😄

amirschw avatar Jul 18 '23 09:07 amirschw

I would like to do it myself, but I'm not sure when I could get to it, so @amirschw I wouldn't mind your help taking over 🙂

aaronweissler avatar Jul 18 '23 17:07 aaronweissler

Thanks @aaronweissler! I sent a pull request for merging master to your feature branch, it would be great if you could review it and continue with this PR: https://github.com/Patrick-Eichhorn/promitor/pull/1

LMK If you don't think you'll have time for that, and I'll create a new one straight to tomkerkhove/promitor.

amirschw avatar Jul 18 '23 18:07 amirschw

@aaronweissler @amirschw Let me merge this to a branch on tomkerkhove/promitor so we don't need to disturb Aaron too much.

Thanks for all the hard work Aaron!

tomkerkhove avatar Jul 19 '23 06:07 tomkerkhove

Ok looks like we need a merge first anyway :(

tomkerkhove avatar Jul 19 '23 06:07 tomkerkhove

hmmm.... should I send a new PR then to tomkerkhove/promitor?

amirschw avatar Jul 19 '23 08:07 amirschw

That or do a fix on the current branch to resolve the conflicts

tomkerkhove avatar Jul 19 '23 09:07 tomkerkhove

Done through https://github.com/tomkerkhove/promitor/pull/2351 based on @aaronweissler's hard work - Thanks a ton!

tomkerkhove avatar Jul 29 '23 18:07 tomkerkhove