krr icon indicating copy to clipboard operation
krr copied to clipboard

Add support for DataDog metrics instead of Prometheus

Open andrew-kolesnikov opened this issue 2 years ago • 18 comments

Our organization no longer uses Prometheus, so I am very curious what would it take to integrate with DataDog as a metrics source, either as what they call an "integration" or just using plain old DataDog API.

andrew-kolesnikov avatar Apr 20 '23 16:04 andrew-kolesnikov

It shouldn't be too hard. Is this something you'd be interested in taking a shot at yourself?

aantn avatar Apr 20 '23 16:04 aantn

Would love to contribute this from us, but it's looking like no one on my team will have any capacity for it in the next few months

andrew-kolesnikov avatar Apr 20 '23 17:04 andrew-kolesnikov

@aantn Maybe we can add DataDog metrics together with Thanos integration? (#18) Should not be that hard, WDYT?

LeaveMyYard avatar Apr 24 '23 07:04 LeaveMyYard

I have created something like this in the past, I think I should be able to add DataDog support to this project. Let me know if I could get this assigned so I can commit some of my time to contribute.

francRang avatar Jun 20 '24 05:06 francRang

100% yes!

Can you do it on top of the prometheus-workload-loader branch? There are some fairly major changes in that branch to the internal architecture, and we plan to merge it soon. So best to base your work off of that.

To the degree that you will need to change internal interfaces/abstractions to support this, we are open to it! We initially only had Prometheus in mind, so there might be some changes needed here, which is completely fine.

aantn avatar Jun 20 '24 06:06 aantn

100% yes!

Can you do it on top of the prometheus-workload-loader branch? There are some fairly major changes in that branch to the internal architecture, and we plan to merge it soon. So best to base your work off of that.

To the degree that you will need to change internal interfaces/abstractions to support this, we are open to it! We initially only had Prometheus in mind, so there might be some changes needed here, which is completely fine.

Sounds good, appreciate the fast response.

francRang avatar Jun 20 '24 14:06 francRang

Of course, can't wait to see what you do.

On Thu, Jun 20, 2024, 17:26 Francisco Rangel @.***> wrote:

100% yes!

Can you do it on top of the prometheus-workload-loader branch? There are some fairly major changes in that branch to the internal architecture, and we plan to merge it soon. So best to base your work off of that.

To the degree that you will need to change internal interfaces/abstractions to support this, we are open to it! We initially only had Prometheus in mind, so there might be some changes needed here, which is completely fine.

Sounds good, appreciate the fast response.

— Reply to this email directly, view it on GitHub https://github.com/robusta-dev/krr/issues/9#issuecomment-2180849642, or unsubscribe https://github.com/notifications/unsubscribe-auth/AADYUBZR6IAHOIJ7677GQ5LZILRCXAVCNFSM6AAAAABJTHHZTWVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCOBQHA2DSNRUGI . You are receiving this because you were mentioned.Message ID: @.***>

aantn avatar Jun 20 '24 14:06 aantn

I've started dedicating some time to PR. Will push commits to a branch that I will request to be reviewed so I can merge to prometheus-workload-loader and then that branch can eventually be merged into main.

The OP mentioned DataDog Integrations, from what I've read from the documentation, the so called integrations would be a bit overkill since they are meant to pull/push data, and we only want to pull (link to api integrations states the same thing).

So the approach I am taking will be hitting the DataDog API via the existing DataDog Python client.

As far as integrating this to krr, I think I will create a boolean flag --datadog that will basically route all of the existing parameters to be used with DataDog and not Prometheus, given that we'll only have 2 options and not > 2 for the metrics source (only the flags/options that make sense, e.g: if you set a prometheus URL that will be ignored, or maybe the CLI call will be denied so you only set things that make sense, e.g: --cpu-min), so putting all of the logic under core/integrations/datadog makes sense.

Should be simpler than the Prometheus implementation since DataDog is centralized, the only thing will be making sure the current cluster you're in has metrics in DataDog you're querying.

Anyways, just wanted to put my approach out there to keep y'all in sync. Let me know if it makes sense. Any feedback would be appreciated.

francRang avatar Jun 21 '24 05:06 francRang

Yes, makes total sense. There's a question regarding the interaction with the new --mode flag that controls how resources are discovered. If you set --datadog are resources is the cluster (e.g. list of deployments) still discovered via kubectl or is it discovered via DataDog?

The latter is probably better in which case setting --datadog probably ignored the mode flag or automatically enables a DataDog discovery mode. But if it is simpler to implement, you can leave discovery to kubectl and only worry about getting the metrics for recommendations from DD.

aantn avatar Jun 21 '24 06:06 aantn

Thinking getting the list of resources via kubectl then just querying the metrics might be easier, that way I don't have to write queries that specifically look for k8s resources, but just get the CPU/Mem values/datapoints.

francRang avatar Jun 21 '24 14:06 francRang

That works, sounds good.

On Fri, Jun 21, 2024, 17:58 Francisco Rangel @.***> wrote:

Thinking getting the list of resources via kubectl then just querying the metrics might be easier, that way I don't have to write queries that specifically look for k8s resources, but just get the CPU/Mem values/datapoints.

— Reply to this email directly, view it on GitHub https://github.com/robusta-dev/krr/issues/9#issuecomment-2182915647, or unsubscribe https://github.com/notifications/unsubscribe-auth/AADYUBYWJPR63R3TIEBCT6TZIQ5PVAVCNFSM6AAAAABJTHHZTWVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCOBSHEYTKNRUG4 . You are receiving this because you were mentioned.Message ID: @.***>

aantn avatar Jun 21 '24 15:06 aantn

WIP: https://github.com/francRang/krr/tree/add-datadog-support-as-metric-source 🐶

  • [x] Implement Auth to Datadog
  • [x] Check for credentials (if needed)
    • [x] API
    • [x] APP
    • [x] URL
  • [x] Test connection is valid
  • [x] Add —datadog boolean flag
  • [ ] Pass all parameters to —datadog flag
  • [ ] Evaluate whether or not given flag/value is valid for —datadog
  • [ ] Integrate queries with given resources from kubectl
  • [ ] Integrate any existing logic with --datadog flag

francRang avatar Jun 22 '24 07:06 francRang

I love this feature.

changhyuni avatar Jun 25 '24 10:06 changhyuni

Haven't had much time to work on this as of this month. Things should clear off soon. Will try to commit some time to it this week. Thanks y'all.

francRang avatar Jul 25 '24 04:07 francRang

No problem, thank you for the update!

On Thu, Jul 25, 2024, 07:10 Francisco Rangel @.***> wrote:

Haven't had much time to work on this as of this month. Things should clear off soon. Will try to commit some time to it this week. Thanks y'all.

— Reply to this email directly, view it on GitHub https://github.com/robusta-dev/krr/issues/9#issuecomment-2249340338, or unsubscribe https://github.com/notifications/unsubscribe-auth/AADYUBY4NUVOKFMNZNBC3RTZOB3DJAVCNFSM6AAAAABJTHHZTWVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDENBZGM2DAMZTHA . You are receiving this because you were mentioned.Message ID: @.***>

aantn avatar Jul 25 '24 04:07 aantn

Hey @francRang, any updates on this? Would love to be able to use KRR with Datadog.

aaroncommify avatar Oct 17 '24 14:10 aaroncommify

@aaroncommify I feel like I overestimated the complexity of this PR. Maybe it is my lack of experience with Kubernetes components written in Python (I've used Go mainly in the past). Feel free to put it back for grabs.

francRang avatar Oct 23 '24 17:10 francRang