Brainstorming: integrate Prometheus /metrics handler in Falco

Open Dentrax opened this issue 3 years ago • 17 comments

Motivation

Although falco-exporter is a great tool for exposing Prometheus metrics through gRPC, I think there are some caveats to using it. Our motivations (w/ @developer-guy @f9n) for creating this issue are the following:

  • To reduce unnecessary gRPC communication: we have to enable the gRPC output feature in the configuration. ^1
  • To avoid managing mTLS certs: to connect to gRPC over a Unix socket, we additionally have to manage CA certs, as described under connection options
  • To manage a single Helm chart: there are currently two Helm charts that we have to manage and maintain in our infrastructure, falco and falco-exporter
  • To reduce complexity: simplicity is always better, and seeing the metrics in Grafana requires additional work in the current design
  • To make local monitoring easier: we can instantly check the metrics using the /metrics endpoint in troubleshooting scenarios
$ kubectl port-forward svc/falco 8756
$ curl localhost:8756/metrics

I think that implementing Prometheus metrics in C++ would not be as easy as it looks. Here is an example metrics server that is currently used in Fluent Bit.
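
To make the scope of the C++ work more concrete, here is a minimal sketch of the rendering half of a /metrics handler: an in-process counter registry that serializes to the Prometheus text exposition format. Everything in it is an assumption for illustration only. `CounterRegistry` and the metric name `falco_events_total` are hypothetical and do not exist in Falco, and the sketch leaves out the HTTP server, labels, gauges and thread safety, which is probably where most of the real effort would go.

```cpp
#include <cstdint>
#include <map>
#include <sstream>
#include <string>

// Hypothetical registry: maps a metric name to a monotonically increasing count.
// Not thread-safe; a real implementation would need locking or atomics.
class CounterRegistry {
public:
    // Add `by` to the named counter, creating it at zero if it does not exist yet.
    void increment(const std::string& name, std::uint64_t by = 1) {
        counters_[name] += by;
    }

    // Serialize all counters in the Prometheus text exposition format:
    // a "# TYPE <name> counter" line followed by a "<name> <value>" sample line.
    std::string render() const {
        std::ostringstream out;
        for (const auto& entry : counters_) {
            out << "# TYPE " << entry.first << " counter\n";
            out << entry.first << ' ' << entry.second << '\n';
        }
        return out.str();
    }

private:
    std::map<std::string, std::uint64_t> counters_;
};
```

An HTTP handler for /metrics would then only need to return `render()` as the response body with a plain-text content type, so that Prometheus can scrape it on its own schedule.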

Feature

This feature was already clearly proposed in #421 and #530 and implemented in the projects.

Alternatives

  • https://github.com/falcosecurity/falco-exporter
  • https://github.com/falcosecurity/falcosidekick#prometheus

Additional context

Opening this issue does NOT mean that we should archive falco-exporter and provide these metrics only as a built-in feature of Falco. What I mostly want is to clarify and understand the design through the following questions:

  • How can we make this process simpler by removing unnecessary dependencies?
  • What is the correct way to do this?
  • Why did Falco not provide a /metrics endpoint in the first place?
  • What is the motivation behind this design of separating the metrics module into another project?

Waiting for your feedback!

Dentrax avatar Nov 03 '21 12:11 Dentrax

Kind ping here 🎗️

Dentrax avatar Jan 14 '22 12:01 Dentrax

We can take a look at Fluent Bit's cmt_gauge.h as a reference for a C++ implementation.

Dentrax avatar Feb 28 '22 07:02 Dentrax

FYI, Falcosidekick exposes Prometheus metrics exactly as you want; it can be managed with the same chart and does not require mTLS.

See: https://github.com/falcosecurity/falcosidekick#prometheus

Issif avatar Feb 28 '22 09:02 Issif

> FYI Falcosidekick exposes prom metrics exactly like you want, it can be managed with the same chart and it does not require mtls.
>
> See: falcosecurity/falcosidekick#prometheus

Thanks, just noticed that! AFAICS, it exposes 3 metrics: falco, inputs, outputs

This could be very useful in the cloud or on small clusters. But someone who actively uses a log aggregator on the cluster, Fluent Bit for example, might not want to install and maintain an additional app in the cluster. I mean, you'd have to make an additional HTTP request for each event, multiplied by the number of clusters. I am not sure how this fits into a pull-based system.

Dentrax avatar Feb 28 '22 18:02 Dentrax

Issues go stale after 90d of inactivity.

Mark the issue as fresh with /remove-lifecycle stale.

Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle stale

poiana avatar Jun 14 '22 12:06 poiana

/remove-lifecycle stale

Dentrax avatar Jun 15 '22 18:06 Dentrax

/lifecycle stale

poiana avatar Sep 13 '22 21:09 poiana

/remove-lifecycle stale

Dentrax avatar Sep 14 '22 06:09 Dentrax

/lifecycle stale

poiana avatar Dec 13 '22 09:12 poiana

/remove-lifecycle stale

Dentrax avatar Dec 21 '22 13:12 Dentrax

/lifecycle stale

poiana avatar Mar 21 '23 15:03 poiana

/remove-lifecycle stale

jasondellaluce avatar Mar 21 '23 17:03 jasondellaluce

/lifecycle stale

poiana avatar Jun 19 '23 19:06 poiana

/remove-lifecycle stale

/milestone 0.36.0

jasondellaluce avatar Jun 20 '23 12:06 jasondellaluce

/lifecycle stale

poiana avatar Nov 29 '23 15:11 poiana

/remove-lifecycle stale

Andreagit97 avatar Nov 30 '23 14:11 Andreagit97

cross-linking https://github.com/falcosecurity/libs/issues/1463

cc @incertum

/assign

leogr avatar Feb 19 '24 15:02 leogr