
[TRACKING] Expand existing `webserver` to serve Prometheus `/metrics` endpoint

Open incertum opened this issue 1 year ago • 8 comments

Expand existing webserver to serve Prometheus /metrics endpoint.

The initial Prometheus tracking ticket lives here: https://github.com/falcosecurity/cncf-green-review-testing/issues/12.


Work Items:

  • [x] libs metrics refactor supporting text-based Prometheus exposition format https://github.com/falcosecurity/libs/pull/1652 @incertum
  • [x] Sync libs referencing the new libs_metrics_collector, defer any other refactors (read below) https://github.com/falcosecurity/falco/pull/3129 @incertum
  • [ ] Expand existing webserver to serve Prometheus /metrics endpoint. TODO @sgaist
  • [ ] Given the expanded metrics scope in Falco, the current approach within stats_writer::collector::collect no longer holds up and should be refactored, also in anticipation of adding yet another metrics category for rules counters. TODO @sgaist
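
For illustration only, here is a minimal sketch of what "serving a `/metrics` endpoint" means at the HTTP level. It is written in Python with the standard library, not in Falco's C++ webserver code, and `render_metrics`, `respond`, and `make_server` are hypothetical names invented for this sketch. The sample metric text is a placeholder; a real implementation would obtain it from the metrics collector.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Content type used for the Prometheus text-based exposition format.
PROM_CONTENT_TYPE = "text/plain; version=0.0.4; charset=utf-8"


def render_metrics() -> str:
    """Hypothetical placeholder for the text produced by a metrics collector."""
    return (
        "# HELP testns_falco_duration_seconds_total https://falco.org/docs/metrics/\n"
        "# TYPE testns_falco_duration_seconds_total counter\n"
        'testns_falco_duration_seconds_total{raw_name="duration_sec"} 144\n'
    )


def respond(path: str):
    """Return (status, content_type, body) for a GET request on `path`."""
    if path == "/metrics":
        return 200, PROM_CONTENT_TYPE, render_metrics().encode("utf-8")
    return 404, "text/plain; charset=utf-8", b"not found\n"


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, content_type, body = respond(self.path)
        self.send_response(status)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Keep the sketch quiet; a real server would log requests.
        pass


def make_server(port: int = 0) -> HTTPServer:
    """Bind to localhost; port 0 lets the OS pick a free port."""
    return HTTPServer(("127.0.0.1", port), MetricsHandler)
```

Calling `make_server(port).serve_forever()` would start the listener. Keeping the routing in `respond` separates the endpoint logic from the transport, so it can be checked without opening a socket.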

incertum avatar Mar 07 '24 23:03 incertum

/assign @sgaist

As discussed hand-off to Samuel 🙏

incertum avatar Mar 07 '24 23:03 incertum

@incertum: GitHub didn't allow me to assign the following users: sgaist.

Note that only falcosecurity members with read permissions, repo collaborators and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can only have 10 assignees at the same time. For more information please see the contributor guide

In response to this:

/assign @sgaist

As discussed hand-off to Samuel 🙏

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

poiana avatar Mar 07 '24 23:03 poiana

@sgaist can you assign yourself? The bot doesn't allow me to do it. Thanks!

incertum avatar Mar 07 '24 23:03 incertum

/milestone 0.38.0

CC @leogr

incertum avatar Mar 07 '24 23:03 incertum

The libs unit tests should give a clear idea of the Prometheus metrics strings you should expect. Here are a few examples of metrics outside of libs metrics that would need to be created in Falco.

# HELP testns_falco_kernel_release_info https://falco.org/docs/metrics/
# TYPE testns_falco_kernel_release_info gauge
testns_falco_kernel_release_info{raw_name="kernel_release",kernel_release="6.6.7-200.fc39.x86_64"} 1

# HELP testns_falco_duration_seconds_total https://falco.org/docs/metrics/
# TYPE testns_falco_duration_seconds_total counter
testns_falco_duration_seconds_total{raw_name="duration_sec"} 144

# HELP testns_falco_evt_rate_seconds https://falco.org/docs/metrics/
# TYPE testns_falco_evt_rate_seconds gauge
testns_falco_evt_rate_seconds{raw_name="evt_rate_sec"} 126065.400000

# HELP testns_falco_host_boot_timestamp_nanoseconds https://falco.org/docs/metrics/
# TYPE testns_falco_host_boot_timestamp_nanoseconds gauge
testns_falco_host_boot_timestamp_nanoseconds{raw_name="host_boot_ts"} 1708753667000000000

They are also part of the libs unit tests.
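
As a rough sketch of how strings like the ones above are assembled, the following Python snippet formats a single sample in the Prometheus text-based exposition format. The function names are hypothetical and this is not the libs implementation; it only mirrors the `# HELP` / `# TYPE` / sample-line layout shown in the examples.

```python
def escape_label_value(value: str) -> str:
    # The exposition format requires escaping backslash, double quote,
    # and newline inside label values.
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def format_metric(name, metric_type, help_text, labels, value) -> str:
    """Render one metric with its HELP and TYPE lines (hypothetical helper)."""
    label_str = ",".join(
        f'{key}="{escape_label_value(str(val))}"' for key, val in labels.items()
    )
    return (
        f"# HELP {name} {help_text}\n"
        f"# TYPE {name} {metric_type}\n"
        f"{name}{{{label_str}}} {value}\n"
    )


text = format_metric(
    "testns_falco_duration_seconds_total",
    "counter",
    "https://falco.org/docs/metrics/",
    {"raw_name": "duration_sec"},
    144,
)
print(text, end="")
```

The sketch does not escape the HELP text or validate metric names; both would be needed for arbitrary input.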

In addition, regarding the Prometheus namespace and subsystem, I would suggest the following:

  • falcosecurity_falco_*
  • falcosecurity_scap_*

The subsystem should match the naming conventions of the output rules metrics: https://falco.org/docs/metrics/falco-metrics/
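
To make the suggested prefixes concrete, here is a small Python sketch that joins a namespace, a subsystem, and a metric name with underscores and checks the result against the Prometheus metric-name pattern. `qualified_name` is a hypothetical helper, and the metric suffixes are taken from the examples earlier in this thread purely for illustration.

```python
import re

# Prometheus metric names must match this pattern.
_VALID_NAME = re.compile(r"^[a-zA-Z_:][a-zA-Z0-9_:]*$")


def qualified_name(namespace: str, subsystem: str, name: str) -> str:
    """Join non-empty parts with '_' and validate the resulting metric name."""
    full = "_".join(part for part in (namespace, subsystem, name) if part)
    if not _VALID_NAME.match(full):
        raise ValueError(f"invalid Prometheus metric name: {full!r}")
    return full


falco_metric = qualified_name("falcosecurity", "falco", "evt_rate_seconds")
scap_metric = qualified_name("falcosecurity", "scap", "duration_seconds_total")
print(falco_metric)
print(scap_metric)
```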

Finally, since we did this major libs metrics refactor, it would be nice to also refactor the metrics output handling in Falco, since Prometheus support expands its scope. You will find a nice solution 😉 no doubt @sgaist!

incertum avatar Mar 08 '24 00:03 incertum

@incertum The PRs are merged into both libs and Falco master. @sgaist you can start your work whenever you wish :)

FedeDP avatar Mar 14 '24 14:03 FedeDP

Noted and starting :-)

sgaist avatar Mar 14 '24 22:03 sgaist

/assign @sgaist

sgaist avatar Mar 14 '24 22:03 sgaist