falco
[TRACKING] Expand existing `webserver` to serve Prometheus `/metrics` endpoint
Expand existing webserver to serve Prometheus /metrics endpoint.
The initial Prometheus tracking ticket lives here: https://github.com/falcosecurity/cncf-green-review-testing/issues/12.
Work Items:
- [x] `libs` metrics refactor supporting text-based Prometheus exposition format https://github.com/falcosecurity/libs/pull/1652 @incertum
- [x] Sync `libs` referencing the new `libs_metrics_collector`, defer any other refactors (read below) https://github.com/falcosecurity/falco/pull/3129 @incertum
- [ ] Expand existing `webserver` to serve Prometheus `/metrics` endpoint. TODO @sgaist
- [ ] Given the new expanded metrics scope in falco, the current approach within `stats_writer::collector::collect` is no longer adequate and should be refactored, also in anticipation of adding yet another metrics category for rules counters. TODO @sgaist
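To make the third work item concrete, here is a minimal sketch of what serving a `/metrics` endpoint involves. It uses Python's standard `http.server` purely for illustration and is not Falco's webserver; `collect_metrics_text`, `MetricsHandler` and `scrape_once` are hypothetical names. The only detail taken from Prometheus itself is the content type of the text exposition format.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Content type of the Prometheus text exposition format (version 0.0.4).
PROMETHEUS_CONTENT_TYPE = "text/plain; version=0.0.4; charset=utf-8"


def collect_metrics_text() -> str:
    """Hypothetical stand-in for the metrics collector; returns exposition text."""
    return (
        "# HELP testns_falco_duration_seconds_total https://falco.org/docs/metrics/\n"
        "# TYPE testns_falco_duration_seconds_total counter\n"
        'testns_falco_duration_seconds_total{raw_name="duration_sec"} 144\n'
    )


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Only /metrics is served in this sketch; everything else is a 404.
        if self.path != "/metrics":
            self.send_error(404, "Not Found")
            return
        body = collect_metrics_text().encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", PROMETHEUS_CONTENT_TYPE)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Silence per-request logging in this sketch.
        pass


def scrape_once():
    """Start the server on an ephemeral port, scrape /metrics once, shut down."""
    server = HTTPServer(("127.0.0.1", 0), MetricsHandler)
    port = server.server_address[1]
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        conn.request("GET", "/metrics")
        resp = conn.getresponse()
        result = (resp.status, resp.getheader("Content-Type"), resp.read().decode("utf-8"))
        conn.close()
        return result
    finally:
        server.shutdown()
        server.server_close()
```

Calling `scrape_once()` returns the status code, the content type and the body a Prometheus scraper would see. In Falco the equivalent handler would be registered on the existing `webserver` and would obtain its text from the libs metrics collector rather than from a hard-coded string.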
/assign @sgaist
As discussed hand-off to Samuel 🙏
@incertum: GitHub didn't allow me to assign the following users: sgaist.
Note that only falcosecurity members with read permissions, repo collaborators and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can only have 10 assignees at the same time. For more information please see the contributor guide
@sgaist can you assign yourself, the bot doesn't allow me to do it. Thanks!
/milestone 0.38.0
CC @leogr
The libs unit tests should give a clear idea of the Prometheus metrics strings you should expect. Here are a few examples of metrics outside of libs metrics that would need to be created in Falco.
# HELP testns_falco_kernel_release_info https://falco.org/docs/metrics/
# TYPE testns_falco_kernel_release_info gauge
testns_falco_kernel_release_info{raw_name="kernel_release",kernel_release="6.6.7-200.fc39.x86_64"} 1
# HELP testns_falco_duration_seconds_total https://falco.org/docs/metrics/
# TYPE testns_falco_duration_seconds_total counter
testns_falco_duration_seconds_total{raw_name="duration_sec"} 144
# HELP testns_falco_evt_rate_seconds https://falco.org/docs/metrics/
# TYPE testns_falco_evt_rate_seconds gauge
testns_falco_evt_rate_seconds{raw_name="evt_rate_sec"} 126065.400000
# HELP testns_falco_host_boot_timestamp_nanoseconds https://falco.org/docs/metrics/
# TYPE testns_falco_host_boot_timestamp_nanoseconds gauge
testns_falco_host_boot_timestamp_nanoseconds{raw_name="host_boot_ts"} 1708753667000000000
They are also part of the libs unit tests.
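The examples above all follow the same three-line shape: a `# HELP` line, a `# TYPE` line, and one sample line with labels. As a rough sketch of how such a block can be produced, the helper below renders one metric; `render_metric` and `escape_label_value` are hypothetical names and do not correspond to functions in libs or Falco. It escapes backslashes, double quotes and newlines in label values, as the Prometheus text format requires, and leaves the help text unescaped for brevity.

```python
def escape_label_value(value: str) -> str:
    """Escape backslash, double quote and newline inside a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_metric(name: str, metric_type: str, labels: dict, value, help_text: str) -> str:
    """Render one metric as HELP, TYPE and sample lines in the text exposition format."""
    label_str = ",".join(
        f'{key}="{escape_label_value(str(val))}"' for key, val in labels.items()
    )
    # Omit the braces entirely when there are no labels.
    sample = f"{name}{{{label_str}}} {value}" if label_str else f"{name} {value}"
    lines = [
        f"# HELP {name} {help_text}",
        f"# TYPE {name} {metric_type}",
        sample,
    ]
    return "\n".join(lines) + "\n"


# Reproduces the duration counter shown in the examples above.
duration_text = render_metric(
    "testns_falco_duration_seconds_total",
    "counter",
    {"raw_name": "duration_sec"},
    144,
    "https://falco.org/docs/metrics/",
)
```

Each rendered block ends with a newline, so blocks for several metrics can simply be concatenated into one response body.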
In addition re the Prometheus namespace + subsystem I would suggest the following:
- `falcosecurity_falco_*`
- `falcosecurity_scap_*`
The subsystem should match the output rules metrics naming conventions https://falco.org/docs/metrics/falco-metrics/
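As an illustration of the suggested naming, the sketch below joins a namespace, a subsystem and a metric name with underscores and checks the result against the character set Prometheus allows for metric names. `build_metric_name` is a hypothetical helper, and the metric names used in the example are illustrative only, not a statement of what Falco will finally expose.

```python
import re

# Prometheus metric names must match this pattern.
METRIC_NAME_RE = re.compile(r"[a-zA-Z_:][a-zA-Z0-9_:]*")


def build_metric_name(namespace: str, subsystem: str, name: str) -> str:
    """Join namespace, subsystem and name into one fully qualified metric name."""
    # Skip empty parts so an unset namespace or subsystem leaves no stray underscore.
    full_name = "_".join(part for part in (namespace, subsystem, name) if part)
    if not METRIC_NAME_RE.fullmatch(full_name):
        raise ValueError(f"invalid Prometheus metric name: {full_name!r}")
    return full_name


# Illustrative names under the two suggested prefixes.
falco_metric = build_metric_name("falcosecurity", "falco", "kernel_release_info")
scap_metric = build_metric_name("falcosecurity", "scap", "n_evts_total")
```

With these inputs, `falco_metric` is `falcosecurity_falco_kernel_release_info` and `scap_metric` is `falcosecurity_scap_n_evts_total`. Centralizing the join in one place keeps the prefix consistent across every metric category.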
Finally, since we did this major libs metrics refactor, it could be nice to also refactor the metrics output handling for Falco, as we are now expanding the scope with Prometheus. You will no doubt find a nice solution 😉 @sgaist!
@incertum PRs are merged both into libs and in Falco master. @sgaist you can start your work whenever you wish :)
Noted and starting :-)
/assign @sgaist