Make it possible to configure Icinga as a satellite
It would be handy if it were possible to configure the Icinga instance in Searchlight as a satellite of another Icinga master server. The Icinga master server would not necessarily run in the same Kubernetes cluster.
There are CLI parameters to initialize an Icinga node as a satellite: https://www.icinga.com/docs/icinga2/latest/doc/06-distributed-monitoring/#node-setup-with-satellitesclients
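For reference, the node setup call from those docs looks roughly like this (a sketch only; the ticket, host names, and certificate path are placeholders and vary by Icinga2 version):

```sh
# Sketch of the satellite node setup per the linked docs; all values are placeholders.
icinga2 node setup \
  --ticket <ticket-generated-on-the-master> \
  --cn kubernetes.example.com \
  --zone kubernetes.example.com \
  --parent_zone master \
  --parent_host master.example.com \
  --endpoint master.example.com \
  --trustedcert /var/lib/icinga2/certs/trusted-parent.crt \
  --accept-config --accept-commands
```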
I think it would be necessary to pass the configuration parameters to the Searchlight Icinga instance in the Deployment configuration or in a ConfigMap.
Thanks @MatrixCrawler! I am not familiar with the benefits of a master/satellite setup. Can you tell me more about your use case and what you are trying to accomplish?
Since this is a feature that I'm interested in too, I will try to elaborate on what it is and why it would be beneficial.
How it works (short version)
Icinga2 has a built-in concept of distributed monitoring. The short version is that Icinga2 can model the following hierarchy:
- `master` nodes at the top level, providing topology knowledge (modelled as `zones` in Icinga2), state persistence (usually `IDOdb` in Icinga2), notification (sending emails) & metrics (`perfdata` in Icinga2)
- `satellite` nodes in the leaves or as intermediates, providing definition & execution of checks for a particular `zone` (this is also done by `master` nodes in non-zone setups) and optionally also notifications & metrics
- `agent` nodes in the leaves, only providing execution of checks (think of them as the equivalent of `nrpe` in Nagios)
Every one of those setup types (they all use the same Icinga2 binary) defines a zone; zones are arranged hierarchically, and nodes report their results upstream. At any level, if multiple nodes are assigned to the same zone, they act as a load-balancing/fail-over setup (for example, two master nodes execute checks in a distributed manner).
The usual setup has satellites in network segments and agents on every leaf node (at least if check_by_ssh or something similar from a master or satellite is not an option).
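To make the zone hierarchy concrete, here is a minimal zones.conf sketch for a satellite (host names are placeholders; the real file would be derived from the cluster's entrypoint):

```
// zones.conf on the satellite: a minimal sketch, all names are placeholders
object Endpoint "master.example.com" {
  host = "master.example.com"   // the satellite can connect out to the master
}

object Zone "master" {
  endpoints = [ "master.example.com" ]
}

object Endpoint "kubernetes.example.com" {
}

object Zone "kubernetes.example.com" {
  endpoints = [ "kubernetes.example.com" ]
  parent = "master"             // check results are reported upstream to this zone
}
```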
How do nodes communicate
Icinga2 follows a "secure by design" approach. The standard setup involves an X.509 certificate authority at the master level that signs certificates for every node. The actual communication uses a JSON-RPC protocol via HTTPS, with the connection initiated by either side (both the upstream and downstream node try to establish a connection, thereby mitigating problems with NAT and firewalls in general). This is usually done on a well-known port (5665).
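If it helps, this node-to-node channel is configured through Icinga2's api feature; a minimal sketch of it (attribute names per the Icinga2 docs; certificate handling omitted since it differs between versions):

```
// Enabled via `icinga2 feature enable api`
object ApiListener "api" {
  accept_config = true    // let the parent zone push configuration
  accept_commands = true  // let the parent zone trigger commands
  // bind_port defaults to 5665
}
```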
Why do we want this?
If there is already an Icinga2 master setup in place, having Searchlight act as a satellite allows integrating a Kubernetes cluster into existing monitoring infrastructure with very low overhead. We can define & execute Kubernetes checks inside the cluster, while utilizing the existing notifications (and dashboards, for that matter) on the outside. It wouldn't really be necessary to implement the agent node level in Kubernetes, since it is a cluster and we can monitor it as a whole (though it might be a nice-to-have for appliance-style installations like CoreOS).
I hope this gives some idea of how it works and why it would be nice to have. I will try to be available for further questions. Once I have read into the existing codebase, I may also provide ideas/code on how to achieve this.
Initial idea of how to achieve a satellite setup:
- Allow providing a signed TLS key pair for the satellite via a `Secret`
- Spin up a `Deployment` with a suitable number of replicas for the cluster, using this `Secret` and an off-the-shelf satellite configuration for a `zone` named after the entrypoint (e.g. public DNS name) of the cluster (kubernetes.example.com)
- Expose a `NodePort` service on the default port (5665); see the sketch after this list

If there is an Ingress controller in place that supports SSL passthrough, we could also expose it via Ingress instead of NodePort.
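A minimal sketch of the NodePort piece, assuming a hypothetical `app: searchlight` label on the satellite pods (names and selectors are assumptions, not actual Searchlight manifests):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: icinga-satellite    # name is an assumption
spec:
  type: NodePort
  selector:
    app: searchlight        # must match the satellite Deployment's pod labels
  ports:
    - name: icinga-api
      port: 5665            # Icinga2's well-known port
      targetPort: 5665
```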
Thanks @punycode for elaborating on this.
Any updates/plans on this?
The next release for Searchlight is feature complete. The big things coming in this release are
(1) webhook-based plugin for Icinga checks,
(2) using a workqueue (not user visible, but fixes various subtle retry issues),
(3) pause alert (instead of deleting the CRD yaml, you can pause it to temporarily deactivate the check),
(4) alert history stored as a new Incident CRD.
We can discuss the potential design for satellite support at this time. From @punycode's comment above, the main change seems to be updating the Docker image for Searchlight to accept this extra configuration, not run Icinga Web when acting as a satellite, and to document the process.
Is there a good document that shows the process step by step so that we can replicate it? I found https://www.icinga.com/docs/icinga2/latest/doc/06-distributed-monitoring/ . If you are interested in contributing to this feature, this would be a great time. :) Please sign up for our Slack: https://slack.appscode.com/ . We can talk about it more there if you want to contribute. If you just have feedback, you can reply here too.
Thank you all for trying Searchlight and for your patience.
@tamalsaha do you have any schema for how you change the Icinga config, or a pointer to where in the code I can find this?
The Icinga Dockerfiles are here: https://github.com/appscode/searchlight/tree/master/hack/docker/icinga/alpine
Any configuration passed to the Icinga container goes via a Secret (--config-secret-name). The currently supported keys are listed here: https://github.com/appscode/searchlight/blob/master/pkg/icinga/configurator.go#L20
The Searchlight pod takes the data from the Secret, fills in anything missing that can be set to defaults or auto-generated (e.g. certs), and then writes the Icinga config.ini file. The Icinga container waits until that file is available: https://github.com/appscode/searchlight/blob/master/hack/docker/icinga/alpine/runit.sh#L8
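Conceptually, that startup gate looks like this (a sketch only; the actual script and file path are in the runit.sh link above):

```sh
# Block until the operator has written the generated config, then start Icinga.
while [ ! -f /srv/icinga2/config.ini ]; do   # path is a placeholder
  sleep 2
done
exec icinga2 daemon
```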
So, any extra parameter we need to pass should follow a similar pattern.
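For the satellite case, that could mean a few additional keys in the same Secret. Purely illustrative (these key names are hypothetical and would need to be added to configurator.go):

```yaml
# Hypothetical satellite keys following the existing --config-secret-name pattern.
apiVersion: v1
kind: Secret
metadata:
  name: searchlight-operator                  # name is an assumption
stringData:
  ICINGA_PARENT_ZONE: master                  # hypothetical key
  ICINGA_PARENT_HOST: master.example.com      # hypothetical key
  ICINGA_SATELLITE_CERT: |                    # hypothetical key: pre-signed TLS cert
    -----BEGIN CERTIFICATE-----
    ...
    -----END CERTIFICATE-----
```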