FEATURE: Allow Kibana to connect to all search nodes directly when using a true ES cluster

Open TOoSmOotH opened this issue 3 years ago • 2 comments

Discussed in https://github.com/Security-Onion-Solutions/securityonion/discussions/5418

Originally posted by rwaight, September 6, 2021:

In a true ES cluster, Kibana can be configured to leverage high availability across multiple Elasticsearch nodes:

Kibana can be configured to connect to multiple Elasticsearch nodes in the same cluster. In situations where a node becomes unavailable, Kibana will transparently connect to an available node and continue operating. Requests to available hosts will be routed in a round robin fashion.

In kibana.yml:

elasticsearch.hosts:
  - http://elasticsearch1:9200
  - http://elasticsearch2:9200

It looks like there would need to be an update to this code (in /salt/kibana/etc/kibana.yml) to something similar to the following:

{%- if TRUECLUSTER is sameas true %}
{#- some syntax to pull an array of ES hosts goes here; the hostnames below are placeholders #}
elasticsearch.hosts: [ "https://array:9200", "https://of:9200", "https://elasticsearch:9200", "https://nodes:9200", "https://within:9200", "https://securityonion:9200" ]
{%- else %}
elasticsearch.hosts: [ "https://{{ ES }}:9200" ]
{%- endif %}

If there is a mechanism to grab the array of search nodes, then I'd be happy to put in a pull request.
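For illustration only, here is a minimal sketch of how the template could loop over such an array once it is available. It assumes the search node hostnames are exposed as a list in a Salt pillar; the pillar key `example:search_nodes` and the variable name `SEARCH_NODES` are hypothetical placeholders, not existing Security Onion names. `TRUECLUSTER` and `ES` are taken from the snippet above.

```jinja
{#- Hypothetical: 'example:search_nodes' is a placeholder pillar key, not a real Security Onion key #}
{%- set SEARCH_NODES = salt['pillar.get']('example:search_nodes', []) %}
{%- if TRUECLUSTER is sameas true and SEARCH_NODES %}
elasticsearch.hosts:
{%- for node in SEARCH_NODES %}
  - "https://{{ node }}:9200"
{%- endfor %}
{%- else %}
{#- Fall back to the single-host setting when no node list is available #}
elasticsearch.hosts: [ "https://{{ ES }}:9200" ]
{%- endif %}
```

Falling back to the single `ES` host when the list is empty keeps the rendered kibana.yml valid even if the pillar data is missing.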

TOoSmOotH, Sep 07 '21 15:09

Just curious what this will buy us since if the manager is down Kibana will be down as well?

TOoSmOotH, Nov 10 '21 15:11

"Just curious what this will buy us since if the manager is down Kibana will be down as well?"

We have two 'hot' search nodes, and Kibana behaves stupidly if one goes down. I guess that if Kibana were contacting both directly, it would not do so?

Also, I would love to see it become possible to add multiple managers to one cluster to offer true HA.

If it is worth mentioning: we currently have a form of HA for the manager node in our system. A replication job runs between two boxes to keep data in sync for the manager VM, which runs on both boxes, but only one of them has its virtual interface connected to the network. If the box currently serving goes down for any reason, the other box brings the manager interface up, and disconnects it again once the first box is back. This gives us very little downtime (less than 1 minute) and nearly no data loss.

astroc0, Dec 16 '21 09:12