wazuh-packages

Infinite loop when restarting `wazuh-indexer` with configuration error

Open jmv74211 opened this issue 2 years ago • 2 comments

While testing the wazuh-indexer package in 4.3.0-rc5, I noticed that if you restart the wazuh-indexer service with an error in the /etc/wazuh-indexer/opensearch.yml configuration file, the restart never finishes: the process stays in an infinite loop and does not show any error.

Steps to reproduce it

  • 1.- Edit the file /etc/wazuh-indexer/opensearch.yml and set network.host to the following value
network.host: "asd"
  • 2.- Restart the wazuh-indexer service:
systemctl restart wazuh-indexer

From this point on, the process will be stuck indefinitely.
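
When reproducing this, it can help to bound the restart so the terminal is not blocked indefinitely. The following is a minimal sketch, assuming GNU coreutils `timeout` is available; the helper name `bounded_run` and the 120-second limit in the usage line are illustrative and not part of the package. It only frees the calling shell: the unit's start job may still be pending in systemd afterwards.

```shell
#!/bin/sh
# Hypothetical helper: run a command with a time limit.
# GNU timeout exits with status 124 when the limit is reached.
# Usage: bounded_run SECONDS COMMAND [ARGS...]
bounded_run() {
    limit="$1"
    shift
    timeout "$limit" "$@"
    status=$?
    if [ "$status" -eq 124 ]; then
        # Report the hang on stderr so it is not mistaken for a silent success.
        echo "command did not finish within ${limit}s: $*" >&2
    fi
    return "$status"
}
```

For example, `bounded_run 120 systemctl restart wazuh-indexer || journalctl -u wazuh-indexer --no-pager -n 50` gives up waiting after two minutes and prints the most recent journal entries for the unit.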

Note: This occurs with both the DEB and RPM packages. Tested on CentOS 8 (centos/8 vagrant box image) and Debian 10 (generic/debian10 vagrant box image).

jmv74211 avatar Mar 31 '22 15:03 jmv74211

After some testing and research, this seems to be a specific error case inherited from the original OpenSearch product. We tested the same scenario with Elasticsearch 7.10.2 and got the same results.

Explanation: network.host is a critical parameter in the OpenSearch+Security startup runtime. An error in this parameter causes the process to enter zombie mode, and it never notifies systemd to stop the process.
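
Since the hang is triggered by a network.host value that cannot be resolved, one possible workaround is to check the value before restarting the service. The sketch below is an assumption-laden illustration, not something shipped with the package: the function names are hypothetical, the parser only handles a simple single-line `network.host: value` entry (no YAML lists or trailing comments), and `getent ahosts` assumes a glibc-based system. Values wrapped in underscores, such as `_local_` or `_site_`, are skipped because the indexer resolves those itself.

```shell
#!/bin/sh
# Hypothetical helper: print the first network.host value found in a
# YAML file, with surrounding quotes and trailing whitespace removed.
extract_network_host() {
    sed -n 's/^[[:space:]]*network\.host:[[:space:]]*//p' "$1" \
        | head -n 1 \
        | tr -d "\"'" \
        | sed 's/[[:space:]]*$//'
}

# Hypothetical helper: succeed if the host looks usable.
host_resolves() {
    case "$1" in
        "") return 1 ;;    # nothing to check
        _*_) return 0 ;;   # special values (_local_, _site_, ...) are left to the indexer
    esac
    # getent ahosts uses getaddrinfo, so it accepts names and numeric addresses.
    getent ahosts "$1" >/dev/null 2>&1
}
```

A guarded restart could then look like `host=$(extract_network_host /etc/wazuh-indexer/opensearch.yml); host_resolves "$host" && systemctl restart wazuh-indexer`, which refuses to restart when the configured host does not resolve.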

This is related to the security plugin: without it, the process exits with the real error. This failure mode seems to keep systemd blocked while it waits for the service to start. I checked the timeout-related options documented at https://www.freedesktop.org/software/systemd/man/systemd.service.html without success:

  • TimeoutStartFailureMode=
  • TimeoutStartSec=

Related threads found while researching:

  • https://discuss.elastic.co/t/name-or-service-not-known/284831/6
  • https://discuss.elastic.co/t/error-initialize-name-or-service-not-known/141357

Service file: https://github.com/wazuh/wazuh-packages/blob/4.3/stack/indexer/base/files/usr/lib/systemd/system/wazuh-indexer.service

okynos avatar Apr 01 '22 12:04 okynos

It was not replicated in

  • https://github.com/wazuh/wazuh/issues/18828

pro-akim avatar Sep 12 '23 13:09 pro-akim