Centralized service to restart Wazuh

Open AlexRuiz7 opened this issue 2 years ago • 4 comments

Related issue: #4181

Description

Following up on the discussion in issue #4181, we need to centralize the restart of Wazuh in a React service or component, so that any view of the App can restart the environment in the same way.

Restrictions and considerations

  • In cluster mode, a delay of at least 15 seconds must be applied when a cluster restart is triggered immediately after changing the ruleset files, so that the cluster can synchronize the changes across the nodes. This is the safety margin the Framework team told us to use. For a detailed explanation, head to the Wazuh Cluster documentation.

    UPDATE: the @wazuh/framework team will improve their ruleset modification mechanism to distribute the changes across the cluster nodes immediately, so this delay will not be needed anymore. Issue: https://github.com/wazuh/wazuh/issues/14492

    UPDATE: the @wazuh/framework team decided to halt the development mentioned above and implement https://github.com/wazuh/wazuh/issues/14520 instead. We'll need to add a second polling mechanism to detect when the cluster is synchronized.

    Consequently, we need to remove any delay applied to the requests to restart the cluster:

    • [ ] Delete any delay applied to the requests to restart the cluster
  • It's possible that Wazuh will only exist as a cluster in the future, with the single-instance mode replaced by a single-node cluster. Give this special consideration during design and coding, so we can easily adjust this service if this eventually happens.
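As a rough illustration of the second point, the mode check can be isolated from the restart call so that the single-instance branch is easy to drop later. This is only a sketch under assumptions: `ApiRequest` is a hypothetical injected function (not the plugin's actual request helper), the response shape of `GET /cluster/status` is assumed to be `{ data: { enabled, running } }`, and the endpoint paths should be checked against the Wazuh API reference. No delay is applied to the restart request.

```typescript
// Hypothetical request function, injected so the sketch stays self-contained.
type ApiRequest = (method: "GET" | "PUT", path: string) => Promise<unknown>;

type DeploymentMode = "cluster" | "manager";

interface ClusterStatus {
  enabled: string;
  running: string;
}

// Assumption: the status endpoint answers { data: { enabled: "yes" | "no", running: "yes" | "no" } }.
async function detectMode(request: ApiRequest): Promise<DeploymentMode> {
  const response = (await request("GET", "/cluster/status")) as
    | { data?: Partial<ClusterStatus> }
    | undefined;
  const status = response?.data;
  return status?.enabled === "yes" && status?.running === "yes"
    ? "cluster"
    : "manager";
}

// The only place that knows about the two modes. If single-instance mode
// disappears, this table shrinks to one entry and detectMode can be removed.
const RESTART_ENDPOINT: Record<DeploymentMode, string> = {
  cluster: "/cluster/restart",
  manager: "/manager/restart",
};

// Sends the restart order immediately (no delay) and reports which mode was used.
async function restart(request: ApiRequest): Promise<DeploymentMode> {
  const mode = await detectMode(request);
  await request("PUT", RESTART_ENDPOINT[mode]);
  return mode;
}
```

Callers only ever see `restart`, so views do not need to know which mode Wazuh is deployed in.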

Requirements

  • During the restart process, the app must block any user interaction, including navigation, by deploying an overlay mask plus a modal (or similar) in which the user is provided with feedback about the restart and the actions being taken.
  • As Wazuh can be deployed as a single instance or as a cluster, the actions to be taken differ slightly. The restart process must be able to detect the mode in which Wazuh is deployed and perform the restart accordingly.
  • Once the restart order has been sent to Wazuh, the app's restart process will start a polling routine, pinging the API every 2 seconds, up to a maximum of 30 attempts. As soon as the API responds that Wazuh is ready, the restart process ends and the UI elements that had been added are cleared. Otherwise, if the maximum number of attempts is reached, the App will automatically navigate to the Healthcheck after 5 seconds, as something did not go as expected during the restart.
  • No errors must be raised during this polling routine. Request failures are expected (the API will be down for some time).
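The polling routine described above could look roughly like the following. It is a minimal sketch, not the final service: `ping` is a hypothetical callback that resolves to `true` once the API reports that Wazuh is ready, and `sleep` and `onAttempt` are illustrative hooks added here for testing and UI feedback. The sketch waits before each ping, so the worst case takes 2 * 30 seconds.

```typescript
type RestartOutcome = "ready" | "timeout";

interface PollOptions {
  intervalMs?: number; // delay before each ping (2 s in the requirements)
  maxAttempts?: number; // attempts before giving up (30 in the requirements)
  onAttempt?: (attempt: number, maxAttempts: number) => void; // UI feedback hook
  sleep?: (ms: number) => Promise<void>; // injectable so tests do not wait
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

async function pollUntilReady(
  ping: () => Promise<boolean>,
  options: PollOptions = {}
): Promise<RestartOutcome> {
  const {
    intervalMs = 2000,
    maxAttempts = 30,
    onAttempt,
    sleep = defaultSleep,
  } = options;

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    await sleep(intervalMs);
    onAttempt?.(attempt, maxAttempts);
    try {
      if (await ping()) return "ready";
    } catch {
      // Request failures are expected while the API is down: retry silently.
    }
  }
  return "timeout";
}
```

On `"ready"` the caller would clear the overlay and modal. On `"timeout"` it would wait 5 seconds and navigate to the Healthcheck, as stated in the requirements.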

Design

Flow

The current flow to restart Wazuh has been modeled in the following activity diagram:

Outdated. Reason: the delay to restart the cluster is no longer required.

[Image: AD_Wazuh_restart]

Note: rev.2 - Last updated: Thu, 04 Aug 2022 13:40:42 +0200

User Interface

Note: be aware the UI design might change over time, do not take this design as final, unless explicitly specified so.

New, custom, UI components will be needed. We'll work on a PoC using several built-in components from EUI, which will include:

  • A modal-like element to display the restart status. We'll use the EUI Empty prompt component. [Image: modal_restart]

  • An overlay mask component, used to move the focus to the modal, block user interaction and reinforce the feeling of a task that takes some time to complete.

  • A progress bar. There are two options here:

    • a) Countdown: starts at delay * total_attempts (2 * 30) and is updated each second.
    • b) Current attempt: starts at 0, runs up to total_attempts (30), and is updated on each attempt.

    We need to discuss which design we like the most. [Image: Option B]

    Note: the progress bar will only reach 0% (option A) or 100% (option B) in the worst-case scenario. Wazuh should be completely restarted before this happens.
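To compare the two progress bar options, the value each one would display can be computed as below. This is an illustrative sketch only: the function and field names are hypothetical, and the defaults (2 seconds, 30 attempts) come from the requirements.

```typescript
interface ProgressConfig {
  delaySeconds: number; // seconds between attempts
  totalAttempts: number; // maximum number of attempts
}

const DEFAULT_PROGRESS: ProgressConfig = { delaySeconds: 2, totalAttempts: 30 };

// Option A: countdown. Starts at 100% and drops to 0% after
// delaySeconds * totalAttempts seconds; meant to be updated each second.
function countdownPercent(
  elapsedSeconds: number,
  cfg: ProgressConfig = DEFAULT_PROGRESS
): number {
  const totalSeconds = cfg.delaySeconds * cfg.totalAttempts;
  const remaining = Math.min(Math.max(totalSeconds - elapsedSeconds, 0), totalSeconds);
  return (remaining / totalSeconds) * 100;
}

// Option B: current attempt. Starts at 0% and reaches 100% on the last
// attempt; meant to be updated on each attempt.
function attemptPercent(
  currentAttempt: number,
  cfg: ProgressConfig = DEFAULT_PROGRESS
): number {
  const clamped = Math.min(Math.max(currentAttempt, 0), cfg.totalAttempts);
  return (clamped / cfg.totalAttempts) * 100;
}
```

Either value can be passed directly to a progress component that takes a 0 to 100 range. Option B needs no timer of its own, since it only changes when the polling routine makes an attempt.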

Preview

Work in progress

This is a demo for the desired design: https://codesandbox.io/s/wazuh-restart-forked-7mnxj1?file=/demo.js

[Image: modal_restart]

AlexRuiz7 avatar Jun 17 '22 13:06 AlexRuiz7

Research

In the current plugin there are some methods that are being reused:

  • Management/Configuration/Edit configuration: Restart <node_name> / Restart manager
  • When adding/editing some rule/decoder/cdb list file and importing

Restart selected manager node: https://github.com/wazuh/wazuh-kibana-app/blob/v4.3.4-7.10.2/public/controllers/management/components/management/configuration/utils/wz-fetch.js#L270-L287

Restart manager: https://github.com/wazuh/wazuh-kibana-app/blob/v4.3.4-7.10.2/public/controllers/management/components/management/configuration/utils/wz-fetch.js#L293-L309

Restart cluster or manager: https://github.com/wazuh/wazuh-kibana-app/blob/v4.3.4-7.10.2/public/controllers/management/components/management/configuration/utils/wz-fetch.js#L528-L549

Restart cluster: https://github.com/wazuh/wazuh-kibana-app/blob/v4.3.4-7.10.2/public/controllers/management/components/management/configuration/utils/wz-fetch.js#L315-L340

Restart node: https://github.com/wazuh/wazuh-kibana-app/blob/v4.3.4-7.10.2/public/controllers/management/components/management/configuration/utils/wz-fetch.js#L346-L369

:warning: this could be similar to another method. We should review if we could unify the behavior.

On the other hand, Management/Status uses separate logic to restart the manager/cluster.

Restart manager: https://github.com/wazuh/wazuh-kibana-app/blob/v4.3.4-7.10.2/public/controllers/management/components/management/status/actions-buttons-main.js#L97-L117

Restart cluster: https://github.com/wazuh/wazuh-kibana-app/blob/v4.3.4-7.10.2/public/controllers/management/components/management/status/actions-buttons-main.js#L68-L92

We could try to refactor to use the same service/functions and unify the behavior.

Possible tasks:

  • [ ] Use the same service/functions in the different sections where it is required
    • [ ] Move the location of the reusable logic
  • [ ] Review if there is unused logic and remove it.

Desvelao avatar Jun 23 '22 14:06 Desvelao

Because Wazuh 4.4.0 might only support cluster mode, we would need to remove the logic that checks whether Wazuh is running in manager or cluster mode and refactor some methods that restart the cluster or its nodes. We could do this refactor for that Wazuh version.

Desvelao avatar Jun 24 '22 08:06 Desvelao

Note

  • Some restarts of the cluster or manager nodes may need to be delayed because some files must be synchronized before restarting, for example when creating or editing a rules/decoders file. We should discuss with the framework/API colleagues whether this is still necessary. In the current application, the restart request is delayed in this case so the Wazuh managers can synchronize the files before restarting. As a result, the user doesn't have to wait after these actions, but if they perform an action involving the Wazuh API some time later, an API request could fail and redirect to the plugin health check.

Desvelao avatar Jul 27 '22 10:07 Desvelao

These functions were moved to a service that handles the restart (wz-restart.js).

restartManager and restartCluster were replaced by a single function, restart.

Restart selected manager node: https://github.com/wazuh/wazuh-kibana-app/blob/v4.3.4-7.10.2/public/controllers/management/components/management/configuration/utils/wz-fetch.js#L270-L287

Restart cluster or manager: https://github.com/wazuh/wazuh-kibana-app/blob/v4.3.4-7.10.2/public/controllers/management/components/management/configuration/utils/wz-fetch.js#L528-L549

Restart node: https://github.com/wazuh/wazuh-kibana-app/blob/v4.3.4-7.10.2/public/controllers/management/components/management/configuration/utils/wz-fetch.js#L346-L369

Change

Restart manager: https://github.com/wazuh/wazuh-kibana-app/blob/v4.3.4-7.10.2/public/controllers/management/components/management/configuration/utils/wz-fetch.js#L293-L309

Restart cluster: https://github.com/wazuh/wazuh-kibana-app/blob/v4.3.4-7.10.2/public/controllers/management/components/management/configuration/utils/wz-fetch.js#L315-L340

On the other hand, Management/Status uses separate logic to restart the manager/cluster.

The 2 functions were removed because they did the same thing, and we now use the new service instead.

Change

Restart manager: https://github.com/wazuh/wazuh-kibana-app/blob/v4.3.4-7.10.2/public/controllers/management/components/management/status/actions-buttons-main.js#L97-L117

Restart cluster: https://github.com/wazuh/wazuh-kibana-app/blob/v4.3.4-7.10.2/public/controllers/management/components/management/status/actions-buttons-main.js#L68-L92

and it is also called in the file restart-cluster-manager-callout.tsx

https://github.com/wazuh/wazuh-kibana-app/blob/v4.3.4-7.10.2/public/components/common/restart-cluster-manager-callout.tsx#L64

yenienserrano avatar Aug 02 '22 15:08 yenienserrano

~Blocked by https://github.com/wazuh/wazuh/issues/14776~ Blocked by https://github.com/wazuh/wazuh/issues/14918

AlexRuiz7 avatar Sep 06 '22 12:09 AlexRuiz7

Closed until priorities change

yenienserrano avatar Apr 13 '23 07:04 yenienserrano