wazuh-dashboard-plugins
Centralized service to restart Wazuh
Related issue: #4181
Description
Following the discussion in issue #4181, we need to centralize the restart of Wazuh in a React service or component, so that any view of the App can restart the environment in the same way.
Restrictions and considerations
- In cluster mode, a delay of at least 15 seconds must be applied when a cluster restart is triggered immediately after changing the ruleset files, so that the cluster can synchronize the changes across the nodes. This is the safety margin the Framework team told us to use. For a detailed explanation, see the Wazuh Cluster documentation.
UPDATE: the @wazuh/framework team will improve their ruleset modification mechanism to distribute the changes across the cluster nodes immediately, so this delay will no longer be needed. Issue: https://github.com/wazuh/wazuh/issues/14492
UPDATE: the @wazuh/framework team decided to halt the development mentioned above and implement https://github.com/wazuh/wazuh/issues/14520 instead. We'll need to add a second polling mechanism to detect when the cluster is synchronized.
Consequently, we need to remove any delay applied to the requests to restart the cluster:
- [ ] Delete any delay applied to the requests to restart the cluster
- It's possible that Wazuh will only exist as a cluster in the future, with the single-instance mode replaced by a single-node cluster. Take this into EXTRAORDINARY consideration during design and coding, so that we can easily adjust this service if this eventually happens.
Requirements
- During the restart process, the app must block any user interaction, including navigation, by deploying an overlay mask plus a modal (or similar) in which the user is provided with feedback about the restart and the actions being taken.
- As Wazuh can be deployed as a single instance or as a cluster, the actions to be taken differ slightly. The restart process must detect the mode in which Wazuh is deployed and perform the restart accordingly.
- Once the restart order has been sent to Wazuh, the app's restart process will start a polling routine, pinging the API every 2 seconds, up to a maximum of 30 attempts. As soon as the API responds that Wazuh is ready, the restart process ends and the UI elements that had been added are cleared. Otherwise, if the maximum number of attempts is reached, the App will automatically navigate to the Healthcheck after 5 seconds, as something did not go as expected during the restart.
- No errors must be raised during this polling routine. Request failures are expected (the API will be down for some time).
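The polling requirement above could be sketched roughly as follows. This is only an illustration, not the plugin's implementation: `pollUntilReady`, `Ping`, `PollOptions` and the injectable `sleep` are hypothetical names, and only the 2-second interval and the 30-attempt limit come from the requirements.

```typescript
// Hypothetical sketch: none of these names exist in the plugin.
// `ping` should resolve to true once the API reports that Wazuh is ready.
type Ping = () => Promise<boolean>;

interface PollOptions {
  intervalMs?: number; // wait between attempts (requirement: 2 seconds)
  maxAttempts?: number; // attempts before giving up (requirement: 30)
  onAttempt?: (attempt: number, maxAttempts: number) => void; // UI feedback hook
  sleep?: (ms: number) => Promise<void>; // injectable so tests need not wait
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function pollUntilReady(
  ping: Ping,
  options: PollOptions = {}
): Promise<boolean> {
  const {
    intervalMs = 2000,
    maxAttempts = 30,
    onAttempt,
    sleep = defaultSleep,
  } = options;

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    await sleep(intervalMs);
    if (onAttempt) onAttempt(attempt, maxAttempts);
    try {
      if (await ping()) return true; // Wazuh is ready: stop polling
    } catch (error) {
      // Request failures are expected while the API is down, so they are
      // swallowed here instead of being raised to the user.
    }
  }
  return false; // attempts exhausted: the caller decides what to do next
}
```

When the function resolves to `false`, the caller would wait the 5 seconds mentioned above and then navigate to the Healthcheck; when it resolves to `true`, the caller clears the overlay and the modal.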
Design
Flow
The current flow to restart Wazuh has been modeled in the following activity diagram:
Outdated. Reason: the delay before restarting the cluster is no longer required.
Note: rev.2 - Last updated: Thu, 04 Aug 2022 13:40:42 +0200
User Interface
Note: be aware that the UI design might change over time. Do not take this design as final unless explicitly stated.
New custom UI components will be needed. We'll work on a PoC using several built-in components from EUI, which will include:
- A modal-like element to display the restart status. We'll use the EUI Empty prompt component.
- An overlay mask component, used to move the focus to the modal, block user interaction and reinforce the feeling of a task that takes some time to complete.
- A progress bar. There are two options here:
  a) Countdown, starting at delay * total_attempts (2 * 30) and updated each second.
  b) Current attempt, starting at 0 and running up to total_attempts (30), updated on each attempt.

  We need to discuss which design we like the most. Option B

  Note: the progress bar will only reach 0% (option A) or 100% (option B) in the worst-case scenario. Wazuh should be completely restarted before this happens.
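To compare the two progress bar options, the value each one would display can be computed as below. This is a minimal sketch: the function and constant names are hypothetical, and only the figures (2-second delay, 30 attempts) come from the description above.

```typescript
// Hypothetical helpers for illustration only.
const DELAY_SECONDS = 2; // delay between attempts
const TOTAL_ATTEMPTS = 30; // maximum number of attempts

// Option A: countdown. Starts at 100% and reaches 0% after
// DELAY_SECONDS * TOTAL_ATTEMPTS (60) seconds. Updated each second.
function countdownPercent(elapsedSeconds: number): number {
  const totalSeconds = DELAY_SECONDS * TOTAL_ATTEMPTS;
  const remaining = Math.min(Math.max(totalSeconds - elapsedSeconds, 0), totalSeconds);
  return Math.round((remaining / totalSeconds) * 100);
}

// Option B: current attempt. Starts at 0% and reaches 100% on the
// last attempt. Updated on each attempt.
function attemptPercent(currentAttempt: number): number {
  const clamped = Math.min(Math.max(currentAttempt, 0), TOTAL_ATTEMPTS);
  return Math.round((clamped / TOTAL_ATTEMPTS) * 100);
}
```

Either value could be passed to a progress component with a maximum of 100. Option B only changes when an attempt is made, so it maps directly onto the polling loop, while option A needs its own one-second timer.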
Preview
Work in progress
This is a demo for the desired design: https://codesandbox.io/s/wazuh-restart-forked-7mnxj1?file=/demo.js
Research
In the current plugin there are some methods that are being reused:
- Management/Configuration/Edit configuration: Restart <node_name> / Restart manager
- When adding, editing or importing a rule/decoder/CDB list file

Restart selected manager node: https://github.com/wazuh/wazuh-kibana-app/blob/v4.3.4-7.10.2/public/controllers/management/components/management/configuration/utils/wz-fetch.js#L270-L287
Restart manager: https://github.com/wazuh/wazuh-kibana-app/blob/v4.3.4-7.10.2/public/controllers/management/components/management/configuration/utils/wz-fetch.js#L293-L309
Restart cluster or manager: https://github.com/wazuh/wazuh-kibana-app/blob/v4.3.4-7.10.2/public/controllers/management/components/management/configuration/utils/wz-fetch.js#L528-L549
Restart cluster: https://github.com/wazuh/wazuh-kibana-app/blob/v4.3.4-7.10.2/public/controllers/management/components/management/configuration/utils/wz-fetch.js#L315-L340
Restart node: https://github.com/wazuh/wazuh-kibana-app/blob/v4.3.4-7.10.2/public/controllers/management/components/management/configuration/utils/wz-fetch.js#L346-L369
:warning: This could be similar to another method. We should review whether we can unify the behavior.
On the other hand, Management/Status uses separate logic to restart the manager/cluster.
Restart manager: https://github.com/wazuh/wazuh-kibana-app/blob/v4.3.4-7.10.2/public/controllers/management/components/management/status/actions-buttons-main.js#L97-L117
Restart cluster: https://github.com/wazuh/wazuh-kibana-app/blob/v4.3.4-7.10.2/public/controllers/management/components/management/status/actions-buttons-main.js#L68-L92
We could try to refactor to use the same service/functions and unify the behavior.
Possible tasks:
- [ ] Use the same service/functions in the different sections where it is required
- [ ] Move the location of the reusable logic
- [ ] Review if there is unused logic and remove it.
Because Wazuh 4.4.0 might only support cluster mode, we would need to remove the logic that checks whether Wazuh is running in manager or cluster mode, and refactor the methods that restart the cluster or its nodes. We could do this refactor for that Wazuh version.
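A unified restart entry point could look roughly like the sketch below. Everything here is an assumption made for illustration: `restart`, `RestartDeps` and `ApiRequest` are hypothetical names, and the paths `/manager/restart`, `/cluster/restart` and the `nodes_list` query parameter should be checked against the Wazuh API reference for the target version. Keeping the mode check behind one injected function means that, if Wazuh becomes cluster-only, only that branch has to be removed.

```typescript
// Hypothetical sketch: names and endpoint paths are assumptions to verify.
type ApiRequest = (method: "GET" | "PUT", path: string) => Promise<unknown>;

interface RestartDeps {
  // Resolves to true when Wazuh runs as a cluster (assumed to be provided
  // by the caller, for example from a cluster status request).
  isCluster: () => Promise<boolean>;
  // Sends the request to the Wazuh API.
  request: ApiRequest;
}

// Single entry point replacing separate manager/cluster/node restart helpers.
// Returns the path it requested, which is convenient for logging and tests.
async function restart(deps: RestartDeps, nodeName?: string): Promise<string> {
  const clustered = await deps.isCluster();

  let path: string;
  if (!clustered) {
    path = "/manager/restart"; // single-instance mode
  } else if (nodeName) {
    // Restart one node only (assumed query parameter name).
    path = `/cluster/restart?nodes_list=${encodeURIComponent(nodeName)}`;
  } else {
    path = "/cluster/restart"; // whole cluster
  }

  await deps.request("PUT", path);
  return path;
}
```

Callers such as the configuration editor, the ruleset views and Management/Status would all go through this one function and then start the same readiness polling.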
Note
- Some requests to restart the cluster or manager nodes may need to be delayed because some files must be synchronized before restarting, for example when creating or editing a rules/decoders file. We should discuss with the framework/API colleagues whether this is still necessary. In the current applications, the restart request is delayed in this case so that the Wazuh managers can synchronize the files before restarting. As a result, the user doesn't have to wait after performing these actions, but if the user later does something that involves the Wazuh API, some API requests could fail and redirect to the plugin health check.
These functions were moved to a service that handles the restart (wz-restart.js).
restartManager and restartCluster were replaced by a single function, restart.
Restart selected manager node: https://github.com/wazuh/wazuh-kibana-app/blob/v4.3.4-7.10.2/public/controllers/management/components/management/configuration/utils/wz-fetch.js#L270-L287
Restart cluster or manager: https://github.com/wazuh/wazuh-kibana-app/blob/v4.3.4-7.10.2/public/controllers/management/components/management/configuration/utils/wz-fetch.js#L528-L549
Restart node: https://github.com/wazuh/wazuh-kibana-app/blob/v4.3.4-7.10.2/public/controllers/management/components/management/configuration/utils/wz-fetch.js#L346-L369
Change
Restart manager: https://github.com/wazuh/wazuh-kibana-app/blob/v4.3.4-7.10.2/public/controllers/management/components/management/configuration/utils/wz-fetch.js#L293-L309
Restart cluster: https://github.com/wazuh/wazuh-kibana-app/blob/v4.3.4-7.10.2/public/controllers/management/components/management/configuration/utils/wz-fetch.js#L315-L340
On the other hand, Management/Status uses separate logic to restart the manager/cluster.
The 2 functions were removed because they did the same thing, and the new service is used instead.
Change
Restart manager: https://github.com/wazuh/wazuh-kibana-app/blob/v4.3.4-7.10.2/public/controllers/management/components/management/status/actions-buttons-main.js#L97-L117
Restart cluster: https://github.com/wazuh/wazuh-kibana-app/blob/v4.3.4-7.10.2/public/controllers/management/components/management/status/actions-buttons-main.js#L68-L92
It is also called in the file restart-cluster-manager-callout.tsx:
https://github.com/wazuh/wazuh-kibana-app/blob/v4.3.4-7.10.2/public/components/common/restart-cluster-manager-callout.tsx#L64
~Blocked by https://github.com/wazuh/wazuh/issues/14776~ Blocked by https://github.com/wazuh/wazuh/issues/14918
Closed until priorities change