[Detection Engine] Adds Alert Suppression to ML Rules

Open rylnd opened this issue 2 months ago • 29 comments

Summary

This PR introduces Alert Suppression for ML Detection Rules. The feature behaves like alert suppression for other Detection Engine rule types, and is nearly identical to the analogous feature for EQL rules.

There are some additional UI behaviors introduced here as well, mainly intended to cover the shortcomings discovered in https://github.com/elastic/kibana/issues/183100. Those behaviors are:

  1. Populating the suppression field list with fields from the anomaly index(es).
  2. Disabling the suppression UI if no selected ML jobs are running (because we cannot populate the list of fields on which the rule could suppress).
  3. Warning the user if some selected ML jobs are not running (because the list of suppression fields may be incomplete).
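
As a rough illustration of behaviors 2 and 3, the decision could be sketched as below. This is not the code from this PR: `MlJobStatus`, `SuppressionUiState`, and `getSuppressionUiState` are hypothetical names, and the real Kibana ML job types carry far more state than a single boolean.

```typescript
// Hypothetical, simplified job shape (assumption: one boolean per job).
interface MlJobStatus {
  id: string;
  isRunning: boolean;
}

// Hypothetical UI states mirroring the behaviors listed above.
type SuppressionUiState =
  | { kind: 'disabled'; reason: string }
  | { kind: 'warning'; reason: string; notRunning: string[] }
  | { kind: 'enabled' };

function getSuppressionUiState(selectedJobs: MlJobStatus[]): SuppressionUiState {
  const notRunning = selectedJobs
    .filter((job) => !job.isRunning)
    .map((job) => job.id);

  // No selected job is running (this also covers an empty selection):
  // the suppression field list cannot be populated, so disable the UI.
  if (notRunning.length === selectedJobs.length) {
    return {
      kind: 'disabled',
      reason: 'No selected ML jobs are running; suppression fields are unavailable.',
    };
  }

  // Some, but not all, selected jobs are running: the field list may be incomplete.
  if (notRunning.length > 0) {
    return {
      kind: 'warning',
      reason: 'Some selected ML jobs are not running; suppression fields may be incomplete.',
      notRunning,
    };
  }

  return { kind: 'enabled' };
}
```

Modeling the three outcomes as a discriminated union keeps the "disabled" and "warning" cases distinct, so a form component can render each one differently.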

See screenshots below for more info.

Intermediate Serverless Deployment

As per the "intermediate deployment" requirements for serverless, while the schema (and declared alert SO mappings) will be extended to allow this functionality, the user-facing features are currently hidden behind a feature flag. Once this is merged and released, we can issue a "final" deployment in which the feature flag is enabled, and the feature effectively released.

Screenshots

  • Overview of new UI fields
  • Example of Anomaly fields in suppression combobox
  • Suppression disabled due to no jobs running
  • Warning due to not all jobs running

Steps to Review

  1. Review the Test Plan for an overview of behavior
  2. Review Integration tests for an overview of implementation and edge cases
  3. Review Cypress tests for an overview of UX changes
  4. Testing on Demo Instance (elastic/changeme)
    1. This instance has the relevant feature flag enabled and contains some sample auditbeat data, plus the anomalies archive data, so that an ML rule can be exercised against "real" anomalies
    2. There are a few example rules in the default space:
      1. A simple query rule against auditbeat data
      2. An ML rule with per-execution suppression on both by_field_name and by_field_value (which ends up not actually suppressing anything)
      3. An ML rule with per-execution suppression on by_field_name (which suppresses all anomalies into a single alert)
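
To make the difference between the two example ML rules concrete, here is a minimal sketch of grouping anomalies by their suppression field values, with each group corresponding to one alert. It is an illustration only, not the Detection Engine implementation: `groupBySuppressionFields` and the sample anomalies are invented, and the sketch assumes every anomaly shares one `by_field_name` while having a distinct `by_field_value`.

```typescript
// Hypothetical, flattened anomaly record: field name -> value.
type Anomaly = Record<string, string | undefined>;

// Group anomalies by the tuple of values of the chosen suppression fields.
// Each resulting group stands for one alert in this simplified model.
function groupBySuppressionFields(
  anomalies: Anomaly[],
  fields: string[]
): Map<string, Anomaly[]> {
  const groups = new Map<string, Anomaly[]>();
  for (const anomaly of anomalies) {
    // Missing fields are keyed as null here; this is a simplification.
    const key = JSON.stringify(fields.map((field) => anomaly[field] ?? null));
    const group = groups.get(key) ?? [];
    group.push(anomaly);
    groups.set(key, group);
  }
  return groups;
}

// Invented sample data (assumption: same by_field_name, unique by_field_value).
const sampleAnomalies: Anomaly[] = [
  { by_field_name: 'process.name', by_field_value: 'curl' },
  { by_field_name: 'process.name', by_field_value: 'nc' },
  { by_field_name: 'process.name', by_field_value: 'wget' },
];

// Suppressing on by_field_name alone collapses everything into one group.
const byName = groupBySuppressionFields(sampleAnomalies, ['by_field_name']);

// Suppressing on both fields yields one group per anomaly, so nothing is suppressed.
const byNameAndValue = groupBySuppressionFields(sampleAnomalies, [
  'by_field_name',
  'by_field_value',
]);

console.log(byName.size, byNameAndValue.size);
```

Under those assumptions the first grouping has a single entry and the second has as many entries as there are anomalies, which matches the outcomes described for the two example rules.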

Related Issues

  • This feature was temporarily blocked by https://github.com/elastic/kibana/issues/183100, but those changes are now in this PR.

Checklist

  • [x] Functional changes are hidden behind a feature flag. If not hidden, the PR explains why these changes are being implemented in a long-living feature branch.
  • [x] Functional changes are covered with a test plan and automated tests.
  • [ ] Stability of new and changed tests is verified using the Flaky Test Runner in both ESS and Serverless. By default, use 200 runs for ESS and 200 runs for Serverless.
    • ESS - Cypress x 200
    • Serverless - Cypress x 200
    • ESS - API x 200
    • Serverless - API x 200
  • [ ] Comprehensive manual testing is done by two engineers: the PR author and one of the PR reviewers. Changes are tested in both ESS and Serverless.
  • [ ] Mapping changes are accompanied by a technical design document. It can be a GitHub issue or an RFC explaining the changes. The design document is shared with and approved by the appropriate teams and individual stakeholders.
  • [ ] (OPTIONAL) OpenAPI specs changes include detailed descriptions and examples of usage and are ready to be released on https://docs.elastic.co/api-reference. NOTE: This is optional because at the moment we don't yet have any OpenAPI specs that would be fully "documented" and "GA-ready" for publishing on https://docs.elastic.co/api-reference.
  • [ ] Functional changes are communicated to the Docs team. A ticket is opened in https://github.com/elastic/security-docs using the Internal documentation request (Elastic employees) template. The following information is included: feature flags used, target ESS version, planned timing for ESS and Serverless releases.

rylnd, Apr 26 '24 22:04