kibana
[Observability solution] [SLO] Run burn rate api tests in serverless & ess using mocha tagging
Addresses https://github.com/elastic/kibana/issues/179549
Summary
This POC is the outcome of this R&D issue on deployment-agnostic tests for SLO and o11y alerting features. We based our work on the Security Solution's Mocha tagging approach and applied it to the SLO burn rate rule type. We plan to migrate all our existing API tests listed in the R&D issue in follow-up PRs.
The idea is that a) we write our API integration tests in a new location, observability_solution_api_integration, b) shared by the ESS and serverless configurations. Each configuration reads from a specific location and loads the corresponding services. Tests are written only once and tagged with labels according to the environments in which they need to run.
describe('@ess @serverless SLO burn rate rule', () => {
  describe('Create rule', () => {
  });
  describe('@skipInServerless missing something', () => {
  });
});
Description
- This PR follows the second option defined in this document, Mocha tagging. The following labels decide in which environments the tests are executed:
- @ess: Runs in an ESS environment (on-prem installation) as part of the CI validation on PRs.
- @serverless: Runs in the first quality gate and in the periodic pipeline.
- @skipInEss: Skipped for ESS environment.
- @skipInServerless: Skipped for all quality gates and periodic pipeline.
- It introduces a new folder, x-pack/test/observability_solution_api_integration, which will serve as a centralized location for all tests by the obs-ux-management team that must be run in Serverless and ESS environments. A list of all tests can be found in the R&D issue.
- Within this folder, there is a "config" subdirectory that stores base configurations specific to both the Serverless and ESS environments. These configurations build upon the base configuration provided by test_serverless and api_integration, incorporating additional settings such as environment variables and tagging options.
- The file x-pack/test/observability_solution_api_integration/test_suites/slo/burn_rate_rule.ts is functional in both Serverless and ESS.
- It removes the existing burn rate rule test from x-pack/test_serverless/api_integration/test_suites/observability/burn_rate_rule/burn_rate_rule.ts.
- The alerting_api and slo_api services have been moved to the new folder. So as not to break existing tests, the old copies will be removed only once all tests are migrated to the new folder.
CI
- It includes a new entry in ftr_configs.yml to execute the newly added tests in the pipeline.
- It adds mochaOptions in both serverless/config.base.ts and ess/config.base.ts. For serverless, it includes @serverless while excluding @skipInServerless. Similarly, for ess, it includes @ess and excludes @skipInEss.
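The include/exclude behaviour described above can be expressed as a single pattern matched against the full test title, which is what Mocha's grep option is matched against. The following is a minimal sketch and not the code from this PR: buildTagGrep is a hypothetical helper, and the exact pattern used in the config files may differ.

```typescript
// Hypothetical helper (not part of Kibana): builds a RegExp that matches
// titles containing `include` but not `exclude`. Tags are assumed to consist
// of regex-safe characters only ('@' and letters).
function buildTagGrep(include: string, exclude: string): RegExp {
  // The negative lookahead rejects any title that contains the excluded tag.
  return new RegExp(`^(?!.*${exclude}).*${include}`);
}

const serverlessGrep = buildTagGrep('@serverless', '@skipInServerless');
const essGrep = buildTagGrep('@ess', '@skipInEss');

// Full titles: the describe titles joined with the test title.
const titles = [
  '@ess @serverless SLO burn rate rule Create rule creates the rule',
  '@ess @serverless SLO burn rate rule @skipInServerless missing something fails',
];

const runsInServerless = titles.filter((title) => serverlessGrep.test(title));
const runsInEss = titles.filter((title) => essGrep.test(title));

console.log(runsInServerless.length, runsInEss.length);
```

Because tags live in the describe title, a nested describe tagged @skipInServerless is filtered out of the serverless run while still running in ESS.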
Quality Gates and periodic pipelines
The first quality gate is the execution of the tests as part of the PR check process. Tests are executed on a mocked serverless environment (not MKI). Failures do not block a release, but they do block PRs from being merged.
The serverless tests executed as part of the PR check use a stateless Elasticsearch. The periodic pipeline, which is executed every 4 hours, is the health check of our tests in MKI environments.
TODO:
- [x] summarize the approach we use
- [x] move alerting api in a common location that can be reused by both environments (~~currently ess is broken, will be fixed once alerting api is in the common place~~)
- [x] remove burn rate rule from the old location (test_serverless)
- [ ] think a bit more about the structure of the newly introduced observability_solution_api_integration folder and how it can be reused by the rest of the observability apps
This PR makes it possible to write deployment-agnostic tests using Mocha tagging. I tested it locally by running the scripts I added to the new package.json. I'm not sure whose area this is (appex-qa or kibana-operations), but my question is what else needs to be done to make these tests:
- part of the CI pipeline. I added new configs in .buildkite/ftr_configs.yml. Is that enough, or do I need to make more changes?
- part of MKI
@elasticmachine merge upstream
@elasticmachine merge upstream
So while your new configs are included in regular local CI runs, you don't get ESS or MKI tests for free. You would have to set up your own pipeline.
@pheyos Thanks a lot for taking the time to review this PR. You are absolutely right, this PR only runs the tests in a local simulated environment. This PR is only the first step. We are aware that extra things need to be set up so that tests can run in a real MKI environment. We would need guidance on how to set up our own pipeline. Do you have any documentation?
@MadameSheema After merging this initial PR, what next steps did you take to make your tests run in an MKI environment? Do you have a link to a PR that we could take a look at?
As we discussed on Zoom, authentication is quite different in stateful and serverless, which is the main reason why separate test directories were introduced in the first place. Regarding authentication, we plan to follow this approach. I am going to try SAML authentication in a POC for functional tests.
Let's further discuss a few things on a Zoom call next week.
I'm putting here an EXCELLENT list of PRs that @MadameSheema compiled for us! Thanks a ton! Such a great help! This is all the work the Security team has done to get tests executed in MKI projects and integrated in Buildkite. They have a periodic pipeline that executes all the tests that are marked as @serverless and do not have the @skipServerlessInMKI tag.
- https://github.com/elastic/kibana/pull/169311 -> Cypress orchestrator that creates a real project and executes the tests there
- https://github.com/elastic/kibana/pull/169422 -> Initial FTR API Integration with quality gate
- https://github.com/elastic/kibana/pull/171859 -> This is the point where the periodic pipeline was introduced. It changes in the next PRs listed here
- https://github.com/elastic/kibana/pull/176410 -> Change required by the CP team for the reset credentials call
- https://github.com/elastic/kibana/pull/179145 -> Initial split from one pipeline including all teams to per team pipeline and Buildkite Test suite integration
- https://github.com/elastic/kibana/pull/181027 -> Introduces the multiple organizations required behind the test runtime, in order to raise the cap on concurrent projects (previous cap 50, now 100).
- https://github.com/elastic/kibana/pull/182245 -> FTR tests: refactored the old bash script that handles the project CRUD into a new TS file that follows the same approach as parallel_serverless. The QUALITY_GATE=1 env var now defines whether the release scripts (quality gate) or the QA scripts (periodic pipeline) are running.
- https://github.com/elastic/kibana/pull/182626 -> A try/catch statement in the API FTR integration tests was covering up failures: the tests were failing, but the exit code was still 0, so the failures were hidden. This PR addresses that specific issue.
- https://github.com/elastic/kibana/pull/183612 -> Splits the FTR tests in the per team pipeline. Now the pipelines have both API and Cypress tests
Here are a few more resources
- https://github.com/elastic/kibana/pull/180773
- https://github.com/elastic/kibana/pull/181371
@dominiqueclarke Do you have an idea how we can fix the error "Definition for rule '@kbn/eslint/require_mocha_tagging' was not found"? It was introduced after this commit.
:broken_heart: Build Failed
- Buildkite Build
- Commit: 330ef06c34ab6fbaa7291ae9333837f28e4da1b2
- Interpreting CI Failures
- Kibana Serverless Image:
docker.elastic.co/kibana-ci/kibana-serverless:pr-183113-330ef06c34ab
Failed CI Steps
- Jest Tests #1
- Jest Integration Tests #1
- FTR Configs #32
- FTR Configs #82
- Linting
Test Failures
- [job] [logs] Jest Tests #1 / @kbn/eslint/require_mocha_tagging invalid describe('API Integration test', () => {})
- [job] [logs] FTR Configs #82 / EPM Endpoints installing with hidden datastream should rollover hidden datastreams when failed to update mappings
Metrics [docs]
Canvas Sharable Runtime
The Canvas "shareable runtime" is a bundle produced to enable running Canvas workpads outside of Kibana. This bundle is included in third-party webpages that embed canvas and therefore should be as slim as possible.
id | before | after | diff |
---|---|---|---|
module count | - | 5412 | +5412 |
total size | - | 8.8MB | +8.8MB |
History
- :broken_heart: Build #212384 failed 4d2182099a566d06c99cfd7085ef0ca292efc6f2
- :broken_heart: Build #210750 failed 6aa6b280aa10c772e07641f92065f9d0e548cb3b
- :broken_heart: Build #210670 failed 92db76ad36095deaabb66b5fbe8df9af8a1d1d1b
- :broken_heart: Build #210573 failed 79f51ba92c105d053d933c5858c139e8e32f222c
cc @mgiota
:broken_heart: Build Failed
- Buildkite Build
- Commit: 778fd0c377a36ce14cb91e3e6aeff90b389565e9
- Kibana Serverless Image:
docker.elastic.co/kibana-ci/kibana-serverless:pr-183113-778fd0c377a3
Failed CI Steps
- Jest Tests #4
- FTR Configs #1
- FTR Configs #14
- FTR Configs #15
- FTR Configs #29
- FTR Configs #34
- FTR Configs #41
- FTR Configs #55
- FTR Configs #56
- FTR Configs #61
- FTR Configs #62
- FTR Configs #66
- FTR Configs #68
- FTR Configs #76
- FTR Configs #79
- FTR Configs #81
- FTR Configs #97
- FTR Configs #100
- Linting
cc @mgiota
Here's an update regarding the deployment agnostic tests and this POC.
FTR tests already make use of the this.tags(['my-tag-1', 'my-tag-2']) pattern, so we would like to stay with that pattern rather than introduce a new type of Mocha tag in the block's "description". The main trade-off is that this tagging is only available at the suite level ("describe") and not at the individual test level ("it"). However, we do not consider this a blocker, since tests within one suite are usually all suited to the same environment and the few exceptions can easily be handled.
That being said, I am going to close this PR and open a new one using suiteTags. Here's the commit where I converted from mochaOpts to suiteTags and verified that everything works fine.
Adding the following to the config files:
serverless config
suiteTags: {
  include: ['myIncludeServerlessTag'],
  exclude: ['myExcludeServerlessTag'],
},
ess config
suiteTags: {
  include: ['myIncludeEssTag'],
  exclude: ['myExcludeEssTag'],
},
and then adding, for example, this.tags(['myIncludeEssTag', 'myIncludeServerlessTag']) in the test suite instructs the test runner to run the same test suite in both environments.
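To make the include/exclude selection concrete, here is a small self-contained sketch of how such a filter could decide whether a suite runs. It is not the FTR implementation: shouldRunSuite and the SuiteTags interface are hypothetical, and the assumed semantics are that a suite needs at least one included tag and none of the excluded ones.

```typescript
// Hypothetical shape mirroring the suiteTags config entries above.
interface SuiteTags {
  include: string[];
  exclude: string[];
}

// Assumed semantics: run when the suite carries at least one included tag
// (or no include list is configured) and none of the excluded tags.
function shouldRunSuite(tags: string[], filter: SuiteTags): boolean {
  const included =
    filter.include.length === 0 || filter.include.some((tag) => tags.includes(tag));
  const excluded = filter.exclude.some((tag) => tags.includes(tag));
  return included && !excluded;
}

const serverlessFilter: SuiteTags = {
  include: ['myIncludeServerlessTag'],
  exclude: ['myExcludeServerlessTag'],
};
const essFilter: SuiteTags = {
  include: ['myIncludeEssTag'],
  exclude: ['myExcludeEssTag'],
};

// A suite tagged for both environments, and one that opts out of serverless.
const sharedSuite = ['myIncludeEssTag', 'myIncludeServerlessTag'];
const essOnlySuite = ['myIncludeEssTag', 'myExcludeServerlessTag'];

console.log(shouldRunSuite(sharedSuite, essFilter), shouldRunSuite(sharedSuite, serverlessFilter));
console.log(shouldRunSuite(essOnlySuite, essFilter), shouldRunSuite(essOnlySuite, serverlessFilter));
```

Under these assumptions the shared suite is selected by both configs, while the second suite is selected only by the ess config.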
cc @jasonrhodes @pheyos
@pheyos Before moving on with the new branch, whose history will contain only the necessary changes (suiteTags and nothing related to mochaOpts), I wanted to check what conflicts this branch has with main. These conflicts are related to the operator privilege issue. We need to sync up on how to approach this.
@pheyos Here's what I tried, and it looks like it works.
The idea is that I have only one sloApi service, defined within the test/api_integration folder (and not test_serverless), which in turn uses the config service to check whether it is running in a serverless environment and acts accordingly: on serverless it uses the svlUserManager service and creates an API key for the admin role.
As you can see below, in the case of ESS it creates all the predefined roles and users, whereas on serverless it only creates an API key.
ess
serverless
What do you think? Would this approach work for you?
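A rough sketch of the single-service idea described above, for illustration only. None of these names are the real FTR services: ConfigLike, AuthStrategies, resolveAuthHeaders and the 'serverless' config key are all assumptions, and the real calls would be asynchronous.

```typescript
// Illustrative stand-ins only: these are not the real FTR service types.
interface ConfigLike {
  get(key: string): unknown;
}

type AuthHeaders = Record<string, string>;

interface AuthStrategies {
  // ESS: authenticate with one of the predefined users.
  ess: () => AuthHeaders;
  // Serverless: authenticate with an API key created for the admin role.
  serverless: () => AuthHeaders;
}

// Picks the strategy in one place, so test code never branches on the environment.
function resolveAuthHeaders(config: ConfigLike, strategies: AuthStrategies): AuthHeaders {
  const isServerless = config.get('serverless') === true; // assumed config key
  return isServerless ? strategies.serverless() : strategies.ess();
}

// Placeholder credentials; a real service would obtain these at runtime.
const strategies: AuthStrategies = {
  ess: () => ({ Authorization: 'Basic <credentials>' }),
  serverless: () => ({ Authorization: 'ApiKey <key>' }),
};

const essHeaders = resolveAuthHeaders({ get: () => false }, strategies);
const serverlessHeaders = resolveAuthHeaders(
  { get: (key) => key === 'serverless' },
  strategies
);

console.log(essHeaders.Authorization, serverlessHeaders.Authorization);
```

Keeping the environment check inside the service is what lets the same test file run unchanged against both configurations.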
I am closing this POC in favor of https://github.com/elastic/kibana/pull/187924, where I use suiteTags instead of Mocha tagging.