Fix/Improve APM Server Smoke Tests
We have repeatedly run into issues with the smoke tests, and they need some rework.
This task is to
(1) Check whether there is some smoke test failure fatigue. Are there common scenarios where smoke tests fail and the failure is expected, e.g. after a version bump? If so, can we make it easier to quickly distinguish such failures from other failures? E.g. does it make sense to split some of the smoke tests up, with separate triggers and Slack notifications for each? Can the failure message and the name of the exact test that failed be reported back to Slack? Other ideas?
(2) The tests for latest are shown as successful, but they are actually skipped.
I assume the latest is supposed to test the latest 7.x and 8.x versions that are not yet released. We need to have smoke tests running for unreleased versions.
We use the Artifactory to know which versions are available: we process the JSON there and store the build IDs when they are available. Below are the snippets we use. We can also give you the URL of our version file; one is in the apm-pipeline-library. @v1v, do we have some other file? We also have the updated channels we use for the bolt clusters, which is a JSON file with the latest build we have tested.
7.x versions available
NO_KPI_URL_PARAM="x-elastic-no-kpi=true"
curl -sSf "https://artifacts-api.elastic.co/v1/versions?${NO_KPI_URL_PARAM}" | jq '.aliases[]|select(.|startswith("7"))'
Two latest builds for 7.17
NO_KPI_URL_PARAM="x-elastic-no-kpi=true"
VERSION=7.17
curl -sSf "https://artifacts-api.elastic.co/v1/versions/${VERSION}/builds?${NO_KPI_URL_PARAM}"|jq '.builds[:2]'
8.x versions available
NO_KPI_URL_PARAM="x-elastic-no-kpi=true"
curl -sSf "https://artifacts-api.elastic.co/v1/versions?${NO_KPI_URL_PARAM}" | jq '.aliases[]|select(.|startswith("8"))'
Two latest builds for 8.15
NO_KPI_URL_PARAM="x-elastic-no-kpi=true"
VERSION=8.15
curl -sSf "https://artifacts-api.elastic.co/v1/versions/${VERSION}/builds?${NO_KPI_URL_PARAM}"|jq '.builds[:2]'
8 SNAPSHOT versions
NO_KPI_URL_PARAM="x-elastic-no-kpi=true"
curl -sSf "https://artifacts-api.elastic.co/v1/versions?${NO_KPI_URL_PARAM}" | jq '[.aliases[]|select(.|startswith("8"))|select(.|endswith("SNAPSHOT"))]'
8 versions excluding SNAPSHOT
NO_KPI_URL_PARAM="x-elastic-no-kpi=true"
curl -sSf "https://artifacts-api.elastic.co/v1/versions?${NO_KPI_URL_PARAM}" | jq '[.aliases[]|select(.|startswith("8"))|select(.|endswith("SNAPSHOT")|not)]'
Latest 8 version
NO_KPI_URL_PARAM="x-elastic-no-kpi=true"
curl -sSf "https://artifacts-api.elastic.co/v1/versions?${NO_KPI_URL_PARAM}" | jq '[.aliases[]|select(.|startswith("8"))|select(.|endswith("SNAPSHOT")|not)][-1:]'
Latest 8 SNAPSHOT version
NO_KPI_URL_PARAM="x-elastic-no-kpi=true"
curl -sSf "https://artifacts-api.elastic.co/v1/versions?${NO_KPI_URL_PARAM}" | jq '[.aliases[]|select(.|startswith("8"))|select(.|endswith("SNAPSHOT"))][-1:]'
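The `[-1:]` snippets above pick the last alias in whatever order the API returns them. As a hedged sketch, the same lookups can be wrapped in small functions that sort the aliases numerically first, so that 8.9 is not mistaken for being newer than 8.14. The function names are hypothetical, the assumption that `.builds` holds plain strings with the newest build first is inferred only from the `.builds[:2]` usage above, and the filter matches on `8.` rather than `8`, which is slightly stricter than the original snippets.

```shell
#!/usr/bin/env bash
# Sketch only: function names are hypothetical; requires curl and jq.

NO_KPI_URL_PARAM="x-elastic-no-kpi=true"
ARTIFACTS_API="https://artifacts-api.elastic.co/v1"

# Read a /versions response on stdin and print the highest alias for a major.
# $1: major version, e.g. "8"
# $2: "snapshot" to keep only SNAPSHOT aliases; anything else excludes them
latest_alias() {
  jq -r --arg major "$1" --arg kind "${2:-release}" '
    [ .aliases[]
      | select(startswith($major + "."))
      | select(endswith("SNAPSHOT") == ($kind == "snapshot")) ]
    # Compare version parts as numbers so that 8.9 sorts before 8.14.
    | sort_by(sub("-SNAPSHOT$"; "") | split(".") | map(tonumber? // 0))
    | last // empty'
}

# Read a /versions/<version>/builds response on stdin and print the first
# build (assumed to be the newest, as the ".builds[:2]" snippets suggest).
latest_build() {
  jq -r '.builds[0] // empty'
}

# Network wrappers around the two parsers above.
fetch_latest_alias() {
  curl -sSf "${ARTIFACTS_API}/versions?${NO_KPI_URL_PARAM}" | latest_alias "$@"
}

fetch_latest_build() {
  curl -sSf "${ARTIFACTS_API}/versions/$1/builds?${NO_KPI_URL_PARAM}" | latest_build
}
```

For example, `fetch_latest_alias 8 snapshot` prints the highest 8.x SNAPSHOT alias, and `fetch_latest_build 7.17` prints the newest build listed for 7.17. Keeping the parsing separate from the `curl` calls makes it possible to check the jq filters against a saved response without touching the network.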
> Do we have some other file?
Yes, there is a Google bucket with the latest releases for 8.x and 7.x and the upcoming minor and patch releases:
- Current version in main: https://storage.googleapis.com/artifacts-api/releases/edge/main
- Current release for 8.x: https://storage.googleapis.com/artifacts-api/releases/current/8
- Current release for 7.x: https://storage.googleapis.com/artifacts-api/releases/current/7
- Next patch release for 8.x: https://storage.googleapis.com/artifacts-api/releases/next/patch-8
- Next patch release for 7.x: https://storage.googleapis.com/artifacts-api/releases/next/patch-7
- Next minor release for 8.x: https://storage.googleapis.com/artifacts-api/releases/next/minor-8
Those entries are updated from artifacts-api.elastic.co; a transformation step maps its data onto the aliases edge, next, current, patch and minor.
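As a rough sketch, the bucket entries listed above can be addressed through one small helper that maps a channel and a major version to its URL. The function names are hypothetical, and since the thread does not describe the format of these entries, the fetch helper simply prints the raw body. Only the six URLs listed above are known to exist; the helper will happily build others (for example a next minor for 7.x) that may not.

```shell
#!/usr/bin/env bash
# Sketch only: function names are hypothetical; requires curl for fetching.

RELEASES_BUCKET="https://storage.googleapis.com/artifacts-api/releases"

# Print the bucket URL for a channel (edge, current, patch, minor) and major.
# Examples: release_url edge | release_url current 8 | release_url patch 7
release_url() {
  local channel="${1:-}" major="${2:-}"
  if [ "$channel" != "edge" ] && [ -z "$major" ]; then
    echo "major version required for channel: ${channel}" >&2
    return 1
  fi
  case "$channel" in
    edge)        echo "${RELEASES_BUCKET}/edge/main" ;;
    current)     echo "${RELEASES_BUCKET}/current/${major}" ;;
    patch|minor) echo "${RELEASES_BUCKET}/next/${channel}-${major}" ;;
    *)           echo "unknown channel: ${channel}" >&2; return 1 ;;
  esac
}

# Fetch an entry and print its raw body (the format is not assumed here).
fetch_release() {
  local url
  url="$(release_url "$@")" || return 1
  curl -sSf "$url"
}
```

For example, `fetch_release minor 8` requests the next minor release entry for 8.x.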
Regarding the build ID, there is another entry in the Google bucket that gives the latest available snapshot for a release branch:
- https://storage.googleapis.com/artifacts-api/snapshots/<branch-name>.json
<branch-name> can be main, 7.17, 8.14 and so on.
For instance:
curl https://storage.googleapis.com/artifacts-api/snapshots/8.14.json
{
"start_time": "Thu, 25 Apr 2024 04:05:10 GMT",
"release_branch": "8.14",
"prefix": "",
"end_time": "Thu, 25 Apr 2024 05:58:07 GMT",
"manifest_version": "2.1.0",
"version": "8.14.0-SNAPSHOT",
"branch": "8.14",
"build_id": "8.14.0-055b54fd",
"build_duration_seconds": 6777
}
or
{
"start_time": "Thu, 25 Apr 2024 02:02:24 GMT",
"release_branch": "master",
"prefix": "",
"end_time": "Thu, 25 Apr 2024 04:01:35 GMT",
"manifest_version": "2.1.0",
"version": "8.15.0-SNAPSHOT",
"branch": "master",
"build_id": "8.15.0-eb13af64",
"build_duration_seconds": 7151
}
This is handy since it stores the latest available snapshot for any release branch, while artifacts-api.elastic.co deletes any references to any branch with artefacts older than 30 days.
That's the reason we cannot query the latest snapshot for 8.9 in artifacts-api.elastic.co but we can using the Google bucket:
curl https://storage.googleapis.com/artifacts-api/snapshots/8.9.json
{
"start_time": "Tue, 29 Aug 2023 08:20:45 GMT",
"release_branch": "8.9",
"prefix": "",
"end_time": "Tue, 29 Aug 2023 12:34:12 GMT",
"manifest_version": "2.1.0",
"version": "8.9.2-SNAPSHOT",
"branch": "8.9",
"build_id": "8.9.2-a804d52d",
"build_duration_seconds": 15207
}
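A smoke test that needs the newest snapshot for a branch could read the manifest shown above and extract a single field. The following is a minimal sketch under that assumption: the function names are hypothetical, and the field names (`build_id`, `version`, and so on) are taken from the example manifests in this thread.

```shell
#!/usr/bin/env bash
# Sketch only: function names are hypothetical; requires curl and jq.

SNAPSHOTS_BUCKET="https://storage.googleapis.com/artifacts-api/snapshots"

# Print one field of a snapshot manifest read from stdin (default: build_id).
# Prints nothing if the field is missing.
snapshot_field() {
  jq -r --arg field "${1:-build_id}" '.[$field] // empty'
}

# Fetch the manifest for a release branch (main, 7.17, 8.14, ...) and print
# one field. Without "set -o pipefail" a failed download goes unnoticed and
# the output is simply empty, so callers should check for an empty result.
fetch_snapshot_field() {
  local branch="$1" field="${2:-build_id}"
  curl -sSf "${SNAPSHOTS_BUCKET}/${branch}.json" | snapshot_field "$field"
}
```

For example, `fetch_snapshot_field 8.14` prints the build ID of the latest 8.14 snapshot, and `fetch_snapshot_field 8.14 version` prints its version string.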
> I assume the latest is supposed to test the latest 7.x and 8.x versions that are not yet released. We need to have smoke tests running for unreleased versions.
~~Upon creating #13147, I wonder if I understand the task correctly. Do we want to test the latest unreleased version (staging DRA) or the latest snapshot version?~~
Fixed via #13147