fix(elasticsearch) Analytics indices creation on AWS ES
Goal
Fix issue #5376: analytics Elasticsearch indices are created incorrectly on AWS ES, which then breaks the DataHub Analytics page.
Details
When running against AWS Elasticsearch (aka Amazon OpenSearch), analytics indices tend to have problems (see issue #5376 or search Slack for `datahub_usage_event-000001`). This PR introduces three changes in the `create-indices.sh` script:
- refactoring the script: it contained a lot of copy-pasted code and was not easy to follow or maintain. This PR adds comments, extracts repeatedly used operations into functions, and unifies the approaches.
- adding an index fix: when the script detects that the `datahub_usage_event` index was created incorrectly (probably by GMS when running with `USE_AWS_ELASTICSEARCH` incorrectly not set), it drops the index and recreates it. This should help many struggling developers.
- configuration hint: after an ES endpoint error, the script tries to detect whether `USE_AWS_ELASTICSEARCH` should have been set and writes a hint about its usage.
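The index fix from the second item can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the function name `usage_index_action` and its return values are hypothetical. It assumes that in a correct setup `datahub_usage_event` is an alias, so a `GET datahub_usage_event` response is keyed by the backing index `datahub_usage_event-000001`, while a wrongly auto-created plain index does not mention that name.

```shell
# Hypothetical helper (not from the PR): decide what to do with the usage-event index.
#   $1 = HTTP status of GET <es>/datahub_usage_event
#   $2 = response body of that request
# Prints one of: create | keep | recreate | error
# The body is wrapped in ( ) so the variables stay local to the function.
usage_index_action() (
  status="$1"
  body="$2"
  backing_index="datahub_usage_event-000001"
  case "$status" in
    404) echo "create" ;;                      # nothing exists yet
    200)
      case "$body" in
        *"\"$backing_index\""*) echo "keep" ;; # alias resolves to the rollover index
        *) echo "recreate" ;;                  # a plain index occupies the alias name
      esac
      ;;
    *) echo "error" ;;                         # e.g. 401/403: do not touch anything
  esac
)

# Offline demonstration with canned responses (no cluster needed).
usage_index_action 404 '{"error":"index_not_found_exception"}'
usage_index_action 200 '{"datahub_usage_event-000001":{"aliases":{"datahub_usage_event":{"is_write_index":true}}}}'
usage_index_action 200 '{"datahub_usage_event":{"aliases":{}}}'
```

Under these assumptions, only the `recreate` outcome would lead to a `DELETE` of the invalid index before `datahub_usage_event-000001` is created with its write alias.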
Testing
I built the modified `elasticsearch-setup-job` image, referenced it in my DataHub Helm charts, and deployed using these charts.
My setup uses Amazon OpenSearch. I did not test the other case (non-AWS Elasticsearch).
Case 1: clean slate
- Nuke everything
- Deploy the helm charts
- Result: indexes created successfully
elasticsearch-setup-job log
2022/07/28 17:12:40 Waiting for: https://xxx.es.amazonaws.com:443
2022/07/28 17:12:40 Received 200 from https://xxx.es.amazonaws.com:443
>>> creating _opendistro/_ism/policies/datahub_usage_event_policy ...
{
"policy": {
"policy_id": "datahub_usage_event_policy",
"description": "Datahub Usage Event Policy",
"default_state": "Rollover",
"schema_version": 1,
"states": [
{
"name": "Rollover",
"actions": [
{
"rollover": {
"min_index_age": "1d"
}
}
],
"transitions": [
{
"state_name": "ReadOnly",
"conditions": {
"min_index_age": "7d"
}
}
]
},
{
"name": "ReadOnly",
"actions": [
{
"read_only": {}
}
],
"transitions": [
{
"state_name": "Delete",
"conditions": {
"min_index_age": "60d"
}
}
]
},
{
"name": "Delete",
"actions": [
{
"delete": {}
}
],
"transitions": []
}
],
"ism_template": {
"index_patterns": [
"datahub_usage_event-*"
],
"priority": 100
}
}
}{"_id":"datahub_usage_event_policy","_version":1,"_primary_term":1,"_seq_no":0,"policy":{"policy":{"policy_id":"datahub_usage_event_policy","description":"Datahub Usage Event Policy","last_updated_time":1659028360937,"schema_version":1,"error_notification":null,"default_state":"Rollover","states":[{"name":"Rollover","actions":[{"rollover":{"min_index_age":"1d"}}],"transitions":[{"state_name":"ReadOnly","conditions":{"min_index_age":"7d"}}]},{"name":"ReadOnly","actions":[{"read_only":{}}],"transitions":[{"state_name":"Delete","conditions":{"min_index_age":"60d"}}]},{"name":"Delete","actions":[{"delete":{}}],"transitions":[]}],"ism_template":[{"index_patterns":["datahub_usage_event-*"],"priority":100,"last_updated_time":1659028360937}]}}}
>>> creating _template/datahub_usage_event_index_template ...
{
"index_patterns": ["datahub_usage_event-*"],
"mappings": {
"properties": {
"@timestamp": {
"type": "date"
},
"type": {
"type": "keyword"
},
"timestamp": {
"type": "date"
},
"userAgent": {
"type": "keyword"
},
"browserId": {
"type": "keyword"
}
}
},
"settings": {
"index.opendistro.index_state_management.rollover_alias": "datahub_usage_event"
}
}{"acknowledged":true}
>>> creating datahub_usage_event-000001 ...
{
"aliases": {
"datahub_usage_event": {
"is_write_index": true
}
}
}
2022/07/28 17:12:41 Command finished successfully.
Case 2: invalid index
- Nuke everything
- Deploy with `USE_AWS_ELASTICSEARCH` not set -> elasticsearch-setup-job fails (see log below)
- Restart GMS
- Result: analytics not working, but there is a configuration hint in the elasticsearch-setup-job logs
elasticsearch-setup-job log
2022/07/28 17:20:49 Waiting for: https://xxx.es.amazonaws.com:443
2022/07/28 17:20:49 Received 200 from https://xxx.es.amazonaws.com:443
>>> failed to GET _ilm/policy/datahub_usage_event_policy (401) !
... looks like AWS OpenSearch is used; please set USE_AWS_ELASTICSEARCH env value to true
2022/07/28 17:20:49 Command exited with error: exit status 1
- Redeploy with `USE_AWS_ELASTICSEARCH=true` set correctly
- Result: elasticsearch-setup-job runs successfully, analytics now working correctly
elasticsearch-setup-job log
2022/07/28 17:26:10 Received 200 from https://xxx.es.amazonaws.com:443
>>> creating _opendistro/_ism/policies/datahub_usage_event_policy ...
{
"policy": {
"policy_id": "datahub_usage_event_policy",
"description": "Datahub Usage Event Policy",
"default_state": "Rollover",
"schema_version": 1,
"states": [
{
"name": "Rollover",
"actions": [
{
"rollover": {
"min_index_age": "1d"
}
}
],
"transitions": [
{
"state_name": "ReadOnly",
"conditions": {
"min_index_age": "7d"
}
}
]
},
{
"name": "ReadOnly",
"actions": [
{
"read_only": {}
}
],
"transitions": [
{
"state_name": "Delete",
"conditions": {
"min_index_age": "60d"
}
}
]
},
{
"name": "Delete",
"actions": [
{
"delete": {}
}
],
"transitions": []
}
],
"ism_template": {
"index_patterns": [
"datahub_usage_event-*"
],
"priority": 100
}
}
}{"_id":"datahub_usage_event_policy","_version":1,"_primary_term":1,"_seq_no":0,"policy":{"policy":{"policy_id":"datahub_usage_event_policy","description":"Datahub Usage Event Policy","last_updated_time":1659029170348,"schema_version":1,"error_notification":null,"default_state":"Rollover","states":[{"name":"Rollover","actions":[{"rollover":{"min_index_age":"1d"}}],"transitions":[{"state_name":"ReadOnly","conditions":{"min_index_age":"7d"}}]},{"name":"ReadOnly","actions":[{"read_only":{}}],"transitions":[{"state_name":"Delete","conditions":{"min_index_age":"60d"}}]},{"name":"Delete","actions":[{"delete":{}}],"transitions":[]}],"ism_template":[{"index_patterns":["datahub_usage_event-*"],"priority":100,"last_updated_time":1659029170348}]}}}
>>> creating _template/datahub_usage_event_index_template ...
{
"index_patterns": ["datahub_usage_event-*"],
"mappings": {
"properties": {
"@timestamp": {
"type": "date"
},
"type": {
"type": "keyword"
},
"timestamp": {
"type": "date"
},
"userAgent": {
"type": "keyword"
},
"browserId": {
"type": "keyword"
}
}
},
"settings": {
"index.opendistro.index_state_management.rollover_alias": "datahub_usage_event"
}
}{"acknowledged":true}
>>> deleting invalid datahub_usage_event ...
{"acknowledged":true}
>>> creating datahub_usage_event-000001 ...
{
"aliases": {
"datahub_usage_event": {
"is_write_index": true
}
}
}
2022/07/28 17:26:11 Command finished successfully.
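The configuration hint seen in the failing run of Case 2 could be produced by a check along the following lines. This is only a sketch: the function name `aws_hint` is hypothetical, and the heuristic (an authorization-type status combined with an `amazonaws.com` endpoint while `USE_AWS_ELASTICSEARCH` is not `true`) is an assumption about how such detection might work, not a description of the script in this PR.

```shell
# Hypothetical helper (not from the PR): print a hint after a failed request.
#   $1 = HTTP status of the failed request
#   $2 = Elasticsearch endpoint URL
#   $3 = value of USE_AWS_ELASTICSEARCH (treated as "false" when empty)
aws_hint() (
  status="$1"
  url="$2"
  use_aws="${3:-false}"
  case "$status" in
    401|403) ;;          # only authorization-type failures are considered
    *) return 0 ;;
  esac
  # Nothing to suggest when the flag is already set.
  if [ "$use_aws" = "true" ]; then return 0; fi
  case "$url" in
    *.amazonaws.com*)
      echo "... looks like AWS OpenSearch is used; please set USE_AWS_ELASTICSEARCH env value to true"
      ;;
  esac
)

# Offline demonstration: the first call prints the hint, the second prints nothing.
aws_hint 401 "https://xxx.es.amazonaws.com:443" false
aws_hint 401 "http://elasticsearch:9200" false
```

Keeping the hint separate from the request logic means a wrong guess costs nothing: the job still exits with an error, and the hint is just an extra log line.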
Case 3: no-change
- Redeploy with some unrelated bogus change
- Result: analytics still working
elasticsearch-setup-job log
2022/07/28 17:28:32 Waiting for: https://xxx.es.amazonaws.com:443
2022/07/28 17:28:32 Received 200 from https://xxx.es.amazonaws.com:443
>>> _opendistro/_ism/policies/datahub_usage_event_policy already exists ✓
>>> _template/datahub_usage_event_index_template already exists ✓
>>> datahub_usage_event-000001 already exists ✓
2022/07/28 17:28:33 Command finished successfully.
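All three cases depend on the setup job being safe to rerun: each resource is checked first and created only when missing. A minimal sketch of that check-then-create pattern is shown below. The names `create_if_missing`, `fake_status` and `fake_create` are hypothetical, and the two stubs stand in for the real HTTP calls so that the sketch runs without a cluster.

```shell
# Hypothetical helper (not from the PR): create a resource only when it is missing.
#   $1 = resource address, e.g. _template/datahub_usage_event_index_template
#   $2 = name of a command that prints the HTTP status of GET <address>
#   $3 = name of a command that creates the resource
create_if_missing() (
  address="$1"
  get_status="$2"
  create="$3"
  status="$("$get_status" "$address")"
  case "$status" in
    200) echo ">>> $address already exists" ;;
    404) echo ">>> creating $address ..."
         "$create" "$address" ;;
    *)   echo ">>> failed to GET $address ($status) !"
         return 1 ;;
  esac
)

# Stubs replacing the real requests: policies "exist", everything else is missing.
fake_status() { case "$1" in *policy*) echo 200 ;; *) echo 404 ;; esac; }
fake_create() { echo '{"acknowledged":true}'; }

create_if_missing "_opendistro/_ism/policies/datahub_usage_event_policy" fake_status fake_create
create_if_missing "_template/datahub_usage_event_index_template" fake_status fake_create
```

In a real script the status stub would typically be replaced by something like `curl -s -o /dev/null -w "%{http_code}"` against the endpoint URL. Treating every status other than 200 and 404 as a hard failure is what keeps the job from creating resources against a misconfigured endpoint.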
Checklist
- [x] The PR conforms to DataHub's Contributing Guideline (particularly Commit Message Format)
- [x] Links to related issues
- [x] Tests for the changes have been added/updated (not applicable)
- [x] Docs related to the changes have been added/updated (adding several comments into the script itself)
- [x] For any breaking change/potential downtime/deprecation/big changes an entry has been made in Updating DataHub (no downtime expected)
Unit Test Results (build & test)
584 tests ±0   580 :heavy_check_mark: ±0   12m 48s :stopwatch: -7s
143 suites ±0   4 :zzz: ±0
143 files ±0   0 :x: ±0
Results for commit 08736725. ± Comparison against base commit 9e7bd1a8.