OpenSearch
Star tree mapping changes
Description
This PR adds the star tree field mapping, protected by a feature flag. To keep the changes extensible to other multi-field, composite, or datacube index types in the future, I've generalized the implementation under 'CompositeIndex'.
Mappings have a new 'composite' section under which multi-field mappings such as 'star tree' can be defined. All fields referenced by metrics and dimensions must be present in the 'properties' section. [ This also applies to the update mapping API, but for now updates are blocked for star tree: a star tree can only be specified at index creation. ]
Minimal version:
"mappings": {
  "dynamic": "strict",
  "_source": {
    "enabled": true
  },
  "composite": {
    "startree1": {
      "type": "star_tree",
      "config": {
        "ordered_dimensions": [
          {
            "name": "@timestamp"
          },
          {
            "name": "status"
          }
        ],
        "metrics": [
          {
            "name": "size"
          },
          {
            "name": "request_rate"
          }
        ]
      }
    }
  },
  "properties": {
    "@timestamp": {
      "format": "strict_date_optional_time||epoch_second",
      "type": "date"
    },
    --------
  }
}
The defaults are filled in for the fields above.
NOTE: We will tune the defaults throughout the development of the star tree index.
Defaults:
- Timestamp field intervals = [ Minute, Hour ]
- Metrics for each metric field = [ SUM, COUNT, AVG, MIN, MAX ]
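To illustrate how the minimal version expands, here is a rough Python sketch of the default-filling step. It is not the code in this PR: the function name `fill_star_tree_defaults`, the lowercase spellings of the default values, and the field types given for `status`, `size`, and `request_rate` are all assumptions made for the example.

```python
import copy

# Assumed lowercase spellings of the defaults listed above.
DEFAULT_CALENDAR_INTERVALS = ["minute", "hour"]
DEFAULT_STATS = ["sum", "count", "avg", "min", "max"]


def fill_star_tree_defaults(mappings):
    """Return a copy of `mappings` with star tree defaults filled in (hypothetical helper)."""
    result = copy.deepcopy(mappings)
    properties = result.get("properties", {})
    for field in result.get("composite", {}).values():
        if field.get("type") != "star_tree":
            continue
        config = field.get("config", {})
        for dim in config.get("ordered_dimensions", []):
            # Only date dimensions receive default intervals.
            if properties.get(dim["name"], {}).get("type") == "date":
                dim.setdefault("calendar_intervals", list(DEFAULT_CALENDAR_INTERVALS))
        for metric in config.get("metrics", []):
            # Every metric field receives the full default stat list unless it sets its own.
            metric.setdefault("stats", list(DEFAULT_STATS))
    return result


minimal = {
    "composite": {
        "startree1": {
            "type": "star_tree",
            "config": {
                "ordered_dimensions": [{"name": "@timestamp"}, {"name": "status"}],
                "metrics": [{"name": "size"}, {"name": "request_rate"}],
            },
        }
    },
    "properties": {
        "@timestamp": {"type": "date"},
        # The types below are hypothetical; the PR example elides these properties.
        "status": {"type": "integer"},
        "size": {"type": "long"},
        "request_rate": {"type": "float"},
    },
}

filled = fill_star_tree_defaults(minimal)
config = filled["composite"]["startree1"]["config"]
print(config["ordered_dimensions"])
print(config["metrics"])
```

Values the user sets explicitly are left untouched, so the complete version of the mapping would pass through this step unchanged.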
Complete version:
"mappings": {
  "dynamic": "strict",
  "_source": {
    "enabled": true
  },
  "composite": {
    "startree1": {
      "type": "star_tree",
      "config": {
        "ordered_dimensions": [
          {
            "name": "@timestamp",
            "calendar_intervals": [
              "day",
              "month"
            ]
          },
          {
            "name": "size"
          }
        ],
        "metrics": [
          {
            "name": "size",
            "stats": [
              "sum",
              "avg",
              "min"
            ]
          }
        ]
      }
    }
  },
  "properties": {
    "@timestamp": {
      "format": "strict_date_optional_time||epoch_second",
      "type": "date"
    }
    -----------
  }
}
Validations
Apart from basic validations based on user input:
- We will start with support for one field mapping under the composite index, so effectively one star tree index per source index.
- Maximum of 10 dimensions [will fine-tune this]
- For date fields, a maximum of 3 intervals
- All dimension fields and metric fields must be aggregation compatible [ doc values + field data supported ]
- We will add a limit on the number of metrics later on.
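The rules above could be sketched roughly as follows. This is an illustration in Python, not the validation code in this PR: `validate_star_tree` and the constant names are hypothetical, the field types in the example are made up, and the "aggregation compatible" rule is simplified to checking that the `doc_values` mapping parameter is not explicitly disabled.

```python
MAX_COMPOSITE_FIELDS = 1   # one star tree per source index for now
MAX_DIMENSIONS = 10        # the PR notes this limit will be tuned
MAX_DATE_INTERVALS = 3


def validate_star_tree(mappings):
    """Return a list of error strings; an empty list means all checks passed."""
    errors = []
    composite = mappings.get("composite", {})
    properties = mappings.get("properties", {})

    if len(composite) > MAX_COMPOSITE_FIELDS:
        errors.append(
            f"at most {MAX_COMPOSITE_FIELDS} composite field is supported, got {len(composite)}"
        )

    for name, field in composite.items():
        config = field.get("config", {})
        dims = config.get("ordered_dimensions", [])
        metrics = config.get("metrics", [])

        if len(dims) > MAX_DIMENSIONS:
            errors.append(f"{name}: at most {MAX_DIMENSIONS} dimensions, got {len(dims)}")

        for dim in dims:
            intervals = dim.get("calendar_intervals", [])
            if len(intervals) > MAX_DATE_INTERVALS:
                errors.append(
                    f"{name}: field '{dim['name']}' has more than {MAX_DATE_INTERVALS} intervals"
                )

        # Every referenced field must exist in 'properties' and keep doc values enabled.
        for ref in dims + metrics:
            prop = properties.get(ref["name"])
            if prop is None:
                errors.append(f"{name}: field '{ref['name']}' is not defined in properties")
            elif prop.get("doc_values", True) is False:
                errors.append(f"{name}: field '{ref['name']}' has doc values disabled")
    return errors


example = {
    "composite": {
        "startree1": {
            "type": "star_tree",
            "config": {
                "ordered_dimensions": [
                    {"name": "@timestamp", "calendar_intervals": ["day", "month"]},
                    {"name": "status"},
                ],
                "metrics": [{"name": "size", "stats": ["sum", "avg", "min"]}],
            },
        }
    },
    "properties": {
        "@timestamp": {"type": "date"},
        "size": {"type": "long"},  # hypothetical type
    },
}

for problem in validate_star_tree(example):
    print(problem)
```

In this example `status` is used as a dimension but is missing from `properties`, so the sketch reports that one problem. Returning a list of messages rather than raising on the first failure is a design choice for the illustration only.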
Open questions
- I saw sort validations in the 'shrink index' API etc., and I'll add similar validations there. For which other index APIs do we need to restrict creation or support of star tree?
Related Issues
- https://github.com/opensearch-project/OpenSearch/issues/13875
- https://github.com/opensearch-project/OpenSearch/issues/14386
Check List
- [ ] Functionality includes testing.
- [ ] API changes companion pull request created, if applicable.
- [ ] Public documentation issue/PR created, if applicable.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. For more information on following Developer Certificate of Origin and signing off your commits, please check here.