
Add better support for metric data types (TSDB)

Open imotov opened this issue 4 years ago • 6 comments

Phase 0 - Inception

  • [x] Obtain schemas annotated with dimensions and metrics from the Metrics team (small) @nik9000
  • [x] Prototyping Lucene Data Pull Mechanism (medium) @imotov
  • [x] Prototyping Data Pull Mechanism in elasticsearch @imotov

Phase 1 - Mappings

  • [x] Add time_series_dimension mapping parameter to fields
    • [X] #74450 @csoulios
    • [X] #74939 @csoulios
    • [X] #78012 @csoulios
  • [x] #76766 @csoulios
  • [X] #78790 @imotov
  • [x] #79136 @weizijun

Phase 2 - Ingest

  • [x] Dimension-based tsid generator

    • [x] #77154 @nik9000
    • [x] Add TSDB-specific tests
      • [X] #78208 @nik9000
      • [X] #78042 @nik9000
      • [X] #78038 @nik9000
      • [X] #78034 @nik9000
      • [X] #78028 @nik9000
      • [X] #78022 @nik9000
    • [x] #80276 @csoulios
    • [x] #81382 @csoulios (prototype)
    • [x] #81998 @csoulios
  • [x] Routing

    • [x] #77211 @nik9000
    • [x] #77731 @nik9000
    • [x] #79384 @nik9000
    • [x] #79520 @nik9000
    • [x] #81125 @csoulios
    • [x] #79826 (Speed up xcontent filtering) @weizijun
    • [x] Have a good hard look at the switch statement in BulkOperation. Maybe we can make this simpler.
      • [X] #79394 @nik9000
      • [X] #79472 @nik9000
      • [x] #80624
    • [x] #81436 @csoulios
    • [x] Test and fix get-by-id #82633 (See linked issue for greater description of sub-points)
      • [x] Initial implementation of _id for tsid (#82633)
      • [x] Generate better error messages when _id is automatically generated (#84903, #84962)
      • [x] Improve error messages on version conflict to include _tsid and @timestamp (#84957)
      • [x] Size
        • [x] Investigate flipping @timestamp component of the _id from little endian to big endian. That should mean there are more common prefixes. #85008 cuts the size of the inverted index for _id by 37%. That's not a lot of the index in total, but it sure does feel good for such a small change.
      • [x] Misc
        • [x] Test TSDB's _id in RecoverySourceHandlerTests.java and EngineTests.java #84996, #85055
        • [x] Make it possible to modify @timestamp or dimensions in reindex #86647 + #86704
        • [x] Test _id with the security create_doc privilege. Can a user with create_doc (only) ingest new TSDB docs? Does create_doc prevent a user from overwriting an existing TSDB doc? (create_doc relies on the OpType of the IndexRequest, which is automatically set to CREATE for docs with auto-generated ids) #86638
  • [x] Handling Time Boundaries

    • [x] #78291 (Added start_time, end_time index settings) @weizijun
    • [x] Make time boundaries required in tsdb indices @weizijun https://github.com/elastic/elasticsearch/pull/81146
    • [x] Replace hard check for index_mode=TIME_SERIES with bounds checking on start and end time @nik9000 https://github.com/elastic/elasticsearch/pull/81263
    • [x] Tests for nanosecond-precision timestamps just beyond the limit
    • [x] #82079 @martijnvg
    • [x] Automated update of index time boundaries on index rollover @martijnvg
    • [x] #83517 (@martijnvg )
    • [x] Adjust get data stream api to include index_mode and per backing index the start and end time if data stream is tsdb. #83518
    • [x] Automatically skip shards of backing indices with time ranges (based on index.time_series.start_time and index.time_series.end_time index settings) that don't match with the @timestamp range in a search request. #85162 (@martijnvg)
  • [ ] Other tasks

    • [x] #79826 @weizijun
    • [x] Compile a standard data set for comparative speed and space benchmarking (@nik9000) https://github.com/elastic/rally-tracks/pull/222
    • [x] #82238 @imotov
    • [ ] Handling histograms
    • [x] Rewrite tsdb benchmark to use time series data streams with an ilm policy instead of indexing into a regular index. @martijnvg
    • [x] Figure out how to parse source only once for determining the right backing index and index routing. #84046 @martijnvg
    • [x] Implement migrating existing data streams to data streams with time series index mode. #83520 @martijnvg
    • [x] Reconsider how time series data streams are enabled in templates. @martijnvg The current index_mode setting isn't good enough. It requires additional config to be specified elsewhere (time_series_dimension attribute in mappings and index.routing_path as index settings), and it doesn't allow the data stream tsdb features (routing based on the @timestamp field) to be enabled without enabling the index-level tsdb features.
    • [x] A template will create time series data stream if index.mode setting is set to time_series.
    • [x] Autogenerate index.routing_path index setting if not defined in composable index template that creates a tsdb data stream. All mapped fields of type keyword and time_series_dimension enabled will be included in the generated index.routing_path index setting. #86790 (@martijnvg) ~~- [ ] The index.routing_path index setting generation doesn't kick in when index.mode and dimension fields are defined in component templates. (@martijnvg).~~
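
The routing_path auto-generation rule described in the last item above (every mapped keyword field with time_series_dimension enabled) can be sketched roughly as follows. This is a minimal illustration in Python, not the Elasticsearch implementation; the function name `generate_routing_path` and the example mapping are hypothetical.

```python
def generate_routing_path(mappings: dict) -> list[str]:
    """Collect full paths of keyword fields flagged as time series dimensions.

    Hypothetical helper: mirrors the rule stated in the checklist, not the
    actual Elasticsearch code.
    """
    paths = []

    def walk(properties: dict, prefix: str) -> None:
        for name, field in properties.items():
            full_name = f"{prefix}{name}"
            if "properties" in field:
                # Object field: recurse into its sub-fields.
                walk(field["properties"], f"{full_name}.")
            elif (field.get("type") == "keyword"
                  and field.get("time_series_dimension") is True):
                paths.append(full_name)

    walk(mappings.get("properties", {}), "")
    return sorted(paths)


# Made-up mapping for illustration only.
example_mappings = {
    "properties": {
        "@timestamp": {"type": "date"},
        "host": {"properties": {
            "name": {"type": "keyword", "time_series_dimension": True},
        }},
        "pod": {"type": "keyword", "time_series_dimension": True},
        "cpu": {"type": "double", "time_series_metric": "gauge"},
        "note": {"type": "keyword"},  # keyword, but not a dimension
    }
}

print(generate_routing_path(example_mappings))
```

Non-keyword dimensions and plain keyword fields are skipped here because the checklist item only mentions keyword fields with time_series_dimension enabled.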

Phase 2.1 Ingest follow ups

  • [ ] Build the _id from dimension values
  • [ ] Investigate moving timestamp to the front of the _id to automatically get an optimization on _id searches. Not sure if worth it - but possible. #84928 could be an alternative
  • [ ] Bring back something in the spirit of the append-only optimization but that works for tsdb. That would greatly improve write performance. #84771 is a partial prototype
  • [ ] We store the _id in lucene stored fields. We could regenerate it from the _source or from doc values for the @timestamp and the _tsid. That'd save some bytes per document.
  • [ ] Move IndexRequest#autoGenerateId? It's a bit spooky where it is but I don't like it any other place.
  • [ ] Improve error messages in _update_by_query when modifying the dimensions or @timestamp
  • [ ] On translog replay and recovery and replicas we regenerate the _id and assert that it matches the _id from the primary. Should we? Probably. Let's make sure.
  • [ ] Add tsdb benchmarks to the nightlies
  • [ ] Document best practices for using dimensions-based ID generator including how to use this with component templates
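
Several of the _id items in Phase 2 and Phase 2.1 concern the byte layout of the @timestamp component. The observation that big endian encoding gives more common prefixes can be illustrated with a small Python sketch. It only encodes a bare 8-byte timestamp; the real _id contains more than that, and the helper names are hypothetical.

```python
import struct


def encode(ts_millis: int, big_endian: bool) -> bytes:
    """Encode a millisecond timestamp as 8 bytes in the requested byte order."""
    return struct.pack(">q" if big_endian else "<q", ts_millis)


def common_prefix_len(a: bytes, b: bytes) -> int:
    """Number of leading bytes shared by a and b."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


t0 = 1_650_000_000_000   # arbitrary example timestamp in epoch millis
t1 = t0 + 10_000         # a sample taken ten seconds later

big = common_prefix_len(encode(t0, True), encode(t1, True))
little = common_prefix_len(encode(t0, False), encode(t1, False))
print("shared prefix bytes, big endian:", big)
print("shared prefix bytes, little endian:", little)
```

Nearby timestamps differ mostly in their low-order bytes. Big endian puts those bytes last, so the leading bytes stay identical and a prefix-compressed term dictionary has more to share.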

Phase 3.1 QL storage API (Postponed)

  • [x] Create simple time series reader
    • [X] #79197 @nik9000
    • [x] #79691 @imotov
  • [ ] Reimplement QL storage API for TSDB database (depends on completion of Phase 2 and 3.2) (Postponed)

Phase 3.2 - Search MVP

  • [x] Distributed nested delayed execution framework
    • [x] #82129 @imotov
    • [x] #83492 @imotov
    • [x] #85011 @imotov
  • [ ] Treating data stream/index as a dimension
  • [ ] Aggregation results filtering
  • [ ] Retrieve the last value for a time series metric within a parent bucket
  • [x] Time series aggregation
  • [ ] Rate Function
  • [ ] Add a new histogram field subtype to support Prometheus-style histograms
  • [ ] #85523
  • [ ] Should the _tsid agg return doc_counts by default?

Phase 3.3 - Rollup / Downsampling

  • [x] #85708 @csoulios
    • Extract rollup configuration (dimensions, metrics) from index mapping
    • Create rollup index (settings and mapping)
    • Traverse source index using TimeSeriesIndexSearcher and compute rollups docs and add them to the rollup index
    • Finalize action: publish index metadata, modify data stream, clean up temp index
  • [x] #87269 @csoulios
    • Use the updated rollup config
    • Revisit validations before invoking rollup process
  • [x] #90029 @csoulios
  • [x] Query downsampled indices, add validations for:
    • [x] #89252 @salvatore-campagna
    • Intervals: fixed_interval vs calendar_interval
    • time_zone
    • date_histogram resolution
  • [x] Field Caps API
    • [x] #87849 @csoulios
    • [x] #88695 @csoulios
      • Expose information about if a field belongs to only time-series indices when querying multiple indices
      • Shorten the response when some indices don't map fields as the same time series parameter - right now it's a list of indices which is nice but kibana only needs to know if the list is non-empty
  • [ ] Misc
    • [x] #87554 @csoulios
    • [x] #87929 @salvatore-campagna
    • [ ] Make rollup task cancellable #88496 @weizijun
    • [x] #88534 @salvatore-campagna
    • [ ] Support text field labels
    • [x] #88818 @salvatore-campagna
    • [ ] Handle rollup failures
    • [x] Update tsdb rally track to add benchmarks for downsampling https://github.com/elastic/rally-tracks/pull/316 @salvatore-campagna
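
The "compute rollup docs" step listed under #85708 can be pictured with the following Python sketch. It is a simplified, in-memory illustration under stated assumptions: the per-bucket summary (min, max, sum, value_count), the field names, and the `downsample` function are hypothetical and not taken from the Elasticsearch code.

```python
def downsample(docs, interval_millis, metric):
    """Roll raw docs up into one summary doc per (_tsid, time bucket)."""
    buckets = {}
    for doc in docs:
        ts = doc["@timestamp"]
        # Round the timestamp down to the start of its fixed interval.
        bucket_start = ts - ts % interval_millis
        key = (doc["_tsid"], bucket_start)
        value = doc[metric]
        agg = buckets.get(key)
        if agg is None:
            buckets[key] = {"min": value, "max": value,
                            "sum": value, "value_count": 1}
        else:
            agg["min"] = min(agg["min"], value)
            agg["max"] = max(agg["max"], value)
            agg["sum"] += value
            agg["value_count"] += 1
    # Emit rollup docs ordered by time series, then by bucket start.
    return [
        {"_tsid": tsid, "@timestamp": start, metric: agg}
        for (tsid, start), agg in sorted(buckets.items())
    ]


# Made-up raw samples: two time series, timestamps in epoch millis.
raw_docs = [
    {"_tsid": "host-a", "@timestamp": 0, "cpu": 1.0},
    {"_tsid": "host-a", "@timestamp": 30_000, "cpu": 3.0},
    {"_tsid": "host-a", "@timestamp": 60_000, "cpu": 5.0},
    {"_tsid": "host-b", "@timestamp": 10_000, "cpu": 2.0},
]

for rollup_doc in downsample(raw_docs, interval_millis=60_000, metric="cpu"):
    print(rollup_doc)
```

The sketch uses a fixed interval only, which sidesteps the calendar_interval and time_zone questions raised in the validation items above.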

Phase 3.4 - TSID aggs

  • [ ] Update min, max, sum, avg pipeline aggs for intermediate result filtering optimization
  • [ ] Sliding window aggregation
  • [ ] A way to filter to windows within the sliding window. Like "measurements taken in the last 30 seconds of the window".
  • [ ] Open transform issue for newly added time series aggs
  • [ ] Benchmarks for the tsid agg

Phase 3.5 - Downsampling follow ups

  • [ ] SQL support for downsampling

Phase 4.0 - Compression

  • [ ] Synthetic _source @nik9000 #86603
  • [ ] Optimization of merge policies (#87684)
  • [ ] Deltas of deltas compression
  • [ ] What about sequence number?

Phase 5.0 - Follow-ups and Nice-to-have-s

  • [ ] Default the setting's value to all of the keyword dimensions
  • [ ] Support shard splitting on time_series indices
  • [ ] Make an object or interface for _id's values. Right now it's a String that we encode with Uid.encodeId. That was reasonable. Maybe it still is. But it feels complex, especially for tsdb, whose _id is always some bytes. And encoding it also wastes a byte about 1/128 of the time. It's a common prefix byte so this is probably not really an issue. But still. This is a big change but it'd make ES easier to read. Probably wouldn't really improve the storage though.
  • [ ] Figure out how to specify tsdb settings in component templates. For example index.routing_path can be specified in a composable index template if the data stream template's index_mode is set to time_series. But if this setting is specified in a component template then it is required to also set the index.mode index setting. This feels backwards. @martijnvg
  • [ ] In order to retrieve the routing values (defined in index.routing_path), the source needs to be parsed on the coordinating node. However, if an ingest pipeline is executed, the source of the document will be parsed a second time. Ideally the routing values should be extracted when ingest is performed, similar to how the @timestamp field is already retrieved from a document during pipeline execution.
  • [ ] In order to determine the backing index a document should be routed to, a timestamp is parsed into an Instant. The format being used is: strict_date_optional_time_nanos||strict_date_optional_time||epoch_millis. This allows the regular date format, the date nanos format, and epoch millis defined as a string. We can optimise the date parsing if we know the exact format being used. For example, if the data stream has a parameter that indicates the exact date format, we can optimise parsing by using only strict_date_optional_time_nanos, strict_date_optional_time or epoch_millis.
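
The parsing cost discussed in the last item can be sketched in Python as follows. The three parsers are rough stand-ins for the Elasticsearch formats named above (they do not handle nanosecond precision or optional date parts), and all function names, index names, and the inclusive-start, exclusive-end range check are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def parse_iso_fraction(raw: str) -> datetime:
    # ISO 8601 with fractional seconds (up to microseconds) and an offset or "Z".
    return datetime.strptime(raw, "%Y-%m-%dT%H:%M:%S.%f%z")


def parse_iso(raw: str) -> datetime:
    # ISO 8601 without fractional seconds.
    return datetime.strptime(raw, "%Y-%m-%dT%H:%M:%S%z")


def parse_epoch_millis(raw: str) -> datetime:
    # Epoch milliseconds given as a string; int() raises ValueError otherwise.
    return EPOCH + timedelta(milliseconds=int(raw))


FALLBACK_CHAIN = (parse_iso_fraction, parse_iso, parse_epoch_millis)


def parse_timestamp(raw: str, parsers=FALLBACK_CHAIN) -> datetime:
    """Try each parser in order; pass a single parser if the format is known."""
    for parser in parsers:
        try:
            return parser(raw)
        except ValueError:
            continue
    raise ValueError(f"unparseable timestamp: {raw!r}")


def pick_backing_index(ts: datetime, indices) -> str:
    """Return the first index whose [start, end) range contains ts."""
    for name, start, end in indices:
        if start <= ts < end:
            return name
    raise ValueError(f"no backing index accepts {ts.isoformat()}")


# Hypothetical backing indices with their time boundaries.
backing_indices = [
    ("backing-000001", parse_timestamp("2022-04-15T00:00:00Z"),
     parse_timestamp("2022-04-15T06:00:00Z")),
    ("backing-000002", parse_timestamp("2022-04-15T06:00:00Z"),
     parse_timestamp("2022-04-15T12:00:00Z")),
]

ts = parse_timestamp("1650000000000")  # falls through to the epoch parser
print(ts.isoformat(), "->", pick_backing_index(ts, backing_indices))
```

When the format is known in advance, calling `parse_timestamp(raw, parsers=(parse_epoch_millis,))` skips the failed attempts, which is the optimisation the item proposes.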

imotov avatar Jun 28 '21 21:06 imotov

Pinging @elastic/es-analytics-geo (Team:Analytics)

elasticmachine avatar Jun 29 '21 23:06 elasticmachine

Maybe something like https://github.com/prometheus-community/avalanche can be used for benchmarking.

tobiasstadler avatar Dec 31 '21 12:12 tobiasstadler

go go go~

weizijun avatar Jun 01 '22 09:06 weizijun

Great job!

rasonyang avatar Jul 29 '22 05:07 rasonyang

Sorry for my newbie question. Is this the same as https://www.elastic.co/guide/en/elasticsearch///reference/master/tsds.html ? Thanks

oatkiller avatar Sep 19 '22 17:09 oatkiller

@oatkiller Yes. It's the same thing. :) (I helped write the initial docs.)

Hope you enjoy Elastic! It's an awesome place to work.

jrodewig avatar Sep 19 '22 18:09 jrodewig

@martijnvg Hi! I am working on a feature on Fleet UI to enable TSDB index setting, and trying to leave routing_path empty to rely on elasticsearch's auto generation.

I'm getting this error when trying to set index.mode=time_series, tried on index template and also component template. Is there any way to work around this error and trigger the auto generation? Thanks!

"caused_by": {
  "type": "illegal_argument_exception",
  "reason": "[index.mode=time_series] requires a non-empty [index.routing_path]"
}

juliaElastic avatar Nov 09 '22 16:11 juliaElastic

Hey @juliaElastic, can you point me to the composable index templates and component templates? Composable index templates are the place where this setting can be used. Typically with component templates, not all settings / mappings are present there, and each component template needs to be valid on its own. So if the index.mode index setting has been specified in one component template and the mappings or index.routing_path are in another component template or in the composable index template, then storing the component template with the index.mode index setting fails: validated on its own it isn't valid, because the index.mode setting validation fails when there is no index.routing_path. Also, in the case index.routing_path is missing, the auto generation of the index.routing_path setting is only performed for composable index templates.
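
For illustration only, a composable index template that keeps index.mode and the dimension mappings together, so that routing_path auto-generation can apply, might look roughly like the body built below. This is a sketch in Python; the index pattern, field names, and the `has_keyword_dimension` helper are hypothetical and not taken from the Fleet templates discussed here.

```python
import json

# Hypothetical request body for PUT _index_template/<name>.
index_template = {
    "index_patterns": ["metrics-example-*"],
    "data_stream": {},
    "template": {
        # index.routing_path is deliberately left out.
        "settings": {"index.mode": "time_series"},
        "mappings": {
            "properties": {
                "@timestamp": {"type": "date"},
                "host": {"type": "keyword", "time_series_dimension": True},
                "cpu_usage": {"type": "double", "time_series_metric": "gauge"},
            }
        },
    },
}


def has_keyword_dimension(template_body: dict) -> bool:
    """Check that at least one top-level keyword field is marked as a dimension."""
    properties = template_body["template"]["mappings"]["properties"]
    return any(
        field.get("type") == "keyword" and field.get("time_series_dimension") is True
        for field in properties.values()
    )


print(has_keyword_dimension(index_template))
print(json.dumps(index_template, indent=2))
```

The same settings split across component templates would have to pass validation separately, which is the failure mode described above.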

martijnvg avatar Nov 10 '22 08:11 martijnvg

I've tried to add to integrations index template here: image

As discussed on slack, the setting works fine on installing a package, and the routing_path is generated correctly on the data stream: image

I did see some errors when trying to add TSDB on existing templates, will check that again.

juliaElastic avatar Nov 10 '22 09:11 juliaElastic

@martijnvg So I managed to add the "index.mode=time_series" setting without routing_path to the metrics-system.cpu Index Template without an issue, however I am running into an error when trying to modify the Component Template metrics-system.cpu@custom, which is the parent of the Index Template.

Is there any workaround for this issue?

{
  "name": "ResponseError",
  "meta": {
    "body": {
      "error": {
        "root_cause": [
          {
            "type": "illegal_argument_exception",
            "reason": "updating component template [metrics-system.cpu@custom] results in invalid composable template [metrics-system.cpu] after templates are merged"
          }
        ],
        "type": "illegal_argument_exception",
        "reason": "updating component template [metrics-system.cpu@custom] results in invalid composable template [metrics-system.cpu] after templates are merged",
        "caused_by": {
          "type": "illegal_argument_exception",
          "reason": "[index.mode=time_series] requires a non-empty [index.routing_path]"
        }
      },
      "status": 400
    },
    "statusCode": 400,
    "headers": {
      "x-opaque-id": "59e2d33e-d6c8-4ed4-8d4a-14c412f64871;kibana::management:",
      "x-elastic-product": "Elasticsearch",
      "content-type": "application/json;charset=utf-8",
      "content-length": "549"
    },
    "meta": {
      "context": null,
      "request": {
        "params": {
          "method": "PUT",
          "path": "/_component_template/metrics-system.cpu%40custom",
          "body": "{\"template\":{\"settings\":{},\"mappings\":{\"properties\":{\"dummy\":{\"type\":\"text\"}}}},\"_meta\":{\"package\":{\"name\":\"system\"},\"managed_by\":\"fleet\",\"managed\":true}}",
          "querystring": "",
          "headers": {
            "user-agent": "Kibana/8.6.0",
            "x-elastic-product-origin": "kibana",
            "authorization": "Basic ZWxhc3RpYzpjaGFuZ2VtZQ==",
            "x-opaque-id": "59e2d33e-d6c8-4ed4-8d4a-14c412f64871;kibana::management:",
            "x-elastic-client-meta": "es=8.4.0p,js=16.18.1,t=8.2.0,hc=16.18.1",
            "content-type": "application/vnd.elasticsearch+json; compatible-with=8",
            "accept": "application/vnd.elasticsearch+json; compatible-with=8",
            "content-length": "154"
          }
        },
        "options": {
          "opaqueId": "59e2d33e-d6c8-4ed4-8d4a-14c412f64871;kibana::management:",
          "headers": {
            "x-elastic-product-origin": "kibana",
            "user-agent": "Kibana/8.6.0",
            "authorization": "Basic ZWxhc3RpYzpjaGFuZ2VtZQ==",
            "x-opaque-id": "59e2d33e-d6c8-4ed4-8d4a-14c412f64871",
            "x-elastic-client-meta": "es=8.4.0p,js=16.18.1,t=8.2.0,hc=16.18.1"
          }
        },
        "id": 2
      },
      "name": "elasticsearch-js",
      "connection": {
        "url": "http://localhost:9200/",
        "id": "http://localhost:9200/",
        "headers": {},
        "status": "alive"
      },
      "attempts": 0,
      "aborted": false
    },
    "warnings": null
  }
}

juliaElastic avatar Nov 15 '22 12:11 juliaElastic

Initial TSDB support was added a while ago. I moved the leftover tasks to #98877

martijnvg avatar Aug 25 '23 14:08 martijnvg