[BUG] Custom ISM policy isn't injected

Open LHozzan opened this issue 10 months ago • 5 comments

Describe the bug

On a fresh start (DataPrepper against a freshly initialized, empty OpenSearch DB), we expect DataPrepper to inject our custom ISM policy and custom index template.

From the DataPrepper logs I can see that the index template is managed, but the ISM policy is not.

Expected behavior

If a custom ISM policy is used, it must be shown in the ISM management GUI, and ISM must manage the OTel indices.

Environment (please complete the following information):

  • DataPrepper Docker image v2.6.2
  • OpenSearch Docker image v2.11.1

Additional context

/usr/share/data-prepper/config/data-prepper-config.yaml

ssl: false

/usr/share/data-prepper/otel-span-index-template.json

{
  "index_patterns": ["otel-v1-apm-span-*"],
  "version": 1,
  "template": {
    "settings": {
      "plugins.index_state_management.rollover_alias": "otel-v1-apm-span"
    }
  }
}

/usr/share/data-prepper/otel-span-ism-policy.json

{
  "policy": {
    "policy_id": "otel-span",
    "description": "Managing raw spans for trace analytics",
    "default_state": "current_write_index",
    "states": [
      {
        "name": "current_write_index",
        "actions": [
            {
                "rollover": {
                    "min_size": "10gb",
                    "min_index_age": "24h"
                }
            }
        ],
        "transitions": [
            {
                "state_name": "delete",
                "conditions": {
                    "min_index_age": "3d"
                }
            }
        ]
      },
      {
        "name": "delete",
        "actions": [
          {
            "delete": {}
          }
        ]
      }
    ],
    "ism_template": [
      {
        "index_patterns": ["otel-v1-apm-span-*"]
      }
    ]
  }
}

/usr/share/data-prepper/pipelines/pipelines.yaml

entry-pipeline:
  workers: 4
  delay: "100"
  source:
    otel_trace_source:
      ssl: true
      sslKeyCertChainFile: "/usr/share/data-prepper/server.crt"
      sslKeyFile: "/usr/share/data-prepper/server.key"
      port: 21890
      authentication:
        http_basic:
          username: REDACTED
          password: REDACTED
  buffer:
    bounded_blocking:
      buffer_size: 1024
      batch_size: 256
  sink:
    - pipeline:
        name: "raw-pipeline"
    - pipeline:
        name: "service-map-pipeline"

raw-pipeline:
  source:
    pipeline:
      name: "entry-pipeline"
  processor:
    - otel_traces:
  sink:
    - opensearch:
        #index_type: trace-analytics-raw
        index_type: custom
        index: otel-v1-apm-span
        hosts: [ https://os-endpoint.REDACTED:9200 ]
        cert: "/usr/share/data-prepper/root-ca.pem"
        template_file: "/usr/share/data-prepper/otel-span-index-template.json"
        ism_policy_file: "/usr/share/data-prepper/otel-span-ism-policy.json"
        username: REDACTED
        password: REDACTED

service-map-pipeline:
  delay: "100"
  source:
    pipeline:
      name: "entry-pipeline"
  processor:
    - service_map:
  sink:
    - opensearch:
        index_type: trace-analytics-service-map
        hosts: [ https://os-endpoint.REDACTED:9200 ]
        cert: "/usr/share/data-prepper/root-ca.pem"
        username: REDACTED
        password: REDACTED

LHozzan, Apr 23 '24 09:04

This works fine in DataPrepper v2.7.0, but I see different settings in the template when I fetch it from OpenSearch:

GET _template/otel-v1-apm-span-index-template

{
  "otel-v1-apm-span-index-template": {
    "order": 0,
    "version": 1,
    "index_patterns": [
      "otel-v1-apm-span-*"
    ],
    "settings": {
      "index": {
        "opendistro": {
          "index_state_management": {
            "rollover_alias": "otel-v1-apm-span"
          }
        }
      }
    },
    "mappings": {
      "_source": {
        "enabled": true
      },
      "dynamic_templates": [
        {
          "resource_attributes_map": {
            "path_match": "resource.attributes.*",
            "mapping": {
              "type": "keyword"
            }
          }
        },
        {
          "span_attributes_map": {
            "path_match": "span.attributes.*",
            "mapping": {
              "type": "keyword"
            }
          }
        }
      ],
      "date_detection": false,
      "properties": {
        "traceId": {
          "ignore_above": 256,
          "type": "keyword"
        },
        "kind": {
          "ignore_above": 128,
          "type": "keyword"
        },
        "traceGroupFields": {
          "type": "object",
          "properties": {
            "endTime": {
              "type": "date_nanos"
            },
            "durationInNanos": {
              "type": "long"
            },
            "statusCode": {
              "type": "integer"
            }
          }
        },
        "traceGroup": {
          "ignore_above": 1024,
          "type": "keyword"
        },
        "serviceName": {
          "type": "keyword"
        },
        "parentSpanId": {
          "ignore_above": 256,
          "type": "keyword"
        },
        "spanId": {
          "ignore_above": 256,
          "type": "keyword"
        },
        "name": {
          "ignore_above": 1024,
          "type": "keyword"
        },
        "startTime": {
          "type": "date_nanos"
        },
        "links": {
          "type": "nested"
        },
        "endTime": {
          "type": "date_nanos"
        },
        "durationInNanos": {
          "type": "long"
        },
        "events": {
          "type": "nested",
          "properties": {
            "time": {
              "type": "date_nanos"
            }
          }
        },
        "status": {
          "type": "object",
          "properties": {
            "code": {
              "type": "integer"
            },
            "message": {
              "type": "keyword"
            }
          }
        }
      }
    },
    "aliases": {}
  }
}

So, can somebody explain to me which one is correct?

Variant A (our settings):

...
      "template": {
        "settings": {
          "plugins.index_state_management.rollover_alias": "otel-v1-apm-span"
        }
      }
...

OR

Variant B (settings from OpenSearch after DataPrepper injection):

...
    "settings": {
      "index": {
        "opendistro": {
          "index_state_management": {
            "rollover_alias": "otel-v1-apm-span"
          }
        }
      }
    }
...

?

Thank you!

LHozzan, Apr 24 '24 06:04

@LHozzan, I want to clarify your problem. Your ISM policy is working, correct? But, you are seeing a different setting?

Data Prepper still uses the opendistro setting to retain compatibility with OpenDistro clusters. You can see this below.

https://github.com/opensearch-project/data-prepper/blob/48d8bc379f337b7ecab06cea4c25d8240eb00fb4/data-prepper-plugins/opensearch/src/main/java/org/opensearch/dataprepper/plugins/sink/opensearch/index/IndexConstants.java#L23

OpenSearch ISM should be treating these the same until they remove the opendistro key.
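To make the comparison concrete, here is a minimal sketch (not Data Prepper or OpenSearch code) that flattens both variants from this thread into dotted keys. The helper names `flatten` and `rollover_alias` are hypothetical. It only shows that the two variants carry the same alias value under different key names; whether a given cluster treats both keys the same is as described in the comment above, not something this sketch verifies.

```python
def flatten(settings, prefix=""):
    """Flatten nested settings into dotted keys, e.g. {"a": {"b": 1}} -> {"a.b": 1}."""
    flat = {}
    for key, value in settings.items():
        dotted = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, dotted))
        else:
            flat[dotted] = value
    return flat


# Variant A: the settings from the custom template file in this issue.
variant_a = {"plugins.index_state_management.rollover_alias": "otel-v1-apm-span"}

# Variant B: the settings returned by GET _template/... after injection.
variant_b = {
    "index": {
        "opendistro": {
            "index_state_management": {"rollover_alias": "otel-v1-apm-span"}
        }
    }
}

ROLLOVER_SUFFIX = "index_state_management.rollover_alias"


def rollover_alias(settings):
    """Return (dotted_key, value) of the first rollover alias setting, or None."""
    for key, value in flatten(settings).items():
        if key.endswith(ROLLOVER_SUFFIX):
            return key, value
    return None


print(rollover_alias(variant_a))
print(rollover_alias(variant_b))
```

Both calls report the alias `otel-v1-apm-span`; only the key prefix differs (`plugins.` in variant A, `index.opendistro.` in variant B).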

dlvenable, Jun 18 '24 19:06

Hi @dlvenable.

> Your ISM policy is working, correct?

Yes, our custom ISM policy works fine with DataPrepper v2.7.0, but the same setup does not work with the previous DP version (2.6.2). The ISM policy must be connected to the index template via the index_patterns attribute, so we must import the index template as well.
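The link between the two files can be checked locally before deploying. The following is a minimal sketch, assuming the two JSON documents from the original report; the function name `patterns_covered` is hypothetical, and the check is a literal string comparison of patterns, not a glob-overlap test.

```python
import json

# Trimmed copies of the two files from the original report.
TEMPLATE_JSON = """
{
  "index_patterns": ["otel-v1-apm-span-*"],
  "version": 1,
  "template": {
    "settings": {
      "plugins.index_state_management.rollover_alias": "otel-v1-apm-span"
    }
  }
}
"""

POLICY_JSON = """
{
  "policy": {
    "policy_id": "otel-span",
    "default_state": "current_write_index",
    "ism_template": [
      {"index_patterns": ["otel-v1-apm-span-*"]}
    ]
  }
}
"""


def patterns_covered(template: dict, policy: dict) -> bool:
    """True if every index pattern of the template also appears in the policy's ism_template."""
    template_patterns = set(template.get("index_patterns", []))
    ism_patterns = set()
    for entry in policy.get("policy", {}).get("ism_template", []):
        ism_patterns.update(entry.get("index_patterns", []))
    # An empty template pattern list is treated as "not covered".
    return bool(template_patterns) and template_patterns <= ism_patterns


template = json.loads(TEMPLATE_JSON)
policy = json.loads(POLICY_JSON)
print(patterns_covered(template, policy))
```

For the files in this issue the patterns match, so a mismatch between the two files is not what caused the missing policy.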

> But, you are seeing a different setting?

Yes, we import the ISM policy in a specific state, but that state changes once it has been imported into OpenSearch (when I compare our state BEFORE importing with the state from OpenSearch AFTER importing, they differ).

And here is my confusion:

  • our state is imported correctly, but is changed after import into OpenSearch (variant A is imported, but changed to variant B)
  • if I take the state AFTER import into OpenSearch and try to use it as a template for DP during an upgrade (from the previous DP version to the new one), it is not imported at all (variant B is not imported, but silently dropped)

Why?

I expect that if I import objects into OpenSearch in an older format, OpenSearch "upgrades them on the fly" and stores the upgraded version. So I think the upgraded version is the better one to use, because importing it should cause no further changes. What I see instead is that when I use the "upgraded version", DP appears to import it but does not actually do so.

> Data Prepper still uses the opendistro setting to retain compatibility with OpenDistro clusters. You can see this below.

I see it, and this is the point:

  • if I use this setting in ISM, DP does not import it at all
  • if I use a different setting (variant A), DP imports it, but OpenSearch changes it to the opendistro setting

It seems that the bug was fixed in DP v2.7.0. I hope it stays fixed and does not occur again :). The problem is that we reported the bug too late: we only noticed it once OpenSearch had a very big index with all the tracing data, i.e. the index was not rolled over because the ISM template was missing (expected, but not imported).
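A missing policy like this can be caught early by querying the ISM API right after Data Prepper starts. Below is a minimal sketch using only the Python standard library. The helper names are hypothetical, the base URL is a placeholder, and authentication and the custom CA used in this setup are deliberately left out. It also assumes that a missing policy is answered with HTTP 404.

```python
import urllib.error
import urllib.request


def ism_check_urls(base_url: str, policy_id: str, index_pattern: str) -> dict:
    """Build the ISM endpoints for fetching a policy and explaining managed indices."""
    base = base_url.rstrip("/")
    return {
        "policy": f"{base}/_plugins/_ism/policies/{policy_id}",
        "explain": f"{base}/_plugins/_ism/explain/{index_pattern}",
    }


def policy_exists(url: str, opener=urllib.request.urlopen) -> bool:
    """Return True if the policy URL answers 200, False on 404 (assumed to mean "no such policy")."""
    try:
        with opener(url) as response:
            return response.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise  # Other errors (auth, TLS, server errors) are not swallowed.


# Placeholder host; the policy ID and index pattern are taken from this issue.
urls = ism_check_urls("https://localhost:9200", "otel-span", "otel-v1-apm-span-*")
print(urls["policy"])
print(urls["explain"])
```

Calling `policy_exists(urls["policy"])` against a real cluster would need an opener configured with credentials and the root CA; the `opener` parameter exists so that such an opener (or a stub) can be passed in.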

LHozzan, Jun 19 '24 13:06