micrometer icon indicating copy to clipboard operation
micrometer copied to clipboard

Elastic registry: allow index name without date suffix

Open bodin opened this issue 4 years ago • 6 comments

Please describe the feature request. I would like the elastic registry to support a plain index name without injecting the month based suffix. Currently the only option is to provide the date format and the index prefix. It assumes micrometer will be managing / owning the rollover.

Rationale We are using datastreams and ILM predefined on our elastic cluster and do not want micrometer to manage / own that portion. Since elastic offers size and time based index rollover - this (in my opinion) is something better left to the cluster and not the micrometer library.

Additional context https://github.com/micrometer-metrics/micrometer/issues/2058 I think pulling more functionality into micrometer is adding to the complexity of the library and maybe not the right direction. Also, it may not stay compatible as versions of ealsticsearch change with the internal representations of ILM.

I can create a PR if helpfull - but would ask for guidance on backwards compatibly requirements / concerns.

bodin avatar Jul 19 '21 20:07 bodin

If I understand your request correctly, I think you should be able to do this by setting the dateFormat and the indexDateSeparator to empty string. Micrometer does something like this in ElasticMeterRegistry:

String index = "metrics";
String indexDateSeparator = "";
String dateFormat = "";
DateTimeFormatter indexDateFormatter = DateTimeFormatter.ofPattern(dateFormat);

ZonedDateTime dt = ZonedDateTime.ofInstant(Instant.now(), ZoneOffset.UTC);
System.out.println(index + indexDateSeparator + indexDateFormatter.format(dt)); // prints metrics

jonatan-ivanov avatar Jul 19 '21 21:07 jonatan-ivanov

Thanks for the suggestion - it does actually work!

My concern with this approach is it seems unsupported. If the property comes in as null vs blank, or someone adds a different default in the future - then the date will revert back to the normal month based approach.

default String indexDateFormat() {
    return getString(this, "indexDateFormat")
            .invalidateWhen(format -> {
                if (format == null) {
                    return false;
                }

                try {
                    DateTimeFormatter.ofPattern(format);
                    return false;
                } catch (IllegalArgumentException ignored) {
                    return true;
                }
            }, "invalid date format", InvalidReason.MALFORMED)
            .orElse("yyyy-MM");
}

bodin avatar Jul 19 '21 21:07 bodin

I don't think this was an intended use-case; it's more like a side-effect due to how DateTimeFormatter works. I think we can add tests and document this behavior to ensure we keep it in the future (maybe validation too).

@bodin, @shakuzen What do you think?

jonatan-ivanov avatar Jul 20 '21 19:07 jonatan-ivanov

@bodin thank you for raising the issue.

For some history, see #1331 which added an option to override the index naming entirely, but it does require extending the ElasticMeterRegistry which is less than ideal.

#2058 I think pulling more functionality into micrometer is adding to the complexity of the library and maybe not the right direction. Also, it may not stay compatible as versions of ealsticsearch change with the internal representations of ILM.

I agree. And I think the reason overriding the index entirely was left as a more difficult exercise than an additional method in ElasticConfig was to keep the configuration options simpler and more understandable. It gets complicated quickly when some options are only meaningful in the presence or absence of other options being set. Also, at the time, the date prefix was by and large the way everyone was doing things. I think that is changing now with ILM, but I also don't have numbers or active experience with elasticsearch to support or refute that claim.

So I think there are a couple things we want to work on. We should figure out how index management will be for a majority of Micrometer/Elasticsearch users going forward and if that warrants a change in our defaults/suggested way of shipping metrics to elasticsearch. Regardless of that, we should probably make this use case more official and documented, as @jonatan-ivanov suggests. Whether that is by officially supporting the current way (hack?) or some other way, I'm open to ideas.

shakuzen avatar Jul 21 '21 04:07 shakuzen

Thanks for the feedback - I did indeed end up overriding that method to customize the index name, cool to see that was a purposeful change. I opened the ticket because it was the only customization I was not able to do through existing property based configuration.

I definitely agree that date based suffixes was the gold standard for time based data in the past, and I still have a few usecases still using that method. For myself at least, Datastreams are a nice replacement for the manual work I was doing maintaining the indexes and still produce the underlying date based naming convention on the actual indexes they create

https://www.elastic.co/guide/en/elasticsearch/reference/current/data-streams.html

Going forward, I am happy using the methods described in this thread and if there is a plan to revisit the default names in a future version, then documentation and test cases for the current 'hack' may not even be needed since it could change anyway.

Thanks all for the discussion!

bodin avatar Jul 21 '21 13:07 bodin

I am also currently facing an issue with integrating micrometer with the data streams. I want to point out, that allowing for an empty index date suffix will not be enough to fix this issue. From data streams documentation:

You cannot add new documents to a data stream using the index API’s PUT //_doc/<_id> request format. To specify a document ID, use the PUT //_create/<_id> format instead. Only an op_type of create is supported. To add multiple documents with a single request, use the bulk API. Only create actions are supported.

it seems that as for now, the ElasticRegistry supports only index op_type:

    private final String indexLine;

[...]

protected ElasticMeterRegistry(ElasticConfig config, Clock clock, ThreadFactory threadFactory, HttpSender httpClient) {
        super(config, clock);
        config().namingConvention(new ElasticNamingConvention());
        this.config = config;
        indexDateFormatter = DateTimeFormatter.ofPattern(config.indexDateFormat());
        this.httpClient = httpClient;
        if (StringUtils.isNotEmpty(config.pipeline())) {
            indexLine = "{ \"index\" : {\"pipeline\":\"" + config.pipeline() + "\"} }\n";
        } else {
            indexLine = "{ \"index\" : {} }\n";
        }

        start(threadFactory);
    }

[...]

    String writeDocument(Meter meter, Consumer<StringBuilder> consumer) {
        StringBuilder sb = new StringBuilder(indexLine);
        String timestamp = generateTimestamp();
        String name = getConventionName(meter.getId());
        String type = meter.getId().getType().toString().toLowerCase();
        sb.append("{\"").append(config.timestampFieldName()).append("\":\"").append(timestamp).append('"')
                .append(",\"name\":\"").append(escapeJson(name)).append('"')
                .append(",\"type\":\"").append(type).append('"');

        List<Tag> tags = getConventionTags(meter.getId());
        for (Tag tag : tags) {
            sb.append(",\"").append(escapeJson(tag.getKey())).append("\":\"")
                    .append(escapeJson(tag.getValue())).append('"');
        }

        consumer.accept(sb);
        sb.append('}');

        return sb.toString();
    }

Error from elasticsearch api:

   "error":{
      "type":"illegal_argument_exception",
      "reason":"only write ops with an op_type of create are allowed in data streams"
   }

xsw2-2wsx avatar Jan 30 '22 14:01 xsw2-2wsx