[Docs] Note that output.elasticsearch.index setting is ignored with ILM enabled

Open ycombinator opened this issue 6 years ago • 27 comments
Based on https://discuss.elastic.co/t/filebeat-7-0-0-rename-index/177435.

Chatting with @urso, he confirmed that Beats will ignore the output.elasticsearch.index setting that allows users to override the index name used for Beats events, when ILM is enabled (via setup.ilm.enabled: true in 7.0+ and output.elasticsearch.ilm.enabled: true in 6.6+).

We should document this limitation. IMO, we should document it in both places:

  • where we document the output.elasticsearch.index setting: https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-template.html, and
  • where we document about setting up ILM in Beats: https://www.elastic.co/guide/en/beats/filebeat/current/ilm.html.
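A minimal sketch of the situation described above, assuming a Filebeat 7.x filebeat.yml. The host and the myapp index name are illustrative placeholders, not taken from the linked discussion:

```yaml
output.elasticsearch:
  hosts: ["localhost:9200"]             # placeholder host
  # Custom index name. Per this issue, it is ignored while ILM is enabled.
  index: "myapp-%{[agent.version]}-%{+yyyy.MM.dd}"

# ILM on: events go to the ILM write alias instead of the index above.
setup.ilm.enabled: true
```

With a configuration like this, events would be expected to land in the indices behind the default write alias rather than in myapp-* indices.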

ycombinator avatar Apr 18 '19 15:04 ycombinator

I just ran into this issue. Very frustrating it's not documented.

What tipped me off this might be an issue are these entries in the filebeat log.

2019-05-01T11:36:23.204-0400 INFO [index-management] idxmgmt/std.go:223 Auto ILM enable success.
2019-05-01T11:36:23.249-0400 INFO [index-management.ilm] ilm/std.go:134 do not generate ilm policy: exists=true, overwrite=false
2019-05-01T11:36:23.249-0400 INFO [index-management] idxmgmt/std.go:238 ILM policy successfully loaded.
2019-05-01T11:36:23.249-0400 INFO [index-management] idxmgmt/std.go:361 Set setup.template.name to '{filebeat-7.0.0 {now/d}-000001}' as ILM is enabled.
2019-05-01T11:36:23.249-0400 INFO [index-management] idxmgmt/std.go:366 Set setup.template.pattern to 'filebeat-7.0.0-*' as ILM is enabled.
2019-05-01T11:36:23.249-0400 INFO [index-management] idxmgmt/std.go:400 Set settings.index.lifecycle.rollover_alias in template to {filebeat-7.0.0 {now/d}-000001} as ILM is enabled.
2019-05-01T11:36:23.249-0400 INFO [index-management] idxmgmt/std.go:404 Set settings.index.lifecycle.name in template to {filebeat-7.0.0 map[

jhughes-mc avatar May 01 '19 15:05 jhughes-mc

It really happens :/

diogolmenezes avatar May 03 '19 19:05 diogolmenezes

The same logs @jhughes-mc posted ended up tipping me off as well, but this is NOT obvious at all from reading through the documentation.

JonasDeGendt avatar May 14 '19 13:05 JonasDeGendt

To make it worse, setup.ilm.enabled was 'true' by default in filebeat 7.2.0. Very annoying if you expect everything to work as before and then find out that for the past few weeks all entries have been landing in one index...

llech avatar Jul 10 '19 12:07 llech

++ would be great to get this further documented.

ElasticStewart avatar Aug 07 '19 01:08 ElasticStewart

I disabled ILM by specifying these lines in docker container's file /usr/share/filebeat/filebeat.yml:

setup.ilm.enabled: false
ilm.enabled: false

I'm using docker.elastic.co/beats/filebeat:7.3.0 image.

Full command:

docker run \
    -dit \
    --restart=unless-stopped \
    --name filebeat \
    --user=root \
    --network host \
    -v /srv/filebeat/filebeat.yml:/usr/share/filebeat/filebeat.yml:ro \
    -v /srv/filebeat/registry:/usr/share/filebeat/data/registry/filebeat:rw \
    docker.elastic.co/beats/filebeat:7.3.0

NullIsNot0 avatar Aug 20 '19 12:08 NullIsNot0

Is there any update? I don't want all my indices to have the same name. It makes sense to name indices based on their contents.

Like this:

output.elasticsearch:
  worker: 2
  hosts: http://elastic-staging-ingest:9200
  username: ${filebeat-elastic-username}
  password: ${filebeat-elastic-password}
  indices:
    - index: "filebeat-k8s-pubsub-%{+yyyy.MM.dd}"
      when.contains:
        input.type: "google-pubsub"
    - index: "filebeat-k8s-%{[kubernetes.labels.app_kubernetes_io/name]}-%{+yyyy.MM.dd}"
      when.contains:
        input.type: "container"

All this stuff with ILM enabled.

nerddelphi avatar Sep 12 '19 11:09 nerddelphi

Any updates?

ghost avatar Sep 13 '19 17:09 ghost

Bump. I encountered this undocumented behavior today. It was very confusing and wasted time, as I couldn't figure out why the index was being created with the default name rather than with the custom name specified in the configuration.

AbrahamLopez10 avatar Oct 08 '19 19:10 AbrahamLopez10

Without this feature I cannot use a Beats product that directly talks to Elasticsearch as I have to have both ILM and more than a single index. So I basically have to forward to Logstash first. Which does have this functionality:

https://github.com/logstash-plugins/logstash-output-elasticsearch/issues/798

The-New-Guy avatar Nov 14 '19 21:11 The-New-Guy

Doesn't ILM make use of setup.ilm.rollover_alias as index name instead of output.elasticsearch.index?

#============================== Setup ILM =====================================

# Configure index lifecycle management (ILM). These settings create a write
# alias and add additional settings to the index template. When ILM is enabled,
# output.elasticsearch.index is ignored, and the write alias is used to set the
# index name.

# Enable ILM support. Valid values are true, false, and auto. When set to auto
# (the default), the Beat uses index lifecycle management when it connects to a
# cluster that supports ILM; otherwise, it creates daily indices.
#setup.ilm.enabled: auto

# Set the prefix used in the index lifecycle write alias name. The default alias
# name is 'filebeat-%{[agent.version]}'.
#setup.ilm.rollover_alias: "filebeat"

Source: https://www.elastic.co/guide/en/beats/filebeat/7.4/filebeat-reference-yml.html

From what I understand there it should be possible to use patterns and variables in the rollover_alias.
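As a rough sketch of what that could look like, assuming Filebeat 7.x and a cluster that supports ILM. The alias prefix myapp is a made-up example, and the pattern shown appears to be the documented default:

```yaml
setup.ilm.enabled: auto
# Write alias used as the index name. Only an agent field is used here;
# event fields are not assumed to work in setup.* settings.
setup.ilm.rollover_alias: "myapp-%{[agent.version]}"
# Suffix appended to the first index created behind the alias.
setup.ilm.pattern: "{now/d}-000001"
```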

mback2k avatar Nov 15 '19 05:11 mback2k

@mback2k Wonderful. I don't know why I didn't bother to check the filebeat reference yml that comes with the product nor the last link the OP posted which is the actual documentation about it. I've been spending so much time bouncing around in the official docs I guess I got tunnel vision. I haven't tested this out yet but I'm sure you are right. Thanks for the assist.

To the maintainers of the official documentation, it would still be nice if this was documented in the filebeat template documentation as suggested by the OP. Or at least a small note with a link to the filebeat ILM settings. This does lead to confusion as seen above. Right now however, one problem is that when someone is researching this issue, they are likely to come across this GitHub issue which will leave many to believe this functionality isn't supported. If you click on the OP's very first link it takes you to a discussion where it is explicitly stated that this functionality is not supported. If you click on the second link you will not see any documentation about this behavior. Only in the 3rd link are different but related settings documented but most people at this point will have already concluded from the first and second link as well as the other comments on this issue that the functionality is simply not supported.

For those looking to use custom index names with ILM enabled, see here: https://www.elastic.co/guide/en/beats/filebeat/current/ilm.html

@ycombinator Would it be possible for you to update your original post to make it more clear that while the output.elasticsearch.index setting is ignored with ILM enabled, it is still possible using the ILM settings described in your 3rd link. I fear this issue is causing just as much confusion as the lack of documentation you are trying to address. Thanks.

The-New-Guy avatar Nov 15 '19 15:11 The-New-Guy

@The-New-Guy I can also see the following text in the Elasticsearch output configuration:

The index setting is ignored when index lifecycle management is enabled. If you’re sending events to a cluster that supports index lifecycle management, see Configure index lifecycle management to learn how to change the index name.

Source: https://www.elastic.co/guide/en/beats/filebeat/7.4/elasticsearch-output.html#index-option-es

But you are right, the warnings about this being ignored should probably be more highlighted and easier to find. I was running into the same issue myself, but since I am still stuck on 6.x I haven't tested this new behavior myself.

mback2k avatar Nov 15 '19 15:11 mback2k

I just spent 4 hours with this. Would be great if filebeat throws WARNING into the logs -- i.e. elasticsearch.index ignored when ILM is enabled.

LumirH avatar Nov 24 '19 05:11 LumirH

As part of updating the documentation, we will also want to update https://www.elastic.co/guide/en/beats/filebeat/current/configuration-template.html to indicate that setup.template.name and setup.template.pattern are used when the index option is used in ES output. When ILM is enabled, it creates an index template using the setup.ilm.rollover_alias as the index template name (and will ignore setup.template.name if specified).
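A sketch of the non-ILM case described above, assuming Filebeat 7.x. The host and the myapp names are hypothetical:

```yaml
# ILM off, so output.elasticsearch.index is honored.
setup.ilm.enabled: false

# Template name and pattern are set alongside the custom index name;
# the pattern has to match the index names produced below.
setup.template.name: "myapp-%{[agent.version]}"
setup.template.pattern: "myapp-%{[agent.version]}-*"

output.elasticsearch:
  hosts: ["localhost:9200"]             # placeholder host
  index: "myapp-%{[agent.version]}-%{+yyyy.MM.dd}"
```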

ppf2 avatar Feb 19 '20 02:02 ppf2

Is there any update? I don't want all my indices to have the same name. It makes sense to name indices based on their contents.

Like this:

output.elasticsearch:
  worker: 2
  hosts: http://elastic-staging-ingest:9200
  username: ${filebeat-elastic-username}
  password: ${filebeat-elastic-password}
  indices:
    - index: "filebeat-k8s-pubsub-%{+yyyy.MM.dd}"
      when.contains:
        input.type: "google-pubsub"
    - index: "filebeat-k8s-%{[kubernetes.labels.app_kubernetes_io/name]}-%{+yyyy.MM.dd}"
      when.contains:
        input.type: "container"

All this stuff with ILM enabled.

Are there any updates? What should I do if I want to use ILM + indices?

vnazarenko avatar Feb 21 '20 12:02 vnazarenko

@urso Can you respond to @vnazarenko?

This issue is starting to mix up a couple of problems:

  • The docs about the index setting don't clearly indicate that the setting is not used when you enable ILM. I think it's there, but not in the config files, and not everywhere that we mention the index setting. This is a documentation problem.
  • Users want to be able to set indices dynamically. It looks like rollover_alias does support format strings, so you can name the index dynamically, but AFAIK you don't have fine control over the index naming like you do with the indices setting. This is a problem that the dev team needs to respond to. I can improve the docs by showing an example that includes a format string, so I'll add that to my to-do list.

dedemorton avatar Feb 21 '20 19:02 dedemorton

Comparing the docs and the filebeat.reference.yml, I indeed can't find the behavior consistently documented in our docs. The reference config file states this (link):

#============================== Setup ILM =====================================

# Configure index lifecycle management (ILM). These settings create a write
# alias and add additional settings to the index template. When ILM is enabled,
# output.elasticsearch.index is ignored, and the write alias is used to set the
# index name.

In addition we should also document the behavior with the index/indices settings in the elasticsearch output.

All in all, there are 3 configuration namespaces that interact, sometimes in non-obvious ways:

  • setup.template
  • setup.ilm
  • output.elasticsearch

If ILM is configured, then output.elasticsearch.index/indices will be overwritten, because ILM requires us to use a write alias instead of templates. The template pattern must match the names generated via rollover_alias or index/indices. If the latter is not the case, then templates (and the ILM policy) are not applied.

Event fields CANNOT be used with settings in the setup namespace.

The interaction between the settings should be documented in each of the reference pages + the reference.yml files.

The fact that we have some interaction between these settings doesn't make it easier to figure this out just from reference documentation. @dedemorton I would propose to introduce another documentation section named 'Index Management'. The section would contain the current docs for template setup and index lifecycle management, plus additional docs explicitly explaining how the different settings interact. Given that we also have some Kibana-related setup, we might even consider combining them into a 'Stack Management' section. WDYT?

Ingesting data and Index Management are separate tasks for Beats. Index Setup is done upfront, before the first event is published. This is what the filebeat setup command is recommended for. Separating the two also allows for proper role-based access to Elasticsearch. With Index Management being separated from event publishing, no event fields can be used in any of the setup settings. Support for format strings is limited to agent name and version only (NOTE: we should check if this is documented as well). This limitation will not be resolved in Beats, but the team is working on some alternative solutions to overcome these limitations in the future.

As of today there are two common workarounds one can apply.

First of all, indices are managed via templates. By using a common naming scheme (or adapting the template pattern via setup.template.pattern) one can make multiple indices match the template. This will require ILM to be disabled (setup.ilm.enabled: false). The default template pattern is filebeat-7.6.0-* (if the filebeat version is 7.6.0). Changing the pattern to filebeat-* will widen the scope of matching index names to any index name that is prefixed with filebeat-.

It is not really recommended to remove the agent version from the index name, as this can create mapping conflicts when updating Beats in the future or when running different Beats versions at the same time. This approach also requires old templates to be removed during upgrade, making the upgrade process somewhat more delicate. If one does not want to modify how templates match, then the index naming scheme should be set up to be something like this: output.elasticsearch.index: filebeat-%{[agent.version]}-k8s-%{[some other field]}-%{+yyyy.MM.dd}.

The setup.template.pattern setting only allows one pattern to be configured. One can export the template via filebeat export template and add additional patterns to the resulting JSON file. The modified template can be auto-loaded by filebeat by configuring the setup.template.json.* settings.
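Loading the modified template could look roughly like this. It is only a sketch: it assumes the exported template was edited and saved as filebeat-template.json (a hypothetical file name), that filebeat-custom is an acceptable template name (also hypothetical), and that the setup.template.json.* options behave as described in the Filebeat 7.x reference:

```yaml
setup.ilm.enabled: false

setup.template.json.enabled: true
# Path to the edited output of `filebeat export template`.
setup.template.json.path: "filebeat-template.json"
# Name under which the template is loaded into Elasticsearch.
setup.template.json.name: "filebeat-custom"
```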

One limitation with ILM setup is that it requires a write alias. If we target multiple indices, then we also need a template and write alias per index. The workaround requires us to set up these resources upfront by running filebeat setup per index. All in all, it is good practice to separate setup from ingesting data, as this allows us to exert much better control over privileges. The solution to this requires us to split the configuration file into two or more files.

Filebeat can load multiple configuration files overwriting existing settings via -c <next file> or directly overwriting settings from CLI via -E. We will make use of this feature here.

Let's prepare the configuration file for ingesting data first (filebeat.yml):

...
# we tell filebeat to not do any setup when ingesting data (we did this upfront)
setup.ilm.enabled: false
setup.template.enabled: false
...

output.elasticsearch:
  ...
  index: filebeat-7.6.0 # configure the index name to use the configured write alias

Next let's create a configuration file preparing our ILM setup (setup.yml):

# declare a variable to reduce repetition
basename: filebeat

# ILM setup
# ---------
setup.ilm.enabled: true
setup.ilm.rollover_alias: "${basename}-%{[agent.version]}" # will expand to filebeat-7.6.0
setup.ilm.policy_name: ${basename}
setup.ilm.check_exists: false
setup.ilm.overwrite: true 

# Template setup
# --------------

setup.template.name: "${basename}-%{[agent.version]}"
setup.template.pattern: "${basename}-%{[agent.version]}-*"

Now I can set up my filebeat index with write alias like this: filebeat setup -c filebeat.yml -c setup.yml.

Having basename as a variable I can use this command in a script to prepare multiple indices: filebeat setup -c filebeat.yml -c setup.yml -E basename=other. This will create an ILM policy, template, index, and write alias prefixed with other instead of filebeat.

For automation I can also create more configuration files with overwrites and call them like this:

filebeat -c filebeat.yml -c setup.yml -c myindex.yml

The myindex.yml file can overwrite any setup.ilm or setup.template setting.
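For illustration, such an override file might look like the sketch below. The myindex names are hypothetical, and the keys are the same ones used in setup.yml above:

```yaml
# myindex.yml: passed last on the command line, so these values
# take precedence over the ones in setup.yml.
setup.ilm.rollover_alias: "myindex-%{[agent.version]}"
setup.ilm.policy_name: "myindex"

setup.template.name: "myindex-%{[agent.version]}"
setup.template.pattern: "myindex-%{[agent.version]}-*"
```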

urso avatar Feb 22 '20 09:02 urso

Thanks for the detail! There are a lot of details here for a user to ingest. I do plan to create a section about index management (it's part of the restructuring I'm doing to streamline the getting started guides). It's been in my backlog for quite awhile, but I'll be getting to it soon.

dedemorton avatar Feb 24 '20 20:02 dedemorton

Please update the documentation for the ILM configuration part. I've spent a whole working day just setting this up and it still doesn't work as expected. Following the current official documentation is really frustrating.

duclm2609 avatar Feb 29 '20 03:02 duclm2609

I'm still not sure how I can attach/configure an index template with simple custom settings like refresh_interval when it's being overwritten by ILM.

magicpotion avatar Mar 26 '20 07:03 magicpotion

@urso, thanks for the detailed explanation and code example, I am trying to understand the example you provided.

In setup.yml, can you help me understand why this rollover_alias will write to filebeat-7.6.0

setup.ilm.rollover_alias: "${basename}-%{[agent-version]}" # will expand to filebeat-7.6.0

And can we write the index name with variable in the filebeat.yml, or this index name doesn't matter as long as it's writable?

output.elasticsearch:
  index: filebeat-%{[agent-version]}

if I have two filebeats with different basename variables, will write alias filebeat-7.6.0 still work?

My last question is, do we need to run this like

filebeat setup -c filebeat.yml -c setup.yml
filebeat -c filebeat.yml -c setup.yml

tomqwu avatar May 13 '20 02:05 tomqwu

Hi @magicpotion, @tomqwu, I'd recommend to ask operational questions on discuss.elastic.co. On the discuss forum you will normally reach a wider audience of active users and us developers as well. The issue is already quite big with loads of details and I would like to keep discussions focused on the actual docs.

The alias acts as a pointer to the current index in Elasticsearch. The actual index in Elasticsearch will become something like filebeat-<version>-<creation-date>-000001. When the ILM policy kicks in, another index like filebeat-<version>-<creation-date>-000002 will be created. This means that Beats no longer write into one index, but into a set of indices. These indices are managed by Elasticsearch, not by Beats anymore. When Beats write to filebeat-7.6.0, Elasticsearch will resolve the current index name automatically and forward the writes to the current index. You can think of the alias as a symlink that eventually gets updated. When sending to Elasticsearch, the alias name can be used instead of the index name.

If I understand your setup correctly you want to run it like this:

filebeat setup -c filebeat.yml -c setup.yml
filebeat -c filebeat.yml

With filebeat.yml having ILM and template setup disabled. The setup.yml will overwrite the settings in filebeat.yml, as it is mentioned last.

if I have two filebeats with different basename variables, will write alias filebeat-7.6.0 still work?

Why do you have 2 basename variables? In my example I used basename as the prefix for the template, ILM policy and index names. This means that the basename should be the same for all Beats wanting to send to the same index.

urso avatar May 13 '20 11:05 urso

@sholokhov17 Can you please share your case on discuss.elastic.co. In the forums we have users and developers helping. Your config/goal is not immediately clear to me and I'd prefer if the issue, which discusses docs changes, does not turn into a list of support cases. Thanks.

urso avatar Aug 06 '20 13:08 urso

Why can't we get this bug/feature documented? I'm pretty sure that dozens of developers have now wasted hundreds of hours chasing their tails and losing their minds as to why a setting is silently dropped.

RudeDude avatar Nov 18 '21 17:11 RudeDude

Would be good if it were documented on https://www.elastic.co/guide/en/beats/filebeat/current/change-index-name.html

agxs avatar Feb 16 '22 15:02 agxs

From @urso's comment

For those looking to use custom index names with ILM enabled, see here: https://www.elastic.co/guide/en/beats/filebeat/current/ilm.html

Since the new version, it's now -> https://www.elastic.co/guide/en/beats/filebeat/7.17/ilm.html

gemmadlou avatar Sep 29 '22 09:09 gemmadlou

Pinging @elastic/obs-docs (Team:Docs)

elasticmachine avatar Nov 02 '22 19:11 elasticmachine

Edited the title to reflect the proposal that we add a new section that describes index management.

There are places in the documentation where we do try to clarify limitations, but they are easy to overlook and confusing when you can't see the whole picture.

dedemorton avatar Nov 02 '22 19:11 dedemorton

Hi there! Is it possible, with the explanation posted by @urso in https://github.com/elastic/beats/issues/11866#issuecomment-589936554, to have ILM enabled and a custom index name?

Also, is it possible to have a similar setup with the ECK operator?

Thanks!!!

FranAguiar avatar Jan 20 '23 10:01 FranAguiar