Elasticsearch fails to start in Docker when `elasticsearch.yml` is bind mounted

Open jkakavas opened this issue 3 years ago • 16 comments

Elasticsearch Version

8.0.0

Installed Plugins

No response

Java Version

bundled

OS Version

N/A

Problem Description

Elasticsearch fails to start when elasticsearch.yml is bind mounted to a file on the host, with a "Device or resource busy" error. This was possibly introduced with the changes for the autoconfiguration of the security features and is triggered when we attempt to write the configuration to the elasticsearch.yml file (AutoConfigureNode#fullyWriteFile).
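
For context: on Linux, a rename cannot replace a path that is itself a mount point, and a bind-mounted file is one. Moving the temporary file over elasticsearch.yml therefore fails with "Device or resource busy". The sketch below only illustrates that behaviour, together with one possible fallback (overwriting the file in place). `write_config` is a hypothetical helper, not something Elasticsearch or the Docker image provides, and the fallback gives up the atomicity that the rename is there for.

```shell
# Hypothetical helper (not part of Elasticsearch): replace the file "$1"
# with whatever is read from standard input.
write_config() {
    target=$1
    # Create the temporary file next to the target, as auto-configuration does.
    tmp=$(mktemp "$target.XXXXXX") || return 1
    cat > "$tmp" || { rm -f "$tmp"; return 1; }

    # Preferred path: rename the temporary file over the target.
    # This is the step that fails when the target is a bind mount.
    if mv -f "$tmp" "$target" 2>/dev/null; then
        return 0
    fi

    # Fallback (assumption): overwrite the existing file in place.
    # Not atomic, but it does not need to replace the mount point.
    if cat "$tmp" > "$target"; then
        rm -f "$tmp"
        return 0
    fi
    rm -f "$tmp"
    return 1
}
```

With a helper like this, `printf 'cluster.name: demo\n' | write_config /path/to/elasticsearch.yml` would take the rename path for a regular file and the in-place path for a bind-mounted one, provided the file is writable by the calling user.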

Steps to Reproduce

docker run --name oh-noes-this-fails -p 9200:9200 -v /absolute/path/to/a/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml -it docker.elastic.co/elasticsearch/elasticsearch:8.0.0

or

docker run --name oh-noes-this-fails-too -p 9200:9200 --mount type=bind,source=/absolute/path/to/a/elasticsearch.yml,target=/usr/share/elasticsearch/config/elasticsearch.yml -it docker.elastic.co/elasticsearch/elasticsearch:8.0.0

fails with

Exception in thread "main" java.nio.file.FileSystemException: /usr/share/elasticsearch/config/elasticsearch.yml.R0_9BZ4hRx-v8zK3F0U-Bw.tmp -> /usr/share/elasticsearch/config/elasticsearch.yml: Device or resource busy
	at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:100)
	at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:106)
	at java.base/sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:416)
	at java.base/sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:267)
	at java.base/java.nio.file.Files.move(Files.java:1432)
	at org.elasticsearch.xpack.security.cli.AutoConfigureNode.fullyWriteFile(AutoConfigureNode.java:1136)
	at org.elasticsearch.xpack.security.cli.AutoConfigureNode.fullyWriteFile(AutoConfigureNode.java:1148)
	at org.elasticsearch.xpack.security.cli.AutoConfigureNode.execute(AutoConfigureNode.java:687)
	at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:77)
	at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:112)
	at org.elasticsearch.cli.Command.main(Command.java:77)
	at org.elasticsearch.xpack.security.cli.AutoConfigureNode.main(AutoConfigureNode.java:157)

Logs (if relevant)

No response

jkakavas avatar Mar 29 '22 17:03 jkakavas

Pinging @elastic/es-security (Team:Security)

elasticmachine avatar Mar 29 '22 17:03 elasticmachine

Clarification: From the stack trace, AutoConfigureNode CLI is experiencing the error, not Elasticsearch.

Startup: Container => /usr/local/bin/docker-entrypoint.sh => /usr/share/elasticsearch/bin/elasticsearch

Looking at /usr/share/elasticsearch/bin/elasticsearch, it seems like the variable ATTEMPT_SECURITY_AUTO_CONFIG=true triggers a call to AutoConfigureNode CLI before Elasticsearch. The stack trace is for AutoConfigureNode CLI, not Elasticsearch.

Excerpt of the AutoConfigure CLI command:

ES_MAIN_CLASS=org.elasticsearch.xpack.security.cli.AutoConfigureNode \
ES_ADDITIONAL_SOURCES="x-pack-env;x-pack-security-env" \
ES_ADDITIONAL_CLASSPATH_DIRECTORIES=lib/tools/security-cli \
bin/elasticsearch-cli "${ARG_LIST[@]}" <<<"$KEYSTORE_PASSWORD"

Excerpt of the Elasticsearch daemon command:

    "$JAVA" \
    "$XSHARE" \
    $ES_JAVA_OPTS \
    -Des.path.home="$ES_HOME" \
    -Des.path.conf="$ES_PATH_CONF" \
    -Des.distribution.flavor="$ES_DISTRIBUTION_FLAVOR" \
    -Des.distribution.type="$ES_DISTRIBUTION_TYPE" \
    -Des.bundled_jdk="$ES_BUNDLED_JDK" \
    -cp "$ES_CLASSPATH" \
    org.elasticsearch.bootstrap.Elasticsearch \
    "${ARG_LIST[@]}" \
    <<<"$KEYSTORE_PASSWORD" &

justincr-elastic avatar Mar 29 '22 18:03 justincr-elastic

Reproduce original issue by executing

> docker run --name elastic1 -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" -v C:\Docker\elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml --rm -it docker.elastic.co/elasticsearch/elasticsearch:8.0.0 
Exception in thread "main" java.nio.file.FileSystemException: /usr/share/elasticsearch/config/elasticsearch.yml.Occjcc_mS06vpoRLwlpUwA.tmp -> /usr/share/elasticsearch/config/elasticsearch.yml: Device or resource busy
        at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:100)
        at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:106)
        at java.base/sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:416)
        at java.base/sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:267)
        at java.base/java.nio.file.Files.move(Files.java:1432)
        at org.elasticsearch.xpack.security.cli.AutoConfigureNode.fullyWriteFile(AutoConfigureNode.java:1136)
        at org.elasticsearch.xpack.security.cli.AutoConfigureNode.fullyWriteFile(AutoConfigureNode.java:1148)
        at org.elasticsearch.xpack.security.cli.AutoConfigureNode.execute(AutoConfigureNode.java:687)
        at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:77)
        at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:112)
        at org.elasticsearch.cli.Command.main(Command.java:77)
        at org.elasticsearch.xpack.security.cli.AutoConfigureNode.main(AutoConfigureNode.java:157)

Extract interesting files from container (Prerequisite: Add C:\Docker to file sharing accept list)

> docker run --name elastic1 -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" -v "C:\Docker":/mnt/local --rm -it docker.elastic.co/elasticsearch/elasticsearch:8.0.0 bash
elasticsearch@9d37e1eb7777:~$ cp /usr/share/elasticsearch/config/elasticsearch.yml /mnt/local/elasticsearch.yml
elasticsearch@9d37e1eb7777:~$ cp /usr/share/elasticsearch/config/elasticsearch.yml /mnt/local/elasticsearch2.yml
elasticsearch@9d37e1eb7777:~$ cp /usr/local/bin/docker-entrypoint.sh               /mnt/local/docker-entrypoint.sh
elasticsearch@9d37e1eb7777:~$ cp /usr/share/elasticsearch/bin/elasticsearch        /mnt/local/elasticsearch

Start in bash as root user, switch to elasticsearch, manually run docker-entrypoint.sh to reproduce the original error

> docker run -u root --name elastic1 -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" -v C:\Docker\elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml -v C:\Docker\elasticsearch2.yml:/usr/share/elasticsearch/config/elasticsearch2.yml --rm -it docker.elastic.co/elasticsearch/elasticsearch:8.0.0 bash

root@62b736fca663:/usr/share/elasticsearch# ls -l /usr/share/elasticsearch/config/elasticsearch*.yml
-rw-rw-r-- 1 root root 1042 Feb  3 16:47 /usr/share/elasticsearch/config/elasticsearch-plugins.example.yml
-rwxr-xr-x 1 root root   53 Mar 29 19:01 /usr/share/elasticsearch/config/elasticsearch.yml
-rwxr-xr-x 1 root root   53 Mar 29 19:01 /usr/share/elasticsearch/config/elasticsearch2.yml

root@62b736fca663:/usr/share/elasticsearch# df -a | grep elasticsearch
grpcfuse       998896636 190624520 808272116  20% /usr/share/elasticsearch/config/elasticsearch.yml
grpcfuse       998896636 190624520 808272116  20% /usr/share/elasticsearch/config/elasticsearch2.yml

root@62b736fca663:/usr/share/elasticsearch# su - elasticsearch

elasticsearch@62b736fca663:~$ /usr/local/bin/docker-entrypoint.sh
Exception in thread "main" java.nio.file.FileSystemException: /usr/share/elasticsearch/config/elasticsearch.yml.JrtBhUSPQ4eNKgiJ3atKQQ.tmp -> /usr/share/elasticsearch/config/elasticsearch.yml: Device or resource busy
        at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:100)
        at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:106)
        at java.base/sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:416)
        at java.base/sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:267)
        at java.base/java.nio.file.Files.move(Files.java:1432)
        at org.elasticsearch.xpack.security.cli.AutoConfigureNode.fullyWriteFile(AutoConfigureNode.java:1136)
        at org.elasticsearch.xpack.security.cli.AutoConfigureNode.fullyWriteFile(AutoConfigureNode.java:1148)
        at org.elasticsearch.xpack.security.cli.AutoConfigureNode.execute(AutoConfigureNode.java:687)
        at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:77)
        at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:112)
        at org.elasticsearch.cli.Command.main(Command.java:77)
        at org.elasticsearch.xpack.security.cli.AutoConfigureNode.main(AutoConfigureNode.java:157)

elasticsearch@62b736fca663:~$ ls -l /usr/share/elasticsearch/config/elasticsearch*.yml
-rw-rw-r-- 1 root          root          1042 Feb  3 16:47 /usr/share/elasticsearch/config/elasticsearch-plugins.example.yml
-rwxr-xr-x 1 elasticsearch elasticsearch   53 Mar 29 19:01 /usr/share/elasticsearch/config/elasticsearch.yml
-rwxr-xr-x 1 root          root            53 Mar 29 19:01 /usr/share/elasticsearch/config/elasticsearch2.yml

justincr-elastic avatar Mar 29 '22 20:03 justincr-elastic

Check elasticsearch.yml ownership and permissions before and after manually running docker-entrypoint.sh.

>docker run -u root --name elastic1 -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" --rm -it docker.elastic.co/elasticsearch/elasticsearch:8.0.0 bash

root@40b71bc4c3ae:/usr/share/elasticsearch# ls -l /usr/share/elasticsearch/config/elasticsearch.yml
-rw-rw-r-- 1 root root 53 Feb  3 22:53 /usr/share/elasticsearch/config/elasticsearch.yml

root@40b71bc4c3ae:/usr/share/elasticsearch# su - elasticsearch

elasticsearch@40b71bc4c3ae:~$ /usr/local/bin/docker-entrypoint.sh > /dev/null 2> /dev/null &
[1] 18

elasticsearch@40b71bc4c3ae:~$ ls -l /usr/share/elasticsearch/config/elasticsearch.yml
-rw-rw-r-- 1 elasticsearch elasticsearch 1106 Mar 29 20:47 /usr/share/elasticsearch/config/elasticsearch.yml

justincr-elastic avatar Mar 29 '22 20:03 justincr-elastic

If the operator does not mount elasticsearch.yml, I assume they want elasticsearch.yml autoconfiguration. If the operator mounts elasticsearch.yml, I assume they don't want elasticsearch.yml autoconfiguration.

From looking at the startup scripts, I don't see an option to skip autoconfiguration. The only way seems to be if ENROLLMENT_TOKEN is set.

  • /usr/local/bin/docker-entrypoint.sh looks for it and calls /usr/share/elasticsearch/bin/elasticsearch --enrollment-token $ENROLLMENT_TOKEN .
  • /usr/share/elasticsearch/bin/elasticsearch only skips autoconfiguration (i.e. ATTEMPT_SECURITY_AUTO_CONFIG=false) if one of these parameters is present: --enrollment-token, --help, -h, --version, or -v
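
The argument check described in the second bullet could be sketched roughly as follows. This is a simplified illustration based only on the flags listed in this comment, not a copy of the real bin/elasticsearch script, and the function name `attempt_security_auto_config` is made up for the example.

```shell
# Hypothetical sketch: print "false" if any argument is one of the flags
# that disable security auto-configuration, otherwise print "true".
attempt_security_auto_config() {
    for arg in "$@"; do
        case "$arg" in
            --enrollment-token|--help|-h|--version|-v)
                echo false
                return 0
                ;;
        esac
    done
    echo true
}
```

Note that nothing in this check looks at whether elasticsearch.yml is mounted, which is why a plain `docker run` with a bind-mounted file still goes through auto-configuration.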

justincr-elastic avatar Mar 29 '22 21:03 justincr-elastic

Note that Kibana, like Elasticsearch, also overwrites its configuration file when it writes content. So should the initialization output be separated from the actual configuration file, similar to a .conf.d file, for example by adding the concept of an elasticsearch-d.yml that is responsible for initialization?

linghengqian avatar Mar 30 '22 07:03 linghengqian

If the operator does not mount elasticsearch.yml, I assume they want elasticsearch.yml autoconfiguration. If the operator mounts elasticsearch.yml, I assume they don't want elasticsearch.yml autoconfiguration.

If you're proposing this should be the logic we use in the auto-configuration, I concur.

Should the same logic extend to the config directory?

albertzaharovits avatar Mar 30 '22 18:03 albertzaharovits

If the operator mounts elasticsearch.yml, I assume they don't want elasticsearch.yml autoconfiguration.

I'd just like to add that this is not always the case. Whether we should accept that as a limitation and work with it is another topic (which I probably also agree with), but, for instance, in both cases where this was reported in the forums, the users wanted to set a specific value (i.e. network.host, to affect the SANs of the HTTP certificate) but still take advantage of the security features.

jkakavas avatar Mar 31 '22 02:03 jkakavas

We briefly discussed this today in our weekly sync. There was consensus that mounting only the elasticsearch.yml file, but leaving the rest of the config directory on the docker container, is not a configuration that works well with Security auto-configuration (primarily because persisting only the generated yml file, without the associated keystore and certs, is not useful for subsequent container runs).

I have taken an action item to investigate what is the consistent way to react to such a configuration, from starting without security auto-conf, or not starting at all. I'll assign this to me.

albertzaharovits avatar Mar 31 '22 20:03 albertzaharovits

Got this issue with version 8.1.2.

Having a specific configuration file elasticsearch.yml is simpler to handle than defining all the env variables in the docker-compose.yaml, which can get very verbose when using multiple docker services.

mister-good-deal avatar Apr 06 '22 09:04 mister-good-deal

This is due to the container using elasticsearch:elasticsearch as the user. Docker containers are intended to run everything via root:root.

All you need to do is set the ownerID and groupID of the directories being mounted to 1000:1000

ex:


- name: Create elk directory if it does not exist
  ansible.builtin.file:
    path: /opt/elk/{{ item.name }}
    state: directory
    mode: '0755'
    owner: "{{ item.oid }}"
    group: "{{ item.gid }}"
  with_items:
    - { name: "elasticsearch/config", oid: 1000, gid: 1000}
    - { name: "elasticsearch/data", oid: 1000, gid: 1000}
    - { name: "kibana/config", oid: 1000, gid: 1000}
    - { name: "kibana/data", oid: 1000, gid: 1000}
  become: yes

greenscar avatar Jun 30 '22 16:06 greenscar

Hi all, is there any progress on this bug? I got this issue with version 8.3.2. Setting the ownerID and groupID of the mounted directories to 1000:1000 does not resolve the issue.

Milana-Gelman-PX avatar Jul 12 '22 08:07 Milana-Gelman-PX

I am using env vars instead of mounting elasticsearch.yml. For example, I add ELASTICSEARCH_FS_SNAPSHOT_REPO_PATH=/mnt/backup in order to set up the snapshot repo.

chance2021 avatar Jul 15 '22 01:07 chance2021

I have taken an action item to investigate what is the consistent way to react to such a configuration, from starting without security auto-conf, or not starting at all.

@albertzaharovits did you get anywhere with this?

My feeling is that we should do something like (if we determine auto-configuration is needed)

  1. Try to write a temporary file to the config directory. If that fails, then we know we won't be successful with auto-configuration, and we should skip it
  2. Check whether that temporary file has the same mount point as elasticsearch.yml and elasticsearch.keystore. If not, then we can assume that auto-configuration will do the wrong thing (that is, it would write files to 2 or more different mount points, leading to one or both being orphaned). In that case we should clean up the temp file and skip the rest of auto-configuration. We can probably just check that the output from findmnt --noheadings --output TARGET --target ${file} is the same for all 3 files (temp, yml, keystore)
  3. Otherwise, remove the temp file and proceed with auto-configuration
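
The three steps above can be sketched as a small pre-check. This is only an illustration of the proposal, not an implementation that exists in Elasticsearch. The helper names (`mount_target`, `same_mount_point`, `can_auto_configure`) and the probe file name are invented for the example, and it assumes that elasticsearch.yml and elasticsearch.keystore already exist in the config directory, since findmnt cannot resolve a path that does not exist.

```shell
# Print the mount point that contains the given path, using the findmnt
# invocation suggested in step 2.
mount_target() {
    findmnt --noheadings --output TARGET --target "$1"
}

# Succeed only if every path given lives under the same mount point.
same_mount_point() {
    first=$(mount_target "$1") || return 1
    shift
    for path in "$@"; do
        current=$(mount_target "$path") || return 1
        [ "$current" = "$first" ] || return 1
    done
    return 0
}

# Hypothetical pre-check: succeed if auto-configuration looks safe to run
# against the config directory "$1".
can_auto_configure() {
    config_dir=$1
    # Step 1: the config directory must accept a new temporary file.
    probe=$(mktemp "$config_dir/.autoconfig-probe.XXXXXX" 2>/dev/null) || return 1
    # Step 2: temp file, yml and keystore must share one mount point.
    if same_mount_point "$probe" \
        "$config_dir/elasticsearch.yml" \
        "$config_dir/elasticsearch.keystore"; then
        result=0
    else
        result=1
    fi
    # Step 3: remove the probe in either case, then report the outcome.
    rm -f "$probe"
    return "$result"
}
```

With only elasticsearch.yml bind mounted, the probe file and the yml resolve to different mount points, so `can_auto_configure /usr/share/elasticsearch/config` would fail and auto-configuration would be skipped.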

We should talk about whether to do that for all packaging types, or just for docker.

tvernum avatar Jul 15 '22 01:07 tvernum

Bumping this as this causes issues when trying to run the elasticsearch container as a rootless container using systemd.

I have tried copying some files (the certs, the .yml and the .keystore), bind mounting them, and then adding -e ATTEMPT_SECURITY_AUTO_CONFIG=false to podman run, but I could not get the correct enrollment token.

I would very much like to have all the security bells and whistles autoconfigured for me + persistent storage :)

psychogun avatar Aug 08 '22 21:08 psychogun

I think I got it working: rootless containers running at boot without the user having to log in. Here is a little write-up. Hopefully you guys can make this a bit easier!

cat /etc/*-release
Rocky Linux release 8.6 (Green Obsidian)
NAME="Rocky Linux"
VERSION="8.6 (Green Obsidian)"

Initial start to generate some files:

podman run --name es01 --net elastic -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" -v /podman/elasticsearch/data:/usr/share/elasticsearch/data:Z -it docker.elastic.co/elasticsearch/elasticsearch:8.3.3

Ctrl + C to quit

cd ~/podman/elasticsearch/config
podman cp es01:/usr/share/elasticsearch/config/elasticsearch.yml .
podman cp es01:/usr/share/elasticsearch/config/elasticsearch.keystore .
mkdir ~/podman/elasticsearch/config/certs
cd certs
podman cp es01:/usr/share/elasticsearch/config/certs/http.p12 .
podman cp es01:/usr/share/elasticsearch/config/certs/transport.p12 .
podman cp es01:/usr/share/elasticsearch/config/certs/http_ca.crt .


podman stop es01
podman rm es01


rm -rf /podman/elasticsearch/data/*
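
Before bind mounting the copied files, it may be worth confirming that the whole set is present, since the yml on its own is of little use without the keystore and certificates. A minimal sketch, assuming exactly the file names used in the podman cp commands above; `missing_config_files` is a hypothetical helper, not part of any Elastic tooling.

```shell
# Hypothetical helper: print the auto-generated files that are missing
# from the host config directory "$1", one relative path per line.
missing_config_files() {
    config_dir=$1
    for rel in \
        elasticsearch.yml \
        elasticsearch.keystore \
        certs/http.p12 \
        certs/transport.p12 \
        certs/http_ca.crt; do
        [ -f "$config_dir/$rel" ] || echo "$rel"
    done
}
```

Running `missing_config_files ~/podman/elasticsearch/config` prints nothing when all five files were copied.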

Let us bind mount:

podman run --name es01 --net elastic -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" -e ATTEMPT_SECURITY_AUTO_CONFIG=false -v ~/podman/elasticsearch/config/certs/http.p12:/usr/share/elasticsearch/config/certs/http.p12:Z -v ~/podman/elasticsearch/config/certs/transport.p12:/usr/share/elasticsearch/config/certs/transport.p12:Z -v ~/podman/elasticsearch/config/certs/http_ca.crt:/usr/share/elasticsearch/config/certs/http_ca.crt:Z -v ~/podman/elasticsearch/config/elasticsearch.keystore:/usr/share/elasticsearch/config/elasticsearch.keystore:Z -v ~/podman/elasticsearch/config/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml:Z -v /podman/elasticsearch/data:/usr/share/elasticsearch/data:Z -dt docker.elastic.co/elasticsearch/elasticsearch:8.3.3

Let us get the enrollment token (a lot of errors here, but it spits out the code in the end):

podman exec -it es01 /usr/share/elasticsearch/bin/elasticsearch-create-enrollment-token -s kibana

Let us start Kibana and use our enrollment procedure visiting the website:5601 and grabbing the code from terminal:

podman run --name kib-01 --net elastic -p 5601:5601 -v ~/podman/kibana/data/:/usr/share/kibana/data/:Z docker.elastic.co/kibana/kibana:8.3.3

Ctrl + C to stop kibana.

But, let us start it again so we can grab the kibana.yml configuration file:

podman start kib-01

mkdir ~/podman/kibana/config
cd ~/podman/kibana/config
podman cp kib-01:/usr/share/kibana/config/kibana.yml . 

Stop it, using podman stop kib-01.

Let us remove this file:

rm ~/podman/kibana/data/uuid

This will be our final run command for Kibana:

podman run --name kib-01 --net elastic -p 5601:5601 -v ~/podman/kibana/config/kibana.yml:/usr/share/kibana/config/kibana.yml:Z -v ~/podman/kibana/data/:/usr/share/kibana/data:Z -e SERVER_PUBLICBASEURL=http://192.168.10.44 -dt docker.elastic.co/kibana/kibana:8.3.3

Now I have working persistent configuration and I can generate systemd unit files (??). Let us also stop the containers and remove them:

cd ~/.config/systemd/user
podman generate systemd --new --files --name es01
podman generate systemd --new --files --name kib-01 

podman stop kib-01
podman rm kib-01
podman stop es01
podman rm es01

Using systemctl for now:

systemctl --user enable --now container-es01.service
systemctl --user enable --now container-kib-01.service

But hey, what about passwords? This throws a lot of errors; although in the end it works and gives me a valid password for the elastic user:

podman exec -it es01 /usr/share/elasticsearch/bin/elasticsearch-reset-password -u elastic

I can reboot the host computer and everything works without having to log in (loginctl enable-linger).

The transport is now SSL encrypted, I have all the bells and whistles offered from the auto-configuration?

psychogun avatar Aug 09 '22 01:08 psychogun

I found that if I explicitly set xpack.security.enabled: true and bind mount a keystore that has a bootstrap.password set, then bind mounting the elasticsearch.yml works fine. I haven't dug into the details of why or if that is correct behavior, but that is what I have observed.

Here is very simple single node cluster with a bind mounted elasticsearch.yml and keystore : https://github.com/jakelandis/es-docker-simple

jakelandis avatar Aug 25 '22 20:08 jakelandis

I found that if I explicitly set xpack.security.enabled: true and bind mount a keystore that has a bootstrap.password set, then bind mounting the elasticsearch.yml works fine. I haven't dug into the details of why or if that is correct behavior, but that is what I have observed.

Here is very simple single node cluster with a bind mounted elasticsearch.yml and keystore : https://github.com/jakelandis/es-docker-simple

Setting xpack.security.enabled: true in the custom elasticsearch.yaml fixed it for me; now it gets mounted.

martijnvdp avatar Aug 27 '22 20:08 martijnvdp

I found that if I explicitly set xpack.security.enabled: true and bind mount a keystore that has a bootstrap.password set, then bind mounting the elasticsearch.yml works fine. I haven't dug into the details of why or if that is correct behavior, but that is what I have observed.

This is expected because enabling security explicitly makes the startup process skip security auto-configuration. The original error was thrown during security auto-configuration. Since it is skipped, the error no longer happens. But I believe the question for this issue is whether we could either (1) detect the original bind mount situation and automatically skip auto-configuration (IIUC, this is our preference) or (2) have auto-configuration work if the bind mount meets certain requirements.
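
A quick way to see which case a given elasticsearch.yml falls into is to check whether it sets xpack.security.enabled explicitly. This is a rough, line-based sketch that assumes the flat dotted key form used in the configs in this thread. It would not detect the nested YAML form, and `security_explicitly_configured` is a name invented for the example.

```shell
# Hypothetical helper: succeed if the file "$1" contains an uncommented,
# top-level "xpack.security.enabled:" line.
security_explicitly_configured() {
    grep -Eq '^xpack\.security\.enabled[[:space:]]*:' "$1"
}
```

For example, `security_explicitly_configured /absolute/path/to/a/elasticsearch.yml && echo "auto-configuration will be skipped"` prints the message only when the key is present.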

ywangd avatar Aug 29 '22 02:08 ywangd

So.. no fix for this yet?

Anyways, try my workaround...

On your docker command line:

-v /absolute/path/to/a/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml

Make sure the volume file "/absolute/path/to/a/elasticsearch.yml" exists and is writable.

Also, elasticsearch.yml should not be empty.

My example configuration:

cluster.name: "docker-cluster"
network.host: 0.0.0.0
xpack.license.self_generated.type: trial
xpack.security.enabled: true

z3r0101 avatar Sep 08 '22 06:09 z3r0101

Since I only use it for localhost,

version: '3.9'
services:
    elasticsearch:
        container_name: elasticsearch
        image: elasticsearch:8.5.2
        environment:
            - TZ=Etc/GMT-8
            - discovery.type=single-node
            - ES_JAVA_OPTS=-Xmx256M
        deploy:
            restart_policy:
                condition: on-failure
                delay: 5s
                max_attempts: 3
                window: 5s
            resources:
                limits:
                    cpus: '1'
                    memory: 2G
        ulimits:
            nofile:
                soft: 65535
                hard: 65535
        sysctls:
            - net.ipv6.conf.all.disable_ipv6=1
            - net.ipv6.conf.default.disable_ipv6=1
            - net.ipv6.conf.lo.disable_ipv6=1
            - net.ipv4.conf.all.rp_filter=0
            - net.ipv4.conf.default.rp_filter=0
            - net.ipv4.conf.default.arp_announce=2
            - net.ipv4.conf.lo.arp_announce=2
            - net.ipv4.conf.all.arp_announce=2
            - net.ipv4.tcp_max_tw_buckets=5000
            - net.ipv4.tcp_syncookies=1
            - net.ipv4.tcp_max_syn_backlog=2048
            - net.core.somaxconn=51200
            - net.ipv4.tcp_synack_retries=2
            - net.ipv4.tcp_fastopen=3
        dns:
            - 223.5.5.5
            - 223.6.6.6
            - 1.1.1.1
            - 1.0.0.1
            - 8.8.8.8
            - 8.8.4.4
        ports:
            -   target: 9200
                published: 9200
                protocol: tcp
                mode: host
        volumes:
            -   type: bind
                source: /www/server/elasticsearch/config/elasticsearch.yml
                target: /usr/share/elasticsearch/config/elasticsearch.yml
            -   type: bind
                source: /www/server/elasticsearch/data
                target: /usr/share/elasticsearch/data
            -   type: bind
                source: /www/server/elasticsearch/plugins
                target: /usr/share/elasticsearch/plugins
        healthcheck:
            disable: true
networks:
    default:
        name: podman
        external: true

then the config

cluster.name: docker-cluster
network.host: 0.0.0.0
xpack.security.enabled: false

works

but the config

cluster.name: docker-cluster
network.host: 0.0.0.0

got this error.

So I think https://github.com/elastic/elasticsearch/issues/85463#issuecomment-1229264396 is correct!

jyxjjj avatar Dec 08 '22 09:12 jyxjjj

god, who broke this? on 8.6.1 we aren't getting these errors.

  • https://www.elastic.co/guide/en/elasticsearch/reference/current/docker.html#_configuration_files_must_be_readable_by_the_elasticsearch_user

elasticsearch-1 | Could not rename log file 'logs/gc.log' to 'logs/gc.log.03' (Permission denied).
elasticsearch-1 | {"@timestamp":"2023-06-06T09:41:21.503Z", "log.level":"ERROR", "message":"fatal exception while booting Elasticsearch", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"main","log.logger":"org.elasticsearch.bootstrap.Elasticsearch","elasticsearch.node.name":"9a8a73e358ed","elasticsearch.cluster.name":"elasticsearch","error.type":"java.lang.IllegalStateException","error.message":"failed to obtain node locks, tried [/usr/share/elasticsearch/data]; maybe these locations are not writable or multiple nodes were started on the same data path?","error.stack_trace":"java.lang.IllegalStateException: failed to obtain node locks, tried [/usr/share/elasticsearch/data]; maybe these locations are not writable or multiple nodes were started on the same data path?\n\tat [email protected]/org.elasticsearch.env.NodeEnvironment.<init>(NodeEnvironment.java:291)\n\tat [email protected]/org.elasticsearch.node.Node.<init>(Node.java:483)\n\tat [email protected]/org.elasticsearch.node.Node.<init>(Node.java:327)\n\tat [email protected]/org.elasticsearch.bootstrap.Elasticsearch$2.<init>(Elasticsearch.java:216)\n\tat [email protected]/org.elasticsearch.bootstrap.Elasticsearch.initPhase3(Elasticsearch.java:216)\n\tat [email protected]/org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:67)\nCaused by: java.io.IOException: failed to obtain lock on /usr/share/elasticsearch/data\n\tat [email protected]/org.elasticsearch.env.NodeEnvironment$NodeLock.<init>(NodeEnvironment.java:236)\n\tat [email protected]/org.elasticsearch.env.NodeEnvironment$NodeLock.<init>(NodeEnvironment.java:204)\n\tat [email protected]/org.elasticsearch.env.NodeEnvironment.<init>(NodeEnvironment.java:283)\n\t... 5 more\nCaused by: java.nio.file.NoSuchFileException: /usr/share/elasticsearch/data/node.lock\n\tat java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:92)\n\tat java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:106)\n\tat java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)\n\tat java.base/sun.nio.fs.UnixPath.toRealPath(UnixPath.java:833)\n\tat [email protected]/org.apache.lucene.store.NativeFSLockFactory.obtainFSLock(NativeFSLockFactory.java:94)\n\tat [email protected]/org.apache.lucene.store.FSLockFactory.obtainLock(FSLockFactory.java:43)\n\tat [email protected]/org.apache.lucene.store.BaseDirectory.obtainLock(BaseDirectory.java:44)\n\tat [email protected]/org.elasticsearch.env.NodeEnvironment$NodeLock.<init>(NodeEnvironment.java:229)\n\t... 7 more\n\tSuppressed: java.nio.file.AccessDeniedException: /usr/share/elasticsearch/data/node.lock\n\t\tat java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:90)\n\t\tat java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:106)\n\t\tat java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)\n\t\tat java.base/sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:261)\n\t\tat java.base/java.nio.file.Files.newByteChannel(Files.java:379)\n\t\tat java.base/java.nio.file.Files.createFile(Files.java:657)\n\t\tat [email protected]/org.apache.lucene.store.NativeFSLockFactory.obtainFSLock(NativeFSLockFactory.java:84)\n\t\t... 10 more\n"}

joshxyzhimself avatar Jun 06 '23 09:06 joshxyzhimself

Running into the same problems. It was working fine the whole time with docker-compose, and suddenly, after killing the container and restarting it, I'm getting these errors:

bm-elasticsearch-poc-elastic-1  | {"@timestamp":"2023-07-26T08:04:29.450Z", "log.level":"ERROR", "message":"fatal exception while booting Elasticsearch", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsea
rch.server","process.thread.name":"main","log.logger":"org.elasticsearch.bootstrap.Elasticsearch","elasticsearch.node.name":"elastic-0","elasticsearch.cluster.name":"biz","error.type":"java.lang.IllegalStateException","error.mes
sage":"failed to obtain node locks, tried [/usr/share/elasticsearch/data]; maybe these locations are not writable or multiple nodes were started on the same data path?","error.stack_trace":"java.lang.IllegalStateException: faile
d to obtain node locks, tried [/usr/share/elasticsearch/data]; maybe these locations are not writable or multiple nodes were started on the same data path?\n\tat [email protected]/org.elasticsearch.env.NodeEnvironme
nt.<init>(NodeEnvironment.java:291)\n\tat [email protected]/org.elasticsearch.node.Node.<init>(Node.java:480)\n\tat [email protected]/org.elasticsearch.node.Node.<init>(Node.java:324)\n\tat org.elastics
[email protected]/org.elasticsearch.bootstrap.Elasticsearch$2.<init>(Elasticsearch.java:216)\n\tat [email protected]/org.elasticsearch.bootstrap.Elasticsearch.initPhase3(Elasticsearch.java:216)\n\tat org.elasticsea
[email protected]/org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:67)\nCaused by: java.io.IOException: failed to obtain lock on /usr/share/elasticsearch/data\n\tat [email protected]/org.elasticsearc
h.env.NodeEnvironment$NodeLock.<init>(NodeEnvironment.java:236)\n\tat [email protected]/org.elasticsearch.env.NodeEnvironment$NodeLock.<init>(NodeEnvironment.java:204)\n\tat [email protected]/org.elasti
csearch.env.NodeEnvironment.<init>(NodeEnvironment.java:283)\n\t... 5 more\nCaused by: java.nio.file.NoSuchFileException: /usr/share/elasticsearch/data/node.lock\n\tat java.base/sun.nio.fs.UnixException.translateToIOException(Un
ixException.java:92)\n\tat java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:106)\n\tat java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)\n\tat java.base/sun.nio.fs.UnixPath
.toRealPath(UnixPath.java:833)\n\tat [email protected]/org.apache.lucene.store.NativeFSLockFactory.obtainFSLock(NativeFSLockFactory.java:94)\n\tat [email protected]/org.apache.lucene.store.FSLockFactory.obt
ainLock(FSLockFactory.java:43)\n\tat [email protected]/org.apache.lucene.store.BaseDirectory.obtainLock(BaseDirectory.java:44)\n\tat [email protected]/org.elasticsearch.env.NodeEnvironment$NodeLock.<init>
(NodeEnvironment.java:229)\n\t... 7 more\n\tSuppressed: java.nio.file.AccessDeniedException: /usr/share/elasticsearch/data/node.lock\n\t\tat java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:90)\n\t\ta
t java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:106)\n\t\tat java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)\n\t\tat java.base/sun.nio.fs.UnixFileSystemProvider.newByt
eChannel(UnixFileSystemProvider.java:261)\n\t\tat java.base/java.nio.file.Files.newByteChannel(Files.java:379)\n\t\tat java.base/java.nio.file.Files.createFile(Files.java:657)\n\t\tat [email protected]/org.apache.luce
ne.store.NativeFSLockFactory.obtainFSLock(NativeFSLockFactory.java:84)\n\t\t... 10 more\n"}

This is the docker-compose:

version: '2.2'
services:
  elastic:
    build:
      context: ./
      dockerfile: docker/elasticsearch/Dockerfile
    privileged: true
    environment:
      - cluster.name=biz
      - node.name=elastic-0
      - xpack.security.enabled=true
      - discovery.type=single-node
      - "ES_JAVA_OPTS=-Xms2g -Xmx2g"
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536
        hard: 65536
    mem_limit: 4g
    cap_add:
      - IPC_LOCK
    volumes:
      - ./docker/_data/elasticsearch:/usr/share/elasticsearch/data
    ports:
      - "9200:9200"
    healthcheck:
      test: ["CMD", "curl", "-s", "-f", "http://localhost:9200/_cat/health"]
      retries: 10
    networks:
      - biz

  kibana:
    image: docker.elastic.co/kibana/kibana:8.7.1
    container_name: kibana
    privileged: true
    ports:
      - "5601:5601"
    healthcheck:
      test: ["CMD", "curl", "-s", "-f", "http://localhost:5601/"]
      retries: 10
    depends_on:
      elastic:
        condition: service_healthy
    environment:
      - "ELASTICSEARCH_HOSTS=http://elastic:9200"
    networks:
      - biz

  app:
    build:
      context: .
      dockerfile: docker/app/Dockerfile
      args:
        - WITH_XDEBUG=true
    environment:
      - DEBUG=true
      - PHP_IDE_CONFIG=serverName=app
      # - XDEBUG_CONFIG=remote_host=172.32.0.1 remote_port=9001
    ports:
      - '80:80'
    volumes:
      - './:/var/www/html'
    networks:
      - biz

networks:
  biz:
    name: biz
    driver: bridge

Nothing has changed whatsoever, and this was working just fine. Even rebuilding the images doesn't solve the issue. I have no idea why it can't create the lock file, even though I can see my local _data/elasticsearch folder being created by the running container.
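
One possible reading of the trace above: the `Suppressed: java.nio.file.AccessDeniedException` on `node.lock` suggests that the bind-mounted data directory is not writable by the user the container runs as. According to the Elastic Docker documentation, the official image runs Elasticsearch as uid:gid 1000:0 by default; whether the custom Dockerfile used here keeps that default is an assumption. The following is a minimal sketch, in Python, of checking this on the host. The helper names `can_create_files` and `describe` are illustrative only and not part of any Elasticsearch or Docker tooling, and the check looks only at classic mode bits (no ACLs, SELinux labels, or user-namespace remapping).

```python
import os
import stat
import sys


def can_create_files(st_mode: int, st_uid: int, st_gid: int, uid: int, gid: int) -> bool:
    """Return True if a process running as uid:gid may create files in a
    directory with the given mode and ownership (classic Unix mode bits only)."""
    if uid == 0:
        return True  # root bypasses the mode bits
    if uid == st_uid:
        needed = stat.S_IWUSR | stat.S_IXUSR
    elif gid == st_gid:
        needed = stat.S_IWGRP | stat.S_IXGRP
    else:
        needed = stat.S_IWOTH | stat.S_IXOTH
    # Creating a file requires both write and search (execute) on the directory.
    return (st_mode & needed) == needed


def describe(path: str, uid: int = 1000, gid: int = 0) -> str:
    """Summarize ownership, mode and writability of a host directory.

    The defaults 1000:0 are the documented defaults of the official image;
    adjust them if the container runs as a different user.
    """
    st = os.stat(path)
    ok = can_create_files(st.st_mode, st.st_uid, st.st_gid, uid, gid)
    return (f"{path}: owner={st.st_uid}:{st.st_gid} "
            f"mode={stat.filemode(st.st_mode)} writable by {uid}:{gid}: {ok}")


if __name__ == "__main__":
    # Host path taken from the compose file above; pass another path as argv[1].
    target = sys.argv[1] if len(sys.argv) > 1 else "./docker/_data/elasticsearch"
    if os.path.isdir(target):
        print(describe(target))
    else:
        print(f"{target}: not a directory on this host")
```

If the directory turns out to be owned by root with mode `drwxr-xr-x` (which is typical when the Docker daemon creates a missing bind-mount source itself), the check reports it as not writable. The Elastic documentation suggests granting group access to gid 0 on the host directory in that case, for example with `chgrp 0` and `chmod g+rwx`.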

coding-red-panda avatar Jul 26 '23 08:07 coding-red-panda

Facing the same issue. Has anyone found a solution for this?

zakhaev26 avatar Mar 27 '24 12:03 zakhaev26

Facing the same issue. Has anyone found a solution for this?

  • The easiest way to work around this issue, which I originally reported at https://discuss.elastic.co/t/300981, is to follow the instructions in Ioannis Kakavas's reply in that thread.

  • Two years ago, I didn’t expect this issue to persist for this long.

linghengqian avatar Mar 27 '24 13:03 linghengqian

Where can I find a docker-compose file and its related config files that actually work? I followed the one in Elasticsearch's official installation guide, but I get logs like:

elasticsearch_container  | {"@timestamp":"2024-03-27T13:06:55.893Z", "log.level": "WARN",  "data_stream.dataset":"deprecation.elasticsearch","data_stream.namespace":"default","data_stream.type":"logs","elasticsearch.event.category":"settings","event.code":"xpack.monitoring.collection.enabled","message":"[xpack.monitoring.collection.enabled] setting was deprecated in Elasticsearch and will be removed in a future release." , "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"deprecation.elasticsearch","process.thread.name":"main","log.logger":"org.elasticsearch.deprecation.common.settings.Settings","elasticsearch.node.name":"12f9b075322d","elasticsearch.cluster.name":"docker-cluster"}
elasticsearch_container  | {"@timestamp":"2024-03-27T13:06:55.910Z", "log.level":"ERROR", "message":"fatal exception while booting Elasticsearch", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"main","log.logger":"org.elasticsearch.bootstrap.Elasticsearch","elasticsearch.node.name":"12f9b075322d","elasticsearch.cluster.name":"docker-cluster","error.type":"java.lang.IllegalStateException","error.message":"failed to obtain node locks, tried [/usr/share/elasticsearch/data]; maybe these locations are not writable or multiple nodes were started on the same data path?","error.stack_trace":"java.lang.IllegalStateException: failed to obtain node locks, tried [/usr/share/elasticsearch/data]; maybe these locations are not writable or multiple nodes were started on the same data path?\n\tat [email protected]/org.elasticsearch.env.NodeEnvironment.<init>(NodeEnvironment.java:294)\n\tat [email protected]/org.elasticsearch.node.Node.<init>(Node.java:499)\n\tat [email protected]/org.elasticsearch.node.Node.<init>(Node.java:344)\n\tat [email protected]/org.elasticsearch.bootstrap.Elasticsearch$2.<init>(Elasticsearch.java:236)\n\tat [email protected]/org.elasticsearch.bootstrap.Elasticsearch.initPhase3(Elasticsearch.java:236)\n\tat [email protected]/org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:73)\nCaused by: java.io.IOException: failed to obtain lock on /usr/share/elasticsearch/data\n\tat [email protected]/org.elasticsearch.env.NodeEnvironment$NodeLock.<init>(NodeEnvironment.java:239)\n\tat [email protected]/org.elasticsearch.env.NodeEnvironment$NodeLock.<init>(NodeEnvironment.java:206)\n\tat [email protected]/org.elasticsearch.env.NodeEnvironment.<init>(NodeEnvironment.java:286)\n\t... 
5 more\nCaused by: java.nio.file.NoSuchFileException: /usr/share/elasticsearch/data/node.lock\n\tat java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:92)\n\tat java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:106)\n\tat java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)\n\tat java.base/sun.nio.fs.UnixPath.toRealPath(UnixPath.java:834)\n\tat [email protected]/org.apache.lucene.store.NativeFSLockFactory.obtainFSLock(NativeFSLockFactory.java:94)\n\tat [email protected]/org.apache.lucene.store.FSLockFactory.obtainLock(FSLockFactory.java:43)\n\tat [email protected]/org.apache.lucene.store.BaseDirectory.obtainLock(BaseDirectory.java:44)\n\tat [email protected]/org.elasticsearch.env.NodeEnvironment$NodeLock.<init>(NodeEnvironment.java:232)\n\t... 7 more\n\tSuppressed: java.nio.file.AccessDeniedException: /usr/share/elasticsearch/data/node.lock\n\t\tat java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:90)\n\t\tat java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:106)\n\t\tat java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)\n\t\tat java.base/sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:261)\n\t\tat java.base/java.nio.file.Files.newByteChannel(Files.java:379)\n\t\tat java.base/java.nio.file.Files.createFile(Files.java:657)\n\t\tat [email protected]/org.apache.lucene.store.NativeFSLockFactory.obtainFSLock(NativeFSLockFactory.java:84)\n\t\t... 10 more\n"}
elasticsearch_container  | ERROR: Elasticsearch did not exit normally - check the logs at /usr/share/elasticsearch/logs/docker-cluster.log
elasticsearch_container  | 
elasticsearch_container  | 
elasticsearch_container  | ERROR: Elasticsearch exited unexpectedly, with exit code 1

zakhaev26 avatar Mar 27 '24 13:03 zakhaev26