[Question]: Elasticsearch Docker Container Always Stuck in "Starting" State in RAGFlow Deployment
Self Checks
- [x] I have searched for existing issues, including closed ones.
- [x] I confirm that I am using English to submit this report (Language Policy).
- [x] Non-English title submissions will be closed directly (Language Policy).
- [x] Please do not modify this template :) and fill in all the required fields.
Describe your problem
Hello RAGFlow Team,
I'm encountering a persistent issue with the Elasticsearch Docker container failing to start during RAGFlow deployment. Here are the complete details:
Environment & Deployment Information:
- RAGFlow Version: 0.22.1
- Operating System: UOS (as shown in the path)

Deployment Method:

```bash
cd ragflow/docker
docker compose -f docker-compose.yml up -d
```
Pre-configured Settings:
- vm.max_map_count = 262144 (already set)
- MEM_LIMIT=16G in docker/.env file
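For reference, here is how I verified both settings (assuming the stock docker/.env location):

```bash
# Confirm the kernel parameter (expected output: vm.max_map_count = 262144)
sysctl vm.max_map_count

# Confirm the memory limit in the RAGFlow env file
grep MEM_LIMIT docker/.env
```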
Current Container Status:
- Container ID: c029922cffe7
- Image: elasticsearch:8.11.3
- Status: Up 16 seconds (health: starting)
- Port Mapping: 1200->9200
- Container Name: docker-es01-1
Error Details: The Elasticsearch container starts but quickly fails with critical permission errors. Here are the relevant logs from `docker logs -f --tail 20 docker-es01-1`:

```text
{"@timestamp":"2025-11-27T08:23:07.996Z", "log.level":"ERROR", "message":"fatal exception while booting Elasticsearch", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"main","log.logger":"org.elasticsearch.bootstrap.Elasticsearch","elasticsearch.node.name":"es01","elasticsearch.cluster.name":"docker-cluster","error.type":"java.lang.IllegalStateException","error.message":"failed to obtain node locks, tried [/usr/share/elasticsearch/data]; maybe these locations are not writable or multiple nodes were started on the same data path?","error.stack_trace":"java.lang.IllegalStateException: failed to obtain node locks, tried [/usr/share/elasticsearch/data]; maybe these locations are not writable or multiple nodes were started on the same data path?\n\tat [email protected]/org.elasticsearch.env.NodeEnvironment.
```
Root Cause: java.nio.file.AccessDeniedException: /usr/share/elasticsearch/data/node.lock
Issue Type: Data directory permission/access denial
What I've Already Verified:
- ✅ Kernel parameter vm.max_map_count is properly set to 262144
- ✅ Memory limit configured to 16GB in environment file
- ✅ Using the standard RAGFlow deployment procedure
- ✅ No apparent port conflicts (using port 1200 for ES)
The container appears to start briefly but then fails due to the data directory permission issue. I would appreciate guidance on how to resolve this data locking problem without compromising the RAGFlow setup.
Thank you for your assistance.
Hi @TiHongW! I'm Dosu and I'm helping the ragflow team.
Your Elasticsearch container is failing due to a permissions issue on the Docker-managed data volume mounted at /usr/share/elasticsearch/data. The docker-compose setup for RAGFlow uses a local Docker volume (esdata01) for this directory, not a direct host path, and does not specify custom user/group settings—so it relies on Docker and the official Elasticsearch image to manage permissions by default. There are no RAGFlow-specific scripts or recent changes affecting this behavior, so the issue is likely environmental rather than project-specific.
Here are some steps you can try:
- Check the ownership and permissions of the Docker volume. Run `docker volume inspect ragflow_esdata01` to find the mountpoint, then verify the permissions on the host (see the sketch after this list).
- If you suspect the volume is corrupted or has incorrect permissions, you can remove and recreate it (note: this will delete any existing Elasticsearch data):

```bash
docker compose down
docker volume rm ragflow_esdata01
docker compose up -d
```

- If your host uses SELinux or AppArmor, try disabling them temporarily or adjusting their policies, as they can block container access to volumes even if standard permissions look correct.
- Make sure no other Elasticsearch containers are running and trying to use the same data volume.
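A minimal sketch of the inspection step, assuming the default Compose project name `ragflow` (so the volume is `ragflow_esdata01`; adjust to your setup):

```bash
# Print the host path backing the named volume
docker volume inspect ragflow_esdata01 --format '{{ .Mountpoint }}'

# Check ownership/permissions at that path with numeric IDs; the official
# elasticsearch image is documented to run as uid 1000, gid 0
sudo ls -ldn "$(docker volume inspect ragflow_esdata01 --format '{{ .Mountpoint }}')"
```

If the mountpoint's owner differs from the UID the container runs as, that mismatch is the likely cause of the AccessDeniedException.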
This issue is not caused by RAGFlow-specific configuration, but rather by Docker or host environment factors. For reference, you can see how the volume is defined and mounted in RAGFlow's compose files.
Hi!
Thank you for the timely response and clear troubleshooting steps! I followed your advice to check and recreate the Docker volume, and the deployment issue with 0.22.1 has been fixed successfully. Your support is really helpful.
I currently have RAGFlow 0.19.1 running in production and plan to upgrade to 0.22.1 soon. To avoid similar Elasticsearch volume permission problems or other compatibility issues during the upgrade, could you provide official upgrade guidelines (e.g., pre-upgrade checks, data backup steps, or configuration adjustments)? Any tips would be greatly appreciated!
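In the meantime, here is the minimal pre-upgrade backup sketch I plan to use, assuming the Compose project name is `ragflow` (so the volume is `ragflow_esdata01`; names may differ in your setup):

```bash
# Stop the stack so the data is quiescent before backing up
docker compose down

# Archive the Elasticsearch data volume into the current directory;
# repeat for any other volumes you want to preserve
docker run --rm \
  -v ragflow_esdata01:/data:ro \
  -v "$(pwd)":/backup \
  alpine tar czf /backup/esdata01-backup.tgz -C /data .
```

Restoring would be the reverse: recreate the volume, untar the archive into it, then start the stack.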
Solution: Adjust the ownership and permissions of the Elasticsearch data directory. When Elasticsearch runs in a Docker container, the host-side directory backing /usr/share/elasticsearch/data (a bind-mounted path, or a named volume's mountpoint) might need to be owned by the user ID that Elasticsearch runs as inside the container (often user ID 1000).
```bash
sudo chown -R 1000:1000 /usr/share/elasticsearch/data
# Or, if running as a specific user/group:
sudo chown -R elasticsearch:elasticsearch /usr/share/elasticsearch/data
```
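If you are unsure which UID the container actually uses, a quick check (container and volume names taken from this thread; adjust to your deployment):

```bash
# Show the UID/GID the container's default user runs as
docker exec docker-es01-1 id

# With a named volume, chown the volume's host mountpoint rather than a
# literal /usr/share/elasticsearch/data path on the host; 1000:0 is what
# the official image documents, but verify against the `id` output above
sudo chown -R 1000:0 "$(docker volume inspect ragflow_esdata01 --format '{{ .Mountpoint }}')"
```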
Another Elasticsearch Instance Running: A previous instance of Elasticsearch might still be running and holding the node.lock file, preventing a new instance from starting.
```bash
# Find any Elasticsearch process still holding the lock
ps aux | grep elasticsearch
kill -9 <PID_OF_RUNNING_ELASTICSEARCH>
# Remove the stale lock file once no instance is running
rm /usr/share/elasticsearch/data/node.lock
```
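In a Docker-based deployment like RAGFlow's, the competing instance is usually another container rather than a host process; a rough equivalent check (image and container names taken from this thread):

```bash
# List all containers, running or stopped, created from the ES image
docker ps -a --filter ancestor=elasticsearch:8.11.3

# Stop and remove a leftover container that may still hold the node lock
docker stop docker-es01-1 && docker rm docker-es01-1
```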