Kafka Does not Start Due to Logs Failure

Open pjkaufman opened this issue 6 years ago • 9 comments

I am using Windows and trying to start up several Docker containers. Originally the startup worked and Kafka was running, but recently it ran into an error, shut down, and will not start up again. The error message is "ERROR Shutdown broker because all log dirs in /var/lib/kafka/data have failed (kafka.log.LogManager)".

Logs File: logs.txt

pjkaufman avatar Aug 07 '18 15:08 pjkaufman

Hi @pjkaufman, kindly provide steps to reproduce (Docker version, commands to run, etc.). Thank you

gAmUssA avatar Aug 21 '18 21:08 gAmUssA

@gAmUssA I know how to replicate it in a Windows 10 environment with Docker.

  1. Start a single-node Kafka & ZooKeeper cluster with this configuration: https://hastebin.com/jiluhivoto.http
  2. Create a topic and produce some messages to it
  3. I used Kafka HQ to delete the topic afterwards (available on localhost:8080 with this docker compose config), but you can probably also use the Kafka CLI (roughly as sketched after this list).
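
For reference, deleting the topic via the plain Kafka CLI would look roughly like this; the container and host names below are assumptions, since the linked compose config is not reproduced here:

# delete the topic through the older ZooKeeper-based CLI shipped in the cp-kafka image
docker exec -it kafka kafka-topics --zookeeper zookeeper:2181 --delete --topic my-topic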

Because I am working on a lag exporter, I know that the broker sends tombstone messages on the __consumer_offsets topic, but shortly afterwards the broker is down and the last log messages are these:

[2019-03-30 09:09:39,044] INFO Stopping serving logs in dir /var/lib/kafka/data (kafka.log.LogManager)
[2019-03-30 09:09:39,051] ERROR Shutdown broker because all log dirs in /var/lib/kafka/data have failed (kafka.log.LogManager)

Docker version: Docker Engine - Community (18.09.2)

weeco avatar Mar 30 '19 09:03 weeco

Stop ZooKeeper, delete all the logs from ZooKeeper and Kafka, and it should work. Note: all the previously created topics will be lost.
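
Applied to the docker-compose setup from the original report, that workaround amounts to roughly the following, assuming the Kafka and ZooKeeper data live in compose-managed volumes (adapt to your setup):

# stop the containers and drop their data volumes (all topics are lost)
docker-compose down -v
# recreate everything from scratch
docker-compose up -d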

Ankitkansal2510 avatar May 04 '19 20:05 Ankitkansal2510

Stop ZooKeeper, delete all the logs from ZooKeeper and Kafka, and it should work. Note: all the previously created topics will be lost.

This cannot be a solution; it is just a workaround for dev environments.

This issue is related to the log retention policy. I see the same issue, and it seems to happen only on Windows; I performed the same test on Linux and it works there.

Steps to reproduce:

  1. Edit the Kafka config file (server.properties) and set log.retention.check.interval.ms=60000
  2. Start a single-node Kafka & ZooKeeper;
  3. Create a topic and produce some messages to it (see the command sketch after this list);
  4. Change the retention duration: bin\windows\kafka-topics.bat --zookeeper localhost:2181 --alter --topic standard-json --config retention.ms=1000
  5. After one minute, stop and restart your producer from the beginning; the messages are expected to be cleared, but instead Kafka breaks because all log dirs failed.
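
For completeness, steps 2-3 would look roughly like this on Windows; the topic name is taken from step 4, and the broker and ZooKeeper addresses are the usual localhost defaults:

REM create the topic used in the test
bin\windows\kafka-topics.bat --zookeeper localhost:2181 --create --topic standard-json --partitions 1 --replication-factor 1
REM produce a few test messages interactively
bin\windows\kafka-console-producer.bat --broker-list localhost:9092 --topic standard-json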

In the broker log you can find this error message:

Suppressed: java.nio.file.FileSystemException: C:\tmp\kafka-logs\standard-json-0\00000000000000000004.timeindex -> C:\tmp\kafka-logs\standard-json-0\00000000000000000004.timeindex.deleted: The process cannot access the file because it is being used by another process.

michierus avatar May 27 '19 19:05 michierus

Hi @michierus,

as you mentioned:

This issue is related to Log Retention Policy

Were you able to solve this issue? Or is there any workaround other than deleting the logs directory (which defaults to /tmp/kafka-logs)?

faisal6621 avatar Feb 19 '20 10:02 faisal6621

@michierus / @faisal6621 I am also facing the same issue; do we have any solution to fix this?

mighty-raj avatar Jun 10 '20 16:06 mighty-raj

Hi @mighty-raj ,

unfortunately, on Windows I could not find any solution.

The current workaround that I have loses the data:

  1. Stop ZooKeeper.
  2. Delete all the logs from ZooKeeper and Kafka. It should work, but all the previously created topics will be lost (a rough command sketch follows).
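
On a plain Windows install with the default directories (as in the stack trace earlier in this thread), that is roughly the following; the exact paths depend on log.dirs in server.properties and dataDir in zookeeper.properties:

REM with Kafka and ZooKeeper stopped, wipe their data directories
rmdir /s /q C:\tmp\kafka-logs
rmdir /s /q C:\tmp\zookeeper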

Since I am just doing master's degree research, I simply moved to Linux.

Maybe we need to check the latest Kafka version.

michierus avatar Jun 10 '20 16:06 michierus

It has been more than a year since the last comment on this topic. I am getting the same issue. Did anyone find a solution other than the workaround of deleting the logs in both Kafka and ZooKeeper?

vpachipenta avatar Jul 02 '21 20:07 vpachipenta

As everyone is saying, if we delete the logs and restart the services it will work. However, the question is how to resolve the issue without restarting the services. Is it possible to make some configuration changes in the properties file so the logs automatically roll over to a new folder? For example, when the Kafka service would go down due to a log-dir failure, it should instead point to a new logs dir and the service should not go down.
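
A partial mitigation in that direction, for what it is worth: the broker only shuts itself down when all directories listed in log.dirs have failed, so configuring more than one log directory in server.properties lets it keep serving partitions from the healthy directories when a single one fails. The paths below are purely illustrative:

# with multiple log dirs, one failed dir takes its partitions offline
# but does not stop the broker by itself
log.dirs=/var/lib/kafka/data-0,/var/lib/kafka/data-1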

rkafk avatar Nov 10 '21 07:11 rkafk