amazon-kinesis-agent
amazon-kinesis-agent copied to clipboard
Agent failed to start from previous checkpoint after log file rotation
agent was unable to start from previous checkpoint after log file rotation. Error: “Expected the service to be RUNNING, but the service has FAILED”.
agent.json
{
"cloudwatch.emitMetrics": true,
"firehose.endpoint": "firehose.us-east-1.amazonaws.com",
"awsAccessKeyId": "",
"awsSecretAccessKey": "",
"maxSendingThreads": 15,
"flows": [
{
"filePattern": "/opt/app/ums/current/log/dev.log",
"deliveryStream": "firehose-deliverystream-m1-kinesispoc",
"dataProcessingOptions": [
{
"optionName": "LOGTOJSON",
"logFormat": "COMMONAPACHELOG",
"matchPattern": "^\\[(.*)\\] \\[([\\d.]+)\\] \\[(.*?)\\]\\s?(.*)",
"customFieldNames": ["uuid", "ip", "uri", "log_message"]
}
]
}
]
}
Error:
Error:
2017-07-05 09:14:12.346-0600 localhost (main) com.amazon.kinesis.streaming.agent.Agent [INFO] Reading configuration from file: /etc/aws-kinesis/agent.json
2017-07-05 09:14:12.436-0600 localhost (main) com.amazon.kinesis.streaming.agent.processing.processors.AgentDataConverterChain [DEBUG] Using data converter chain: AgentDataConverterChain [com.amazon.kinesis.streaming.agent.processing.processors.LogToJSONDataConverter@6c2cba0a ]
2017-07-05 09:14:12.455-0600 localhost (main) com.amazon.kinesis.streaming.agent.Agent [INFO] null: Agent will use up to 12 threads for sending data.
2017-07-05 09:14:12.474-0600 localhost (main) com.amazon.kinesis.streaming.agent.tailing.checkpoints.SQLiteFileCheckpointStore [DEBUG] Connecting to database /var/run/aws-kinesis-agent/checkpoints...
2017-07-05 09:14:12.710-0600 localhost (main) com.amazon.kinesis.streaming.agent.tailing.checkpoints.SQLiteFileCheckpointStore [INFO] Deleted 0 old checkpoints.
2017-07-05 09:14:12.743-0600 localhost (Agent STARTING) com.amazon.kinesis.streaming.agent.Agent [INFO] Agent: Starting up...
2017-07-05 09:14:12.758-0600 localhost (Agent STARTING) com.amazon.kinesis.streaming.agent.Agent [INFO] Agent: Starting tailer for file fh:firehose-deliverystream-m1-kinesispoc:/opt/app/ums/current/log/dev.log
2017-07-05 09:14:12.789-0600 localhost (FileTailer[fh:firehose-deliverystream-m1-kinesispoc:/opt/app/ums/current/log/dev.log]) com.amazon.kinesis.streaming.agent.tailing.FileTailer [DEBUG] FileTailer[fh:firehose-deliverystream-m1-kinesispoc:/opt/app/ums/current/log/dev.log]: Starting up...
2017-07-05 09:14:12.791-0600 localhost (FileTailer[fh:firehose-deliverystream-m1-kinesispoc:/opt/app/ums/current/log/dev.log]) com.amazon.kinesis.streaming.agent.tailing.FileTailer [DEBUG] FileTailer[fh:firehose-deliverystream-m1-kinesispoc:/opt/app/ums/current/log/dev.log]: Found checkpoint FileCheckpoint(file=TrackedFile(id=(dev=ca51,ino=791364), path=/opt/app/ums/current/log/dev.log, lastModifiedTime=1499123740000, size=14427187), offset=14428345, fileId=(dev=ca51,ino=791364), flowId=fh:firehose-deliverystream-m1-kinesispoc:/opt/app/ums/current/log/dev.log).
2017-07-05 09:14:12.825-0600 localhost (Agent STARTING) com.amazon.kinesis.streaming.agent.Agent [ERROR] Unhandled exception during startup.
java.lang.IllegalStateException: Expected the service to be RUNNING, but the service has FAILED
at com.google.common.util.concurrent.AbstractService.checkCurrentState(AbstractService.java:285)
at com.google.common.util.concurrent.AbstractService.awaitRunning(AbstractService.java:229)
at com.google.common.util.concurrent.AbstractExecutionThreadService.awaitRunning(AbstractExecutionThreadService.java:197)
at com.amazon.kinesis.streaming.agent.Agent.startUp(Unknown Source)
at com.google.common.util.concurrent.AbstractIdleService$2$1.run(AbstractIdleService.java:54)
at com.google.common.util.concurrent.Callables$3.run(Callables.java:95)
at com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457)
at com.google.common.util.concurrent.MoreExecutors$2.execute(MoreExecutors.java:849)
at com.google.common.util.concurrent.AbstractIdleService$2.doStart(AbstractIdleService.java:50)
at com.google.common.util.concurrent.AbstractService.startAsync(AbstractService.java:174)
at com.google.common.util.concurrent.AbstractIdleService.startAsync(AbstractIdleService.java:135)
at com.amazon.kinesis.streaming.agent.Agent.main(Unknown Source)
Caused by: java.lang.IllegalStateException
at com.google.common.base.Preconditions.checkState(Preconditions.java:158)
at com.amazon.kinesis.streaming.agent.tailing.TrackedFileRotationAnalyzer.findCurrentOpenFileAfterTruncate(Unknown Source)
at com.amazon.kinesis.streaming.agent.tailing.SourceFileTracker.updateCurrentFile(Unknown Source)
at com.amazon.kinesis.streaming.agent.tailing.SourceFileTracker.initialize(Unknown Source)
at com.amazon.kinesis.streaming.agent.tailing.FileTailer.initialize(Unknown Source)
at com.amazon.kinesis.streaming.agent.tailing.FileTailer.startUp(Unknown Source)
at com.google.common.util.concurrent.AbstractExecutionThreadService$1$2.run(AbstractExecutionThreadService.java:55)
at com.google.common.util.concurrent.Callables$3.run(Callables.java:95)
at java.lang.Thread.run(Thread.java:745)
2017-07-05 09:14:12.829-0600 localhost (FileTailer[fh:firehose-deliverystream-m1-kinesispoc:/opt/app/ums/current/log/dev.log]) com.amazon.kinesis.streaming.agent.Agent [ERROR] FATAL: Thread FileTailer[fh:firehose-deliverystream-m1-kinesispoc:/opt/app/ums/current/log/dev.log] threw an unrecoverable error. Aborting application
java.lang.IllegalStateException
at com.google.common.base.Preconditions.checkState(Preconditions.java:158)
at com.amazon.kinesis.streaming.agent.tailing.TrackedFileRotationAnalyzer.findCurrentOpenFileAfterTruncate(Unknown Source)
at com.amazon.kinesis.streaming.agent.tailing.SourceFileTracker.updateCurrentFile(Unknown Source)
at com.amazon.kinesis.streaming.agent.tailing.SourceFileTracker.initialize(Unknown Source)
at com.amazon.kinesis.streaming.agent.tailing.FileTailer.initialize(Unknown Source)
at com.amazon.kinesis.streaming.agent.tailing.FileTailer.startUp(Unknown Source)
at com.google.common.util.concurrent.AbstractExecutionThreadService$1$2.run(AbstractExecutionThreadService.java:55)
at com.google.common.util.concurrent.Callables$3.run(Callables.java:95)
at java.lang.Thread.run(Thread.java:745)
I have the same problem after the Log File rotation:
2018-08-29 07:35:29.256+0000 ip-10-4-1-47 (Agent STARTING) com.amazon.kinesis.streaming.agent.Agent [ERROR] Unhandled exception during startup. java.lang.IllegalStateException: Expected the service to be RUNNING, but the service has FAILED at com.google.common.util.concurrent.AbstractService.checkCurrentState(AbstractService.java:285) at com.google.common.util.concurrent.AbstractService.awaitRunning(AbstractService.java:229) at com.google.common.util.concurrent.AbstractExecutionThreadService.awaitRunning(AbstractExecutionThreadService.java:197) at com.amazon.kinesis.streaming.agent.Agent.startUp(Unknown Source) at com.google.common.util.concurrent.AbstractIdleService$2$1.run(AbstractIdleService.java:54) at com.google.common.util.concurrent.Callables$3.run(Callables.java:95) at com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457) at com.google.common.util.concurrent.MoreExecutors$2.execute(MoreExecutors.java:849) at com.google.common.util.concurrent.AbstractIdleService$2.doStart(AbstractIdleService.java:50) at com.google.common.util.concurrent.AbstractService.startAsync(AbstractService.java:174) at com.google.common.util.concurrent.AbstractIdleService.startAsync(AbstractIdleService.java:135) at com.amazon.kinesis.streaming.agent.Agent.main(Unknown Source) Caused by: java.lang.IllegalStateException at com.google.common.base.Preconditions.checkState(Preconditions.java:158) at com.amazon.kinesis.streaming.agent.tailing.TrackedFileRotationAnalyzer.findCurrentOpenFileAfterTruncate(Unknown Source) at com.amazon.kinesis.streaming.agent.tailing.SourceFileTracker.updateCurrentFile(Unknown Source) at com.amazon.kinesis.streaming.agent.tailing.SourceFileTracker.initialize(Unknown Source) at com.amazon.kinesis.streaming.agent.tailing.FileTailer.initialize(Unknown Source) at com.amazon.kinesis.streaming.agent.tailing.FileTailer.startUp(Unknown Source) at com.google.common.util.concurrent.AbstractExecutionThreadService$1$2.run(AbstractExecutionThreadService.java:55) at com.google.common.util.concurrent.Callables$3.run(Callables.java:95) at java.lang.Thread.run(Thread.java:748) 2018-08-29 07:35:29.264+0000 ip-10-4-1-47 (FileTailer[fh:spotify:/var/log/sockd/sockd.log]) com.amazon.kinesis.streaming.agent.Agent [ERROR] FATAL: Thread FileTailer[fh:spotify:/var/log/sockd/sockd.log] threw an unrecoverable error. Aborting application java.lang.IllegalStateException at com.google.common.base.Preconditions.checkState(Preconditions.java:158) at com.amazon.kinesis.streaming.agent.tailing.TrackedFileRotationAnalyzer.findCurrentOpenFileAfterTruncate(Unknown Source) at com.amazon.kinesis.streaming.agent.tailing.SourceFileTracker.updateCurrentFile(Unknown Source) at com.amazon.kinesis.streaming.agent.tailing.SourceFileTracker.initialize(Unknown Source) at com.amazon.kinesis.streaming.agent.tailing.FileTailer.initialize(Unknown Source) at com.amazon.kinesis.streaming.agent.tailing.FileTailer.startUp(Unknown Source) at com.google.common.util.concurrent.AbstractExecutionThreadService$1$2.run(AbstractExecutionThreadService.java:55) at com.google.common.util.concurrent.Callables$3.run(Callables.java:95) at java.lang.Thread.run(Thread.java:748) (END)
Same thing happened to me. I moved the checkpoints file in /var/run/aws-kinesis-agent/checkpoints and let kinesis agent recreate it. This solved the problem.
I have a set of steps that make this issue appear every time. Using a tool to continually write to a temp file (/tmp/log_events.json
) that is being successfully monitored by aws-kinesis-agent, run the following commands.
truncate -s 0 /tmp/log_events.json
truncate -s 0 /tmp/log_events.json
systemctl restart aws-kinesis-agent
tail -f /var/log/aws-kinesis-agent/aws-kinesis-agent.log
Commenting out two lines will fix this exception. The assert checks are not really needed, as the code runs the same check later:
index 1a9f3b1..e03df74 100644
--- a/src/com/amazon/kinesis/streaming/agent/tailing/TrackedFileRotationAnalyzer.java
+++ b/src/com/amazon/kinesis/streaming/agent/tailing/TrackedFileRotationAnalyzer.java
@@ -160,8 +160,8 @@ public class TrackedFileRotationAnalyzer {
*/
public int findCurrentOpenFileAfterTruncate() {
Preconditions.checkNotNull(currentOpenFile);
- Preconditions.checkState(incoming.size() > 1); // always true after rotation by Truncate
- Preconditions.checkState(hasCounterpart(incoming.get(0))); // always true for rotation by Truncate
+ // Preconditions.checkState(incoming.size() > 1); // always true after rotation by Truncate
+ // Preconditions.checkState(hasCounterpart(incoming.get(0))); // always true for rotation by Truncate
// In case of rotation by Truncate, the current open file
// MUST have a counterpart: either the latest file, or a previously
// rotated file. If not, this is an anomaly (could happen if