bunyan-cloudwatch

Data Already Accepted Exception

Open clocked0ne opened this issue 8 years ago • 4 comments

I can find very little information on this error in the AWS documentation and do not know the root cause. For now I have forked the repo and used a method similar to https://github.com/mirkokiefer/bunyan-cloudwatch/pull/2/files to handle the error:

DataAlreadyAcceptedException: The given batch of log events has already been accepted. The next batch can be sent with sequenceToken: 49567539988883386692933872174268000661736578965150110610

The problem is that this error appears to halt all further sending of logs to CloudWatch until the app is restarted.

Can you suggest a cause or a solution? It does not appear that bunyan-cloudwatch currently receives this as a retryable error.
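For reference, the error message quoted above carries the token for the next batch, so a workaround can read it from the text. The following is only a minimal sketch of that idea, not the code from the linked pull request; `extractNextSequenceToken` is a hypothetical helper name, and the regular expression assumes the message keeps the exact wording shown in this issue.

```javascript
// Hypothetical helper (not part of bunyan-cloudwatch): read the next
// sequence token out of a DataAlreadyAcceptedException message.
// Assumes the message wording quoted in this issue.
function extractNextSequenceToken(message) {
  const match = /sequenceToken: (\d+)/.exec(String(message));
  // Keep the token as a string: it is far too large for a JS number.
  return match ? match[1] : null;
}

const sample =
  'The given batch of log events has already been accepted. ' +
  'The next batch can be sent with sequenceToken: ' +
  '49567539988883386692933872174268000661736578965150110610';

console.log(extractNextSequenceToken(sample));
```

The helper returns `null` when the text contains no token, so a caller can fall back to its existing error handling in that case.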

clocked0ne, Dec 13 '16 13:12

I was able to replicate the DataAlreadyAcceptedException by calling the PutLogEvents API twice with the same sequence token. The message properties need to be exactly the same, otherwise it returns an InvalidSequenceTokenException error, but interestingly the timestamp properties can be different. If the first call is successful, the second request should respond with the DataAlreadyAcceptedException with retryable set to false.
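The behaviour described above can be modelled with a small in-memory stand-in. This is a sketch of the observation only, not the AWS implementation and not an AWS SDK call: `FakeLogStream`, its numeric tokens, and `errorCodeOf` are all made up for illustration.

```javascript
// In-memory stand-in for a log stream. It only mimics the behaviour
// reported in this thread; real CloudWatch Logs is not involved.
class FakeLogStream {
  constructor() {
    this.nextToken = '1';     // token the next new batch must carry
    this.lastToken = null;    // token used by the last accepted batch
    this.lastMessages = null; // messages of the last accepted batch
  }

  putLogEvents(events, sequenceToken) {
    // Only the message properties are compared; timestamps are ignored.
    const messages = JSON.stringify(events.map((e) => e.message));
    if (sequenceToken === this.nextToken) {
      this.lastToken = sequenceToken;
      this.lastMessages = messages;
      this.nextToken = String(Number(sequenceToken) + 1);
      return { nextSequenceToken: this.nextToken };
    }
    const err = new Error('batch rejected');
    if (sequenceToken === this.lastToken && messages === this.lastMessages) {
      err.code = 'DataAlreadyAcceptedException';
      err.retryable = false;
    } else {
      err.code = 'InvalidSequenceTokenException';
    }
    throw err;
  }
}

// Helper for the demo: run a call and report the error code, if any.
function errorCodeOf(fn) {
  try {
    fn();
    return null;
  } catch (err) {
    return err.code;
  }
}

const stream = new FakeLogStream();
stream.putLogEvents([{ message: 'hello', timestamp: 1 }], '1');

// Same token, same message, different timestamp.
console.log(errorCodeOf(() =>
  stream.putLogEvents([{ message: 'hello', timestamp: 2 }], '1')));

// Same token, different message.
console.log(errorCodeOf(() =>
  stream.putLogEvents([{ message: 'other', timestamp: 1 }], '1')));
```

In this model the first repeated call reports DataAlreadyAcceptedException and the second reports InvalidSequenceTokenException, matching the two cases described above.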

amekkawi, May 14 '17 23:05

The issue is quite old and we have been happily using our forked version of the module, so I have not looked into this for some time. Please forgive me if this response seems a bit naive, but given that two messages can legitimately have exactly the same properties, surely a message with a different timestamp should be treated as a new message? In any case, I feel there should be a method or optional flag to choose whether to make these responses retryable using the new sequenceToken.

clocked0ne, May 15 '17 09:05

@clocked0ne Just took a quick look at the commits in your fork. Has your change resulted in duplicates?

I did a bit more testing and I think what happens is 1) there is a network issue while receiving the success response from the AWS API; 2) the AWS SDK automatically resends the request (up to 4-5 times) on network errors; 3) the AWS API responds with DataAlreadyAcceptedException.

If this is the case, I believe it would be safe for bunyan-cloudwatch to ignore the error and not resend the log events.
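A minimal sketch of what ignoring the error could look like is shown below. It is not bunyan-cloudwatch's actual code: `handlePutError` and the shape of `state` are hypothetical, the error is assumed to expose a `code` property as errors in the AWS SDK for JavaScript v2 do, and the token is assumed to appear in the message with the wording quoted earlier in this issue.

```javascript
// Hypothetical handler for a failed PutLogEvents call.
// `state` is an assumed shape: { sequenceToken, inFlight } where
// `inFlight` is the batch that was just sent.
// Returns true when the error was absorbed and sending can continue.
function handlePutError(state, err) {
  if (!err || err.code !== 'DataAlreadyAcceptedException') {
    return false; // leave every other error to the existing error path
  }
  // The batch is already stored, so drop it instead of resending it.
  state.inFlight = null;
  // Pick up the token for the next batch from the message, if present.
  const match = /sequenceToken: (\d+)/.exec(String(err.message));
  if (match) {
    state.sequenceToken = match[1];
  }
  return true;
}

const state = { sequenceToken: '100', inFlight: [{ message: 'hello' }] };
const err = new Error(
  'The given batch of log events has already been accepted. ' +
  'The next batch can be sent with sequenceToken: 101');
err.code = 'DataAlreadyAcceptedException';

console.log(handlePutError(state, err), state);
```

Dropping the batch rather than resending it follows the reasoning above: if the error only appears because the SDK retried a request whose first attempt had already succeeded, resending would create a duplicate.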

The only caveat is that it would be better if the messages still retained the timestamp (see #17), since that would reduce the likelihood of messages matching by chance when multiple emitters are concurrently putting log events to the same LogStream.

amekkawi, May 15 '17 11:05

Hi @amekkawi, thanks for taking the time to look at this. I think that seems like a sensible conclusion: these errors should probably be ignored if the timestamp is retained and used to cross-correlate the potentially duplicated request.

As I mentioned, I haven't looked at this in some time, so I couldn't categorically state whether we have had any duplicates as a result. I will see if I can find time to check our CloudWatch logs for possible duplicates, based on your reasonable assessment of what is happening.

clocked0ne, May 15 '17 13:05