bunyan-cloudwatch
Data Already Accepted Exception
I can find very little information on this error in the AWS documentation and do not know the root cause, so for now I have forked the repo and used a method similar to https://github.com/mirkokiefer/bunyan-cloudwatch/pull/2/files to handle the error:
DataAlreadyAcceptedException: The given batch of log events has already been accepted. The next batch can be sent with sequenceToken: 49567539988883386692933872174268000661736578965150110610
The problem is that this appears to halt all further sending of logs to CloudWatch until the app is restarted.
Can you suggest a cause or solution, given that bunyan-cloudwatch does not currently appear to treat this as a retryable error?
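For anyone trying a similar workaround, here is a minimal sketch of recovering the next sequence token from the error text quoted above. It is not the code from the linked pull request. The helper name `nextTokenFromError` is hypothetical, and the sketch assumes the error object carries `code` and `message` properties in the style of the AWS SDK for JavaScript v2.

```javascript
// Hypothetical helper: extract the next sequence token from a
// DataAlreadyAcceptedException message. Assumes `err.code` and
// `err.message` exist, as on AWS SDK for JavaScript v2 errors.
function nextTokenFromError(err) {
  if (!err || err.code !== 'DataAlreadyAcceptedException') {
    return null;
  }
  // The message ends with "... sequenceToken: <digits>".
  const match = /sequenceToken: (\d+)/.exec(String(err.message));
  return match ? match[1] : null;
}

// Sample error built from the message quoted in this issue.
const sample = {
  code: 'DataAlreadyAcceptedException',
  message:
    'The given batch of log events has already been accepted. ' +
    'The next batch can be sent with sequenceToken: ' +
    '49567539988883386692933872174268000661736578965150110610',
};

console.log(nextTokenFromError(sample));
```

The token is kept as a string because it is far too large to fit in a JavaScript number without losing precision.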
I was able to replicate the DataAlreadyAcceptedException by calling the PutLogEvents API twice using the same sequence token. The `message` properties need to be exactly the same, otherwise it returns an InvalidSequenceTokenException error, but interestingly the `timestamp` properties can be different. If the first call is successful, the second request should respond with the DataAlreadyAcceptedException with `retryable` as `false`.
This issue is quite old, and we have been happily using our forked version of the module, so I have not looked into it for some time; please forgive me if this response seems a bit naive. Given that two messages could plausibly have exactly the same properties, surely a message with a different `timestamp` should be treated as a new message? In any case, I feel there should be a method or optional flag to choose whether to make these responses `retryable` using the new `sequenceToken`.
@clocked0ne Just took a quick look at the commits in your fork. Has your change resulted in duplicates?
I did a bit more testing and I think what happens is 1) there is a network issue while receiving the success response from the AWS API; 2) the AWS SDK automatically resends the request (up to 4-5 times) on network errors; 3) the AWS API responds with DataAlreadyAcceptedException.
If this is the case, I believe it would be safe for bunyan-cloudwatch to ignore the error and not resend the log events.
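A rough sketch of what "ignore the error and not resend" could look like, under stated assumptions. `sendBatch` and `state` are hypothetical names, not part of bunyan-cloudwatch, and `putLogEvents(params, callback)` stands in for the real client call. The sketch assumes errors expose `code` and `message` as in the AWS SDK for JavaScript v2, and that the next token can be read from the message text as in the error quoted earlier in this thread.

```javascript
// Sketch only: send one batch and treat DataAlreadyAcceptedException as
// "already delivered" instead of as a failure.
function sendBatch(putLogEvents, state, events, done) {
  const params = { logEvents: events, sequenceToken: state.sequenceToken };
  putLogEvents(params, (err, data) => {
    if (!err) {
      // Normal success: remember the token for the next batch.
      state.sequenceToken = data.nextSequenceToken;
      return done(null, { duplicate: false });
    }
    if (err.code === 'DataAlreadyAcceptedException') {
      // The batch was accepted earlier: pick up the token from the
      // message (assumed format) and do not resend the events.
      const match = /sequenceToken: (\d+)/.exec(String(err.message));
      if (match) {
        state.sequenceToken = match[1];
      }
      return done(null, { duplicate: true });
    }
    // Any other error is passed through unchanged.
    done(err);
  });
}

// Fake client that always answers as if the batch had already been accepted.
const fakePut = (params, cb) => {
  const err = new Error(
    'The given batch of log events has already been accepted. ' +
    'The next batch can be sent with sequenceToken: 123'
  );
  err.code = 'DataAlreadyAcceptedException';
  cb(err);
};

const state = { sequenceToken: '122' };
let outcome;
sendBatch(fakePut, state, [{ timestamp: 1, message: 'hello' }], (err, result) => {
  outcome = { err, result };
});
console.log(outcome.result, state.sequenceToken);
```

Updating the token before reporting success matters here: without it, later batches would keep using the stale token, which would match the symptom of logging stopping until the app is restarted.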
The only caveat would be that it would be better if the messages still retained the timestamp (see #17), since it would decrease the chance of messages matching by chance if multiple emitters are concurrently putting log events to the same LogStream.
Hi @amekkawi, thanks for taking the time to look at this. That seems like a sensible conclusion: these errors should probably be ignored if the timestamp is retained and used to cross-correlate the potentially duplicated request.
As I mentioned, I haven't looked at this in some time, so I couldn't categorically state whether we have had any duplicates as a result. I will see if I can find time to check our CloudWatch logs for possible duplicates, based on your reasonable assessment of what is happening.