logstash-logger
logstash-logger copied to clipboard
Duplicate massages
Hi,
As I understand from docs, duplicate messages can happen for outputs that write in batches, like Kafka. I'm using UDP output, and I still see lot's of duplicates.
Might be related, when I see duplicates I also see the following error:
ERROR -- : [LogStashLogger::Device::UDP] Errno::EMSGSIZE - Message too large
What could be the reason?
The UDP device also writes in batches because Ruby writes to sockets in batches. If you're seeing lots of duplicates, that means Ruby thought it couldn't write to the UDP socket and therefore threw an exception, and then LogStashLogger retried writing the whole batch. But in fact, the UDP packets were received on the listening side.
Errno::EMSGSIZE
is a known issue with the UDP device which is documented in the troubleshooting section of the Readme
I'm confused with your answer. First I didn't see any exceptions regarding Ruby can't write to UDP socket. Secondly, even when I enabled autoflush
mode, which effectively disables batch mode, I still get duplicates. Third, according to code, the Errno::EMSGSIZE
is rescued and considered as unrecoverable error, therefore ignored and not causing retries.
Am I missing something?
Logging exceptions are written to STDERR by default. Are you checking there?
autoflush
doesn't disable batching. It flushes the buffer after each log message is written. Log messages can still accumulate in the buffer if a flush takes a while. Anyway, even a single log message could be retried several times.
Errno::EMSGSIZE
does indeed not cause retries. It's most likely a separate issue unrelated to the duplicate log messages.
I'll do some more research to see if it's normal/expected to get duplicate UDP packets on the receiving end even if the client only sent it once. I suspect that it is.
Yes I'm checking STDERR
. The only error I see there is Errno::EMSGSIZE
.
The problem that I can get thousands duplicate messages. I mean one message can get written thousand times. I did tcpdump
and indeed those duplicate messages are coming out from the ruby (rails) process.
I know Errno::EMSGSIZE
should not cause duplicates, but most of the time when I see duplicates I see a huge amount of Errno::EMSGSIZE
messages in STDERR
That is indeed strange. It does appear that Errno::EMSGSIZE
is not being correctly treated as unrecoverable. What version of LogStashLogger are you using? Dropping unrecoverable errors was released in 0.16.0.
I use 0.22.1
version
I tried a small test, and got the result I expected:
logger = LogStashLogger.new(type: :udp, port: 5228)
loop { 10.times { logger.info(Time.now); sleep(0.1) }; logger.info('1'*100_000); }
E, [2017-05-13T14:55:40.690990 #16663] ERROR -- : [LogStashLogger::Device::UDP] Errno::EMSGSIZE - Message too long
W, [2017-05-13T14:55:40.691039 #16663] WARN -- : [LogStashLogger::Device::UDP] unrecoverable error, aborting write
I was not able to reproduce duplicate log messages being sent. Long messages were correctly discarded.
Can you post a few samples of the duplicate log messages?
What version of Ruby are you using? In MRI 2.4.1, I get the error message Message too long
, not Message too large
Interesting... I just tried this test under JRuby and got E, [2017-05-13T15:12:57.713000 #17281] ERROR -- : [LogStashLogger::Device::UDP] SystemCallError - Unknown error - No message available
. So it's not caught as an unrecoverable error and the whole batch is retried forever.
Looks like version-specific error handling will be required.
I'm using MRI 2.3.3
. And strangely just see the message with the error, no warning like in your output (WARN -- : [LogStashLogger::Device::UDP] unrecoverable error, aborting write
)
I can't reproduce this behavior in MRI 2.3.3. I suspect the behavior might be platform specific. What OS are you using? I test on a Macbook Pro.
I run it in Docker container (Alpine Linux) on top of Debian distribution.
I ran the same test and I see that the message always get retried:
E, [2017-05-13T22:49:38.210640 #13] ERROR -- : [LogStashLogger::Device::UDP] Errno::EMSGSIZE - Message too large
W, [2017-05-13T22:49:38.210715 #13] WARN -- : [LogStashLogger::Device::UDP] unrecoverable error, aborting write
E, [2017-05-13T22:49:38.210878 #13] ERROR -- : [LogStashLogger::Device::UDP] Errno::EMSGSIZE - Message too large
W, [2017-05-13T22:49:38.210946 #13] WARN -- : [LogStashLogger::Device::UDP] unrecoverable error, aborting write
E, [2017-05-13T22:49:38.211094 #13] ERROR -- : [LogStashLogger::Device::UDP] Errno::EMSGSIZE - Message too large
W, [2017-05-13T22:49:38.211126 #13] WARN -- : [LogStashLogger::Device::UDP] unrecoverable error, aborting write
W, [2017-05-13T22:49:42.780342 #13] WARN -- : [LogStashLogger::Device::UDP] Buffer Full - {:pending=>50, :outgoing=>0}
W, [2017-05-13T22:49:47.804145 #13] WARN -- : [LogStashLogger::Device::UDP] Buffer Full - {:pending=>50, :outgoing=>0}
W, [2017-05-13T22:49:52.825159 #13] WARN -- : [LogStashLogger::Device::UDP] Buffer Full - {:pending=>50, :outgoing=>0}
The test above writes multiple large messages. If you just write one message, you'll see that it fails and never retries again. The unrecoverable error, aborting write
warning is the indicator that it will drop the message and not retry.
It looks to me like you're getting the expected behavior when running my small test, but unexpected behavior in your actual web app. Do you have any extra configuration, gems, etc. that might contribute to the issue?
Here are the configuration I use:
config.logstash.max_message_size = 4096
config.logstash.buffer_max_items = 8192
config.logstash.buffer_max_interval = 1
Also, I'm using lograge