winston-loki

Logs are no longer sent after "something" happens

Open cragia opened this issue 3 years ago • 4 comments

This is more an open question than an issue...

I have a couple of services that use winston-loki as a transport to send log data to Loki. It's configured to send logs in batches (the default).

I noticed that sometimes (at moments when one of the services is a little bit "stressed" and receives many messages) the transport simply stops sending logs to Loki, and therefore I find no logs in there. The service keeps running, as I can see that winston's Console transport is still working and displaying things correctly on the console. The problem is that it never sends a log again, so I've lost even days of logs...

So my question is: what can be the reasons why the batching stopped sending logs? And when it stopped, how can I automatically make it restart sending the logs that it has stacked until that moment?

thank you, Giacomo.

cragia avatar Aug 04 '21 09:08 cragia

Hi! Sorry for not answering earlier. I have a hunch that the issue is probably "caused" by the latest patch, but has been there for a while, just in another form. This is probably fixable by introducing a new queue for serialized logs ready for sending.

JaniAnttonen avatar Aug 09 '21 13:08 JaniAnttonen

A fix that works as of now is to switch from protobuf to JSON / disable batching for protobuf.
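For readers who want to try this workaround, a minimal sketch of the transport options follows. It assumes the `json` and `batching` options of winston-loki; the host and label values are placeholders, and the object would be passed to the transport constructor in the usual way.

```javascript
// Sketch of winston-loki transport options for the workaround above.
// Host and labels are placeholder values, not taken from a real setup.
const lokiOptions = {
  level: 'info',
  host: process.env.LOKI_URL || 'http://loki:3100',
  json: true,       // send JSON instead of protobuf
  batching: false,  // send each log line immediately instead of queueing it
  labels: { service: 'my-service' },
};

// Assumed usage (requires the winston and winston-loki packages):
//   const LokiTransport = require('winston-loki');
//   logger.add(new LokiTransport(lokiOptions));
```

Disabling batching trades throughput for simplicity: every log line becomes its own HTTP request, so it may not suit very chatty services.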

JaniAnttonen avatar Aug 09 '21 13:08 JaniAnttonen

I already use JSON... this is my configuration:

    {
      level: 'info',
      json: true,
      host: process.env.LOKI_URL || 'http://loki:3100',
      labels: {
        service: '***',
        pod: process.env.POD_NAME || 'pod',
      },
    }

What else could I do?

cragia avatar Sep 10 '21 06:09 cragia

Hi, I think I found the cause. For me, the "something" that breaks it is a Loki restart. The cause is that prepareJsonBatch modifies the original batch in place (instead of creating a new prepared JSON object). If the message is not sent and clearOnError is not set, you are left with a batch object that has a mix of entries: some are the input for prepareJsonBatch, some are the output of prepareJsonBatch. I'll see if I can make a PR for this.
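The mixed-batch situation described above can be illustrated with a small standalone sketch. This is not the library's actual code: the entry shapes and the helpers `toWire`, `prepareInPlace`, `prepareCopy` and `newEntry` are hypothetical simplifications, meant only to show why preparing a batch in place leaves it inconsistent after a failed send, and how returning a copy avoids that.

```javascript
// Hypothetical, simplified shapes; not winston-loki's real implementation.

// Convert one queued stream (input shape) to the wire format (output shape).
function toWire(s) {
  return { stream: s.labels, values: s.entries.map((e) => [String(e.ts), e.line]) };
}

// Mutating variant: overwrites the queued streams with their wire format.
function prepareInPlace(batch) {
  batch.streams = batch.streams.map(toWire);
  return batch;
}

// Non-mutating variant: builds a new payload and leaves the queue untouched.
function prepareCopy(batch) {
  return { streams: batch.streams.map(toWire) };
}

function newEntry(line) {
  return { labels: { service: 'demo' }, entries: [{ ts: 1, line }] };
}

const shapeOf = (s) => ('entries' in s ? 'input' : 'output');

// Case 1: prepare in place, the send fails, the batch is kept, a new log arrives.
const mutated = { streams: [newEntry('first')] };
prepareInPlace(mutated);
mutated.streams.push(newEntry('second'));
const mutatedShapes = mutated.streams.map(shapeOf); // ['output', 'input']

// Case 2: prepare a copy, the send fails, the batch is kept, a new log arrives.
const kept = { streams: [newEntry('first')] };
const payload = prepareCopy(kept);
kept.streams.push(newEntry('second'));
const keptShapes = kept.streams.map(shapeOf); // ['input', 'input']

console.log(mutatedShapes, keptShapes);
```

In the first case, a later attempt to prepare the batch would hit entries that are already in the output shape, which is consistent with the transport never recovering after a Loki restart.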

jonim8or avatar Dec 21 '21 12:12 jonim8or